Classification with Random Forest

Machine Learning in the Tidyverse

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

ranger() for Classification

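library(tidyverse)
library(ranger)

# Pair every cross-validation fold with each candidate mtry value
cv_tune <- cv_data %>%
  crossing(mtry = c(2, 4, 8, 16))

# Fit one random forest per fold/mtry combination
cv_models_rf <- cv_tune %>%
  mutate(model = map2(train, mtry, ~ranger(formula = Attrition ~ .,
                                           data = .x, mtry = .y,
                                           num.trees = 100, seed = 42)))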
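
The tuning grid above relies on crossing() to pair every fold with every mtry value. A minimal sketch of that mechanic, assuming a hypothetical two-fold stand-in (cv_toy) in place of the real cv_data, whose train and validate list-columns are not shown on the slide:

```r
library(tidyr)
library(tibble)

# Hypothetical stand-in for cv_data: two folds, identified by label only.
# The real cv_data also carries train/validate list-columns.
cv_toy <- tibble(id = c("Fold1", "Fold2"))

# Pair every fold with every candidate mtry value.
cv_tune_toy <- crossing(cv_toy, mtry = c(2, 4, 8, 16))

nrow(cv_tune_toy)  # 2 folds x 4 mtry values = 8 rows
```

Each row of the resulting grid is one fold/mtry combination, which is why map2() over train and mtry fits one model per row.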

1) Prepare Actual Classes

attrition  class
Yes        TRUE
No         FALSE
validate$Attrition
No  No  No  No  No  Yes No  Yes ... No  No  No
validate_actual <- validate$Attrition == "Yes"
validate_actual 
FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE ... FALSE FALSE FALSE

2) Prepare Predicted Classes

P(attrition)  class
Yes           TRUE
No            FALSE
validate_classes <- predict(rf_model, rf_validate)$predictions
validate_classes
No  No  No  No  No  Yes No  No ... No  No  No
validate_predicted <- validate_classes == "Yes"
validate_predicted
FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE ... FALSE FALSE FALSE
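
Once both vectors are logical, they can be compared element by element. The sketch below uses base R only and takes as its input just the first eight values visible on the slides, so the numbers are illustrative and say nothing about the full validation set. The names accuracy and recall are chosen here for illustration.

```r
# Illustrative only: the first eight values shown on the slides,
# not the full validation set.
validate_actual    <- c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)
validate_predicted <- c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)

# Cross-tabulate actual vs. predicted classes (a confusion matrix).
table(actual = validate_actual, predicted = validate_predicted)

# Share of observations where the prediction matches the actual class.
accuracy <- mean(validate_actual == validate_predicted)

# Of the actual "Yes" cases, the share the model also flagged as "Yes".
recall <- sum(validate_actual & validate_predicted) / sum(validate_actual)
```

Converting "Yes"/"No" to TRUE/FALSE first is what makes these one-line comparisons possible, since logical vectors can be summed and averaged directly.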

Build the Best Attrition Model
