Machine Learning with Tree-Based Models in R
Sandro Raabe
Data Scientist
Packages: ranger, randomForest
tidymodels interface to these packages: rand_forest() (contained in the parsnip package)
rand_forest()
Hyperparameters:
mtry: number of predictors seen at each node, default: floor of the square root of the number of predictors
trees: number of trees in the forest
min_n: smallest node size allowed
rand_forest(
  mtry = 4,
  trees = 500,
  min_n = 10) %>%
  # Set the mode
  set_mode("classification") %>%
  # Use engine ranger or randomForest
  set_engine("ranger")
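The specification above can be tried end to end on a built-in dataset. This is a minimal sketch, not the course's own code: iris stands in for the customer data, and the hyperparameter values (mtry = 2, trees = 200, min_n = 5) are arbitrary choices for illustration.

```r
library(parsnip)   # rand_forest(), set_mode(), set_engine(), fit()
library(magrittr)  # %>% pipe

# Assumption: iris replaces the course data, which is not available here
rf_spec <- rand_forest(mtry = 2, trees = 200, min_n = 5) %>%
  set_mode("classification") %>%
  set_engine("ranger")

set.seed(42)
rf_fit <- rf_spec %>% fit(Species ~ ., data = iris)

# The underlying ranger object records the values that were actually used
rf_fit$fit$mtry           # predictors tried at each split
rf_fit$fit$num.trees      # number of trees grown
rf_fit$fit$min.node.size  # smallest node size allowed
```

Inspecting the fitted ranger object this way is a quick check that the parsnip arguments reached the engine as intended.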
spec <- rand_forest(trees = 100) %>%
set_mode("classification") %>%
set_engine("ranger")
Random Forest Model Specification
(classification)
Main Arguments: trees = 100
Computational engine: ranger
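Because rand_forest() is only an interface, it can help to see how its arguments are handed to the engine. The sketch below uses parsnip's translate() on a specification like the one above; the object name translated is hypothetical, and the exact printed template may differ between parsnip versions.

```r
library(parsnip)
library(magrittr)

spec <- rand_forest(trees = 100) %>%
  set_mode("classification") %>%
  set_engine("ranger")

# translate() fills in the engine-specific call template;
# for ranger, trees is passed on as num.trees
translated <- translate(spec)
print(translated)
```

The same specification with set_engine("randomForest") would translate to that package's argument names instead, which is the point of the shared interface.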
spec %>% fit(still_customer ~ ., data = customers_train)
parsnip model object
Fit time: 631ms
Ranger result
Number of trees: 100
Sample size: 9116
Number of independent variables: 19
Mtry: 4
Target node size: 10
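The Mtry: 4 in the output above is consistent with ranger's default of the square root of the number of predictors, rounded down (19 predictors here). A small sketch to check this on other data, assuming iris as a stand-in with its 4 predictors:

```r
library(parsnip)
library(magrittr)

set.seed(1)
default_fit <- rand_forest(trees = 100) %>%
  set_mode("classification") %>%
  set_engine("ranger") %>%
  fit(Species ~ ., data = iris)

n_predictors <- ncol(iris) - 1           # every column except the outcome
expected_mtry <- floor(sqrt(n_predictors))

# Compare the value ranger chose with the rule of thumb
default_fit$fit$mtry == expected_mtry
```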
rand_forest(mode = "classification") %>%
  set_engine("ranger", importance = "impurity") %>%
  fit(still_customer ~ ., data = customers_train) %>%
  vip::vip()
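vip::vip() draws a plot; the numbers behind it can also be read directly from the fitted engine object. A minimal sketch, again assuming iris in place of the customer data, with imp as a hypothetical variable name:

```r
library(parsnip)
library(magrittr)

set.seed(42)
rf_fit <- rand_forest(mode = "classification") %>%
  set_engine("ranger", importance = "impurity") %>%
  fit(Species ~ ., data = iris)

# ranger stores impurity importance as a named numeric vector,
# one entry per predictor
imp <- rf_fit$fit$variable.importance
sort(imp, decreasing = TRUE)
```

Without importance = "impurity" (or another importance mode) in set_engine(), ranger does not compute these scores, so the engine argument is required for both the plot and this vector.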