Bagged trees

Machine Learning with Tree-Based Models in R

Sandro Raabe

Data Scientist

Many heads are better than one

the wisdom of the crowd


Bootstrap & aggregation

  • Bagging = short for Bootstrap Aggregation

 

 1. Bootstrapping

  • Sampling with replacement → many modified training sets

 

 2. Aggregation

  • Predictions of different models are aggregated for final prediction:
    • Average (in regression)
    • Majority vote (in classification)
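
The two ingredients above can be sketched in a few lines of base R. This is only a toy illustration: the row indices, the seed, and the "predictions from three trees" are made-up values, not taken from the course data.

```r
# Minimal base-R sketch of the two bagging ingredients (toy values only).
set.seed(42)                      # assumed seed, only for reproducibility

n <- 10
row_ids <- seq_len(n)             # stand-in for the rows of a training set

# 1. Bootstrapping: draw n row indices WITH replacement.
#    Some rows can appear several times, others not at all.
boot_ids <- sample(row_ids, size = n, replace = TRUE)
n_unique <- length(unique(boot_ids))   # number of distinct rows in this resample

# 2. Aggregation: combine the predictions of several models.
# Regression: average the numeric predictions.
reg_preds <- c(3.1, 2.9, 3.4)          # hypothetical predictions from three trees
final_reg <- mean(reg_preds)

# Classification: take the most frequent class (majority vote).
class_preds <- c("yes", "no", "yes")   # hypothetical predictions from three trees
final_class <- names(which.max(table(class_preds)))
```

Repeating the sampling step many times yields the "many modified training sets"; each one is used to train its own tree.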

Step 1: Bootstrap and train

bootstrapping scheme


Step 2: Aggregate

aggregate results
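
Steps 1 and 2 can also be written out by hand to see what a bagging routine has to do. The sketch below is an assumption-laden illustration, not the internals of any particular package: it uses rpart directly, the built-in iris data as a stand-in for the customer data, and an arbitrary ensemble size of 25 trees.

```r
library(rpart)   # recommended package, shipped with most R installations

set.seed(1)      # assumed seed, only for reproducibility
n_trees <- 25    # illustrative ensemble size
n <- nrow(iris)  # iris is a stand-in dataset, not the course data

# Step 1: bootstrap the training rows and fit one tree per resample
trees <- lapply(seq_len(n_trees), function(i) {
  boot_rows <- sample(n, size = n, replace = TRUE)
  rpart(Species ~ ., data = iris[boot_rows, ], method = "class")
})

# Step 2: collect each tree's class prediction ...
votes <- sapply(trees, function(tree) {
  as.character(predict(tree, newdata = iris, type = "class"))
})  # character matrix: one row per observation, one column per tree

# ... and take a majority vote per observation
bagged_pred <- apply(votes, 1, function(v) names(which.max(table(v))))

# Accuracy on the training rows (optimistic; only confirms the pipeline runs)
train_accuracy <- mean(bagged_pred == iris$Species)
```

For a regression problem the last step would average the numeric predictions (for example with rowMeans) instead of voting.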


Coding: Specify the bagged trees

library(baguette)

# Specify a bagged tree ensemble for classification
spec_bagged <- bag_tree() %>%
  set_mode("classification") %>%
  set_engine("rpart", times = 100)  # times = number of bootstrapped trees

Bagged Decision Tree Model Specification (classification)

Main Arguments:
  cost_complexity = 0
  min_n = 2

Engine-Specific Arguments:
  times = 100

Computational engine: rpart

Train all trees

model_bagged <- fit(spec_bagged,
                    formula = still_customer ~ .,
                    data = customers_train)

parsnip model object

Fit time:  23.9s

Bagged CART (classification with 100 members)
Variable importance scores include:

# A tibble: 19 x 4
  term                value std.error  used
  <chr>               <dbl>     <dbl> <int>
1 total_trans_ct       876.      3.93   100
2 total_trans_amt      800.      4.54   100
3 total_revolving_bal  491.      3.67   100

Let's bootstrap!
