Bagging parameters: tips and tricks

Ensemble Methods in Python

Román de las Heras

Data Scientist, Appodeal

Basic parameters for bagging

Basic parameters
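  • base_estimator: the model to train on each sample (renamed to estimator in scikit-learn 1.2)
  • n_estimators: the number of models in the ensemble
  • oob_score: set to True to compute the out-of-bag score
    • est_bag.oob_score_ (available after fitting)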
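
As a minimal sketch of the basic parameters above, the following fits a bagging classifier and reads its out-of-bag score. The synthetic dataset, the decision tree base estimator, and the specific parameter values are assumptions chosen for illustration; only the name est_bag comes from the slide.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

est_bag = BaggingClassifier(
    # The base estimator is passed positionally so the call works whether
    # the keyword is named base_estimator (older) or estimator (newer).
    DecisionTreeClassifier(max_depth=4),
    n_estimators=50,   # number of models in the ensemble
    oob_score=True,    # evaluate each model on the samples it did not see
    random_state=42,
)
est_bag.fit(X, y)

# The out-of-bag accuracy is only available after fitting with oob_score=True.
print(est_bag.oob_score_)
```

The out-of-bag score gives an estimate of generalization accuracy without setting aside a separate validation set.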

Additional parameters for bagging

Additional Parameters
  • max_samples: the number (int) or fraction (float) of samples to draw for each estimator.
  • max_features: the number (int) or fraction (float) of features to draw for each estimator.
    • Classification: roughly sqrt(number_of_features)
    • Regression: roughly number_of_features / 3
  • bootstrap: whether samples are drawn with replacement.
    • True: use max_samples = 1.0
    • False: use max_samples < 1.0, otherwise every estimator is trained on the same samples
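
The additional parameters above can be sketched as follows. This is an illustrative example under stated assumptions: the synthetic dataset, the names bag_boot, bag_sub, k_clf and k_reg, and the value max_samples=0.5 are hypothetical choices, not part of the slides.

```python
import math

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (an assumption for illustration): 500 samples, 16 features.
X, y = make_classification(n_samples=500, n_features=16, random_state=0)
n_features = X.shape[1]

# Rules of thumb from the list above.
k_clf = int(math.sqrt(n_features))  # classification: sqrt(16) = 4
k_reg = max(1, n_features // 3)     # regression: 16 // 3 = 5 (not used below)

# With replacement: each estimator draws as many samples as the dataset has.
bag_boot = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=20,
    max_samples=1.0,      # float: fraction of the training samples
    max_features=k_clf,   # int: number of features per estimator
    bootstrap=True,
    random_state=0,
).fit(X, y)

# Without replacement: draw a strict subset so the estimators differ.
bag_sub = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=20,
    max_samples=0.5,
    max_features=k_clf,
    bootstrap=False,
    random_state=0,
).fit(X, y)

print(bag_boot.score(X, y), bag_sub.score(X, y))
```

Note that a float value is read as a fraction and an int as an absolute count, for both max_samples and max_features.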

Random forest

Classification

from sklearn.ensemble import RandomForestClassifier

clf_rf = RandomForestClassifier(
    # parameters...
)

Regression

from sklearn.ensemble import RandomForestRegressor

reg_rf = RandomForestRegressor(
    # parameters...
)

Bagging parameters:

  • n_estimators
  • max_features
  • oob_score

Tree-specific parameters:

  • max_depth
  • min_samples_split
  • min_samples_leaf
  • class_weight (e.g. "balanced"; classifier only)
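
To make the two parameter groups concrete, here is a hedged sketch of a random forest classifier that sets each of them. The imbalanced synthetic dataset and all parameter values are assumptions for illustration, not recommendations from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Imbalanced synthetic data (an assumption): roughly 80% / 20% classes.
X, y = make_classification(
    n_samples=600, n_features=12, weights=[0.8, 0.2], random_state=1
)

clf_rf = RandomForestClassifier(
    # Bagging parameters
    n_estimators=100,
    max_features="sqrt",      # sqrt(number_of_features) per split
    oob_score=True,
    # Tree-specific parameters
    max_depth=6,
    min_samples_split=10,
    min_samples_leaf=5,
    class_weight="balanced",  # reweight classes inversely to their frequency
    random_state=1,
)
clf_rf.fit(X, y)

print(clf_rf.oob_score_)
```

In a random forest, max_features is applied at every split rather than once per tree, which is the main difference from plain bagging of decision trees.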

Bias-variance tradeoff

[Figure: biasvariance.png]

Let's practice!
