Bagging parameters: tips and tricks

Ensemble Methods in Python

Román de las Heras

Data Scientist, Appodeal

Basic parameters for bagging

Basic parameters
  • base_estimator (renamed to estimator in scikit-learn 1.2)
  • n_estimators
  • oob_score
    • est_bag.oob_score_
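
The basic parameters above can be sketched as follows. This is a minimal illustration, not the course's own example: the synthetic dataset, the choice of a decision tree as base estimator, and all parameter values are assumptions. The base estimator is passed positionally because its keyword name depends on the installed scikit-learn version.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used here only for illustration (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# The base estimator is the first positional argument; its keyword name
# differs between scikit-learn versions (base_estimator vs. estimator).
est_bag = BaggingClassifier(
    DecisionTreeClassifier(max_depth=4),
    n_estimators=50,   # number of estimators in the ensemble
    oob_score=True,    # evaluate each sample on estimators that did not train on it
    random_state=42,
)
est_bag.fit(X, y)

# Out-of-bag accuracy, available after fitting when oob_score=True.
print(est_bag.oob_score_)
```

Because the out-of-bag score only uses samples an estimator did not see during training, it gives a rough validation estimate without a separate hold-out set.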

Additional parameters for bagging

Additional parameters
  • max_samples: number of samples to draw for each estimator.
  • max_features: number of features to draw for each estimator.
    • Classification ~ sqrt(number_of_features)
    • Regression ~ number_of_features / 3
  • bootstrap: whether samples are drawn with replacement.
    • True --> max_samples = 1.0
    • False --> max_samples < 1.0 (otherwise every estimator is trained on the same samples)
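
A hedged sketch of how these additional parameters might be combined. The data and all values are assumptions chosen for illustration; with 16 features, the sqrt rule of thumb gives 4 features per estimator.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 16 features, for illustration only (assumption).
X, y = make_classification(n_samples=500, n_features=16, random_state=0)

# With replacement: each estimator draws a bootstrap sample
# as large as the training set (max_samples=1.0).
bag_boot = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=30,
    max_samples=1.0,
    max_features=4,    # an int is an absolute count; sqrt(16) = 4
    bootstrap=True,
    random_state=0,
).fit(X, y)

# Without replacement: keep max_samples below 1.0 so the subsets differ.
bag_sub = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=30,
    max_samples=0.5,
    max_features=4,
    bootstrap=False,
    random_state=0,
).fit(X, y)

# Feature indices used by the first estimator of the bootstrap ensemble.
print(bag_boot.estimators_features_[0])
# Number of training samples drawn for the first estimator without replacement.
print(len(bag_sub.estimators_samples_[0]))
```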

Random forest

Classification

from sklearn.ensemble import RandomForestClassifier

clf_rf = RandomForestClassifier(
    # parameters...
)

Regression

from sklearn.ensemble import RandomForestRegressor

reg_rf = RandomForestRegressor(
    # parameters...
)

Bagging parameters:

  • n_estimators
  • max_features
  • oob_score

Tree-specific parameters:

  • max_depth
  • min_samples_split
  • min_samples_leaf
  • class_weight ("balanced")
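
As a sketch, the bagging and tree-specific parameters listed above could be set on a random forest classifier like this. The imbalanced synthetic dataset and every parameter value are assumptions for illustration, not recommended defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Imbalanced synthetic data (roughly 90% / 10%), for illustration only (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, weights=[0.9, 0.1], random_state=1
)

clf_rf = RandomForestClassifier(
    # Bagging parameters
    n_estimators=100,
    max_features="sqrt",      # features considered at each split
    oob_score=True,
    # Tree-specific parameters
    max_depth=6,
    min_samples_split=10,
    min_samples_leaf=5,
    class_weight="balanced",  # reweight classes inversely to their frequency
    random_state=1,
)
clf_rf.fit(X, y)

# Out-of-bag accuracy of the fitted forest.
print(clf_rf.oob_score_)
```

Note that class_weight applies to RandomForestClassifier only; RandomForestRegressor does not accept it.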

Bias-variance tradeoff


Let's practice!
