Gradient boosting flavors

Ensemble Methods in Python

Román de las Heras

Data Scientist, Appodeal

Variations of gradient boosting

Gradient Boosting Algorithm

  • Extreme Gradient Boosting
  • Light Gradient Boosting Machine
  • Categorical Boosting

Implementation

  • XGBoost
  • LightGBM
  • CatBoost

Extreme gradient boosting (XGBoost)

Some properties:

  • Optimized for distributed computing
  • Parallelized tree construction (the boosting rounds themselves remain sequential)
  • Scalable, portable, and accurate
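import xgboost as xgb

# X_train, y_train, and X_test come from an existing train/test split.
# None keeps the library default for that parameter.
clf_xgb = xgb.XGBClassifier(
    n_estimators=100,
    learning_rate=None,
    max_depth=None,
    random_state=42,  # any fixed integer seed, for reproducibility
)
clf_xgb.fit(X_train, y_train)
pred = clf_xgb.predict(X_test)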
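To make the "parallel" property concrete, here is a minimal pure-Python sketch of split finding where each feature is scanned independently and the per-feature results are then compared. It uses the regularized gain from the XGBoost formulation, but the function names (`split_gain`, `best_split_for_feature`, `best_split`) and the toy data are illustrative assumptions, not part of the XGBoost API. Python threads are used only to show the structure; the real library does this work in native multithreaded code.

```python
from concurrent.futures import ThreadPoolExecutor


def split_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0, gamma=0.0):
    """Gain of a split, given summed gradients (g) and Hessians (h) per side."""
    def score(g, h):
        return g * g / (h + reg_lambda)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def best_split_for_feature(values, grads, hess):
    """Scan one feature in sorted order and return (best_gain, threshold)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    g_total, h_total = sum(grads), sum(hess)
    g_left = h_left = 0.0
    best = (0.0, None)  # a split is only kept if its gain is positive
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        g_left += grads[i]
        h_left += hess[i]
        if values[i] == values[nxt]:
            continue  # no threshold fits between two equal values
        gain = split_gain(g_left, h_left, g_total - g_left, h_total - h_left)
        if gain > best[0]:
            best = (gain, (values[i] + values[nxt]) / 2)
    return best


def best_split(columns, grads, hess):
    """Evaluate all features concurrently; return (gain, feature_index, threshold)."""
    with ThreadPoolExecutor() as pool:
        results = list(
            pool.map(lambda col: best_split_for_feature(col, grads, hess), columns)
        )
    feature, (gain, threshold) = max(enumerate(results), key=lambda item: item[1][0])
    return gain, feature, threshold


# Toy data: squared-error loss with an initial prediction of 0,
# so gradient = prediction - target and Hessian = 1.
targets = [0, 0, 1, 1]
grads = [0.0 - y for y in targets]
hess = [1.0] * len(targets)
columns = [
    [1, 2, 3, 4],  # feature 0 separates the two classes cleanly
    [5, 1, 4, 2],  # feature 1 is uninformative
]

gain, feature, threshold = best_split(columns, grads, hess)
print(f"feature={feature}, threshold={threshold}, gain={gain:.4f}")
```

Because each feature scan only reads shared gradient statistics, the scans do not depend on one another, which is what makes this step easy to parallelize.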

Light gradient boosting machine (LightGBM)

Some properties:

  • Released by Microsoft (open sourced in 2016; paper published in 2017)
  • Faster training through histogram-based split finding
  • Lower memory usage
  • Optimized for parallel and GPU processing
  • Useful for large datasets under speed or memory constraints
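import lightgbm as lgb

# X_train, y_train, and X_test come from an existing train/test split.
clf_lgb = lgb.LGBMClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=-1,     # -1 means no limit on tree depth
    random_state=42,  # any fixed integer seed, for reproducibility
)
clf_lgb.fit(X_train, y_train)
pred = clf_lgb.predict(X_test)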
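The speed and memory properties above come largely from working on binned features rather than raw values. The following is a rough pure-Python sketch of that idea, assuming equal-width bins and a simple variance-style gain; both are simplifications chosen for readability, and the helper names (`bin_feature`, `gradient_histogram`, `best_bin_split`) are hypothetical rather than LightGBM functions.

```python
def bin_feature(values, n_bins):
    """Map each raw value to a small integer bin index (equal-width bins)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def gradient_histogram(bin_ids, grads, n_bins):
    """Sum gradients and count rows per bin in a single pass over the data."""
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for b, g in zip(bin_ids, grads):
        grad_sum[b] += g
        count[b] += 1
    return grad_sum, count


def best_bin_split(grad_sum, count):
    """Scan bin boundaries only; return (gain, last_bin_on_the_left)."""
    g_total, n_total = sum(grad_sum), sum(count)
    g_left, n_left = 0.0, 0
    best = (0.0, None)
    for b in range(len(count) - 1):
        g_left += grad_sum[b]
        n_left += count[b]
        n_right = n_total - n_left
        if n_left == 0 or n_right == 0:
            continue  # an empty side is not a valid split
        g_right = g_total - g_left
        gain = g_left**2 / n_left + g_right**2 / n_right - g_total**2 / n_total
        if gain > best[0]:
            best = (gain, b)
    return best


# Toy data: eight rows, three bins, gradients of +0.5 or -0.5.
values = [1, 4, 3, 8, 9, 7, 2, 10]
grads = [0.5, 0.5, 0.5, -0.5, -0.5, -0.5, 0.5, -0.5]
n_bins = 3

bin_ids = bin_feature(values, n_bins)
grad_sum, count = gradient_histogram(bin_ids, grads, n_bins)
best_gain, split_bin = best_bin_split(grad_sum, count)

print(bin_ids)               # [0, 1, 0, 2, 2, 2, 0, 2]
print(grad_sum, count)       # [1.5, 0.5, -2.0] [3, 1, 4]
print(best_gain, split_bin)  # 2.0 1
```

With binning, the feature is stored as small integers and only `n_bins - 1` candidate splits are scanned instead of one per distinct value, which is where the time and space savings come from.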

Categorical boosting (CatBoost)

Some properties:

  • Open sourced by Yandex (July 2017)
  • Built-in handling of categorical features
  • Accurate and robust
  • Fast and scalable
  • User-friendly API
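import catboost as cb

# X_train, y_train, and X_test come from an existing train/test split.
# None keeps the library default for that parameter.
clf_cat = cb.CatBoostClassifier(
    n_estimators=None,
    learning_rate=None,
    max_depth=None,
    random_state=42,  # any fixed integer seed, for reproducibility
)
clf_cat.fit(X_train, y_train)
pred = clf_cat.predict(X_test)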
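One way to picture the built-in handling of categorical features is ordered target statistics: each row's category is replaced by a smoothed average of the targets seen in earlier rows only, so a row never uses its own label. The sketch below is a simplified illustration under stated assumptions. It uses the given row order as the single ordering (CatBoost works with random permutations), and `ordered_target_stats`, `prior`, and `prior_weight` are hypothetical names, not CatBoost parameters.

```python
def ordered_target_stats(categories, targets, prior=0.5, prior_weight=1.0):
    """Encode each category from the targets of earlier rows only."""
    sums, counts = {}, {}
    encoded = []
    for cat, y in zip(categories, targets):
        seen_sum = sums.get(cat, 0.0)
        seen_count = counts.get(cat, 0)
        # Smoothed mean of past targets; equals the prior for an unseen category.
        encoded.append((seen_sum + prior_weight * prior) / (seen_count + prior_weight))
        # Update the running statistics only after encoding the current row.
        sums[cat] = seen_sum + y
        counts[cat] = seen_count + 1
    return encoded


categories = ["red", "blue", "red", "red", "blue"]
targets = [1, 0, 1, 0, 1]

encoded = ordered_target_stats(categories, targets)
print([round(v, 4) for v in encoded])  # [0.5, 0.5, 0.75, 0.8333, 0.25]
```

Using only earlier rows avoids leaking a row's own target into its encoding, which is the main reason this ordering is used instead of a plain per-category mean.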

It's your turn!
