What is Boosting?

Extreme Gradient Boosting with XGBoost

Sergey Fogelson

Head of Data Science, TelevisaUnivision

Boosting overview

  • Not a specific machine learning algorithm
  • Concept that can be applied to a set of machine learning models
    • "Meta-algorithm"
  • Ensemble meta-algorithm used to convert many weak learners into a strong learner

Weak learners and strong learners

  • Weak learner: ML algorithm whose predictions are only slightly better than chance
    • Example: Decision tree whose accuracy on a balanced binary classification problem is slightly above 50%
  • Boosting converts a collection of weak learners into a strong learner
  • Strong learner: Any algorithm that can be tuned to achieve good performance

How boosting is accomplished

  • Iteratively learn a set of weak models on subsets of the data
  • Weight each weak learner's predictions according to that learner's performance
  • Combine the weighted predictions to obtain a single weighted prediction
  • ... that is much better than the individual predictions themselves!
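
One concrete way to realize these steps is AdaBoost. The slides do not name it, so treat it here as an illustrative assumption and not as the method XGBoost uses (XGBoost relies on gradient boosting over trees). Instead of drawing subsets, this sketch reweights the training points so that each new weak learner concentrates on earlier mistakes. The function names and the ten-point toy dataset are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak learner: predict `polarity` below the threshold, `-polarity` above it."""
    return polarity if x < threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best threshold rule."""
    points = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, n_rounds):
    """Fit `n_rounds` stumps; return a list of (alpha, threshold, polarity)."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        error, threshold, polarity = fit_stump(xs, ys, weights)
        error = max(error, 1e-10)  # guard against division by zero
        # Learners with lower weighted error get a larger say in the final vote.
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Combine the weighted votes of all stumps into one prediction."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (made up for illustration): no single threshold separates the classes.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(xs, ys, n_rounds=3)
correct = sum(predict(ensemble, x) == y for x, y in zip(xs, ys))
print(f"Training accuracy of the boosted ensemble: {correct}/{len(xs)}")
```

On this toy data the best single stump misclassifies three of the ten points, while the weighted combination of three stumps classifies all ten training points correctly.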

Boosting example

1 https://xgboost.readthedocs.io/en/latest/model.html

Model evaluation through cross-validation

  • Cross-validation: Robust method for estimating the performance of a model on unseen data
  • Generates many train/test splits of the training data, with non-overlapping test sets
  • Reports the average test set performance across all data splits
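
The splitting and averaging described above can be sketched in plain Python. This is a minimal illustration, not how `xgb.cv` is implemented: the helper names are hypothetical, the folds are taken in order without shuffling or stratification, and the "model" is a trivial majority-class predictor chosen only to keep the example short.

```python
from collections import Counter


def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices); the test folds never overlap."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def fit_majority(labels):
    """Hypothetical stand-in model: remember the most common training label."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(model_label, labels):
    """Fraction of labels matched by the constant prediction."""
    return sum(label == model_label for label in labels) / len(labels)


def cross_validate(labels, n_folds):
    """Average test-fold accuracy across all folds."""
    scores = []
    for train, test in k_fold_indices(len(labels), n_folds):
        model = fit_majority([labels[i] for i in train])
        scores.append(accuracy(model, [labels[i] for i in test]))
    return sum(scores) / len(scores)


# Toy labels, made up for illustration.
labels = [0, 0, 0, 1, 0, 0, 1, 0]
mean_accuracy = cross_validate(labels, n_folds=4)
print(f"Mean test accuracy over 4 folds: {mean_accuracy}")
```

Each sample lands in exactly one test fold, so every observation is used for evaluation once and for training in the remaining folds.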

Cross-validation in XGBoost example

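import pandas as pd
import xgboost as xgb

# Load the churn data; the last column (month_5_still_here) is used as the label
churn_data = pd.read_csv("classification_data.csv")
# Wrap the features and the label in XGBoost's DMatrix structure
churn_dmatrix = xgb.DMatrix(data=churn_data.iloc[:, :-1],
                            label=churn_data.month_5_still_here)
params = {"objective": "binary:logistic", "max_depth": 4}
# 4-fold cross-validation over 10 boosting rounds, tracking classification error
cv_results = xgb.cv(dtrain=churn_dmatrix, params=params, nfold=4,
                    num_boost_round=10, metrics="error", as_pandas=True)
# Accuracy is 1 minus the mean test error after the final boosting round
print("Accuracy: %f" % (1 - cv_results["test-error-mean"].iloc[-1]))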
Accuracy: 0.88315

Let's practice!
