Machine Learning with Tree-Based Models in Python
Elie Kawerk
Data Scientist
Voting Classifier
Bagging
Bagging: Bootstrap Aggregation.
Uses a technique known as the bootstrap: each model is trained on a sample drawn with replacement from the training set.
Reduces the variance of the individual models in the ensemble.
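The two ideas above (bootstrap sampling, then aggregation) can be sketched by hand. This is a simplified illustration, not scikit-learn's implementation: the synthetic dataset, the choice of 25 trees, and the hard majority vote are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data, purely illustrative (not the dataset used in the slides)
X, y = make_classification(n_samples=200, n_features=5, random_state=1)

rng = np.random.default_rng(1)
n_samples = X.shape[0]
n_estimators = 25  # odd, so a majority vote on 0/1 labels cannot tie

trees = []
for _ in range(n_estimators):
    # Bootstrap sample: draw n_samples row indices WITH replacement
    idx = rng.integers(0, n_samples, size=n_samples)
    tree = DecisionTreeClassifier(max_depth=4, random_state=1)
    trees.append(tree.fit(X[idx], y[idx]))

# Aggregate: majority vote over the individual trees' predictions
all_preds = np.stack([t.predict(X) for t in trees])  # shape (n_estimators, n_samples)
votes_for_1 = all_preds.sum(axis=0)                  # labels are 0/1
y_pred = (votes_for_1 > n_estimators / 2).astype(int)
```

Each tree sees a slightly different training set, so their errors are only partly correlated, and voting across them smooths out the variance of any single tree.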



Classification:
BaggingClassifier in scikit-learn.
Regression:
BaggingRegressor in scikit-learn.
# Import models and utility functions
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
# Set seed for reproducibility
SEED = 1
# Split data into 70% train and 30% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    stratify=y,
                                                    random_state=SEED)
# Instantiate a classification-tree 'dt'
dt = DecisionTreeClassifier(max_depth=4, min_samples_leaf=0.16, random_state=SEED)
# Instantiate a BaggingClassifier 'bc'
# (the 'estimator' parameter was named 'base_estimator' before scikit-learn 1.2)
bc = BaggingClassifier(estimator=dt, n_estimators=300, n_jobs=-1)
# Fit 'bc' to the training set
bc.fit(X_train, y_train)
# Predict test set labels
y_pred = bc.predict(X_test)
# Evaluate and print test-set accuracy
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy of Bagging Classifier: {:.3f}'.format(accuracy))
Accuracy of Bagging Classifier: 0.936
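BaggingRegressor, named earlier as the regression counterpart, follows the same pattern. A minimal sketch, assuming synthetic data from make_regression and illustrative hyperparameters (none of these values come from the slides):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

SEED = 1

# Synthetic regression data, for illustration only
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=SEED)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=SEED)

# Base regression tree 'dt'
dt = DecisionTreeRegressor(max_depth=4, random_state=SEED)

# The base estimator is passed positionally, which sidesteps the
# 'base_estimator' -> 'estimator' keyword rename in scikit-learn 1.2
br = BaggingRegressor(dt, n_estimators=100, random_state=SEED)
br.fit(X_train, y_train)

# For regression, the ensemble prediction is the average of the trees' predictions
y_pred = br.predict(X_test)
rmse = mean_squared_error(y_test, y_pred) ** 0.5
print('Test set RMSE of Bagging Regressor: {:.2f}'.format(rmse))
```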