Decision-Tree for Classification

Machine Learning with Tree-Based Models in Python

Elie Kawerk

Data Scientist

Course Overview

  • Chap 1: Classification And Regression Tree (CART)

  • Chap 2: The Bias-Variance Tradeoff

  • Chap 3: Bagging and Random Forests

  • Chap 4: Boosting

  • Chap 5: Model Tuning

Classification-tree

  • Sequence of if-else questions about individual features.

  • Objective: infer class labels.

  • Able to capture non-linear relationships between features and labels.

  • Don't require feature scaling (e.g., standardization).
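
The claim that trees do not need feature scaling can be checked with a small experiment. The sketch below is illustrative only: it assumes scikit-learn's bundled copy of the breast cancer data stands in for the course dataset, and the names dt_raw, dt_std, and agreement are made up for this example. It fits one tree on raw features and one on standardized features, then measures how often their test predictions agree.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled dataset stands in for the course data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)

# Tree fit on the raw features
dt_raw = DecisionTreeClassifier(max_depth=2, random_state=1)
dt_raw.fit(X_train, y_train)

# Tree fit on standardized features (scaler is fit on the training set only)
scaler = StandardScaler().fit(X_train)
dt_std = DecisionTreeClassifier(max_depth=2, random_state=1)
dt_std.fit(scaler.transform(X_train), y_train)

# Fraction of test instances on which the two trees predict the same label
pred_raw = dt_raw.predict(X_test)
pred_std = dt_std.predict(scaler.transform(X_test))
agreement = (pred_raw == pred_std).mean()
print(agreement)
```

Standardization rescales each feature monotonically, so a threshold split on a feature partitions the training set the same way before and after scaling. The agreement should therefore be at or very near 1.0.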

Breast Cancer Dataset in 2D

BC2D
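
The later snippets use X and y without defining them. One possible way to build a two-feature version of the dataset is sketched below. It assumes scikit-learn's bundled breast cancer data and picks "mean radius" and "mean concave points" as the two features; both choices are assumptions, since the slide does not say which columns are plotted.

```python
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
names = list(data.feature_names)

# Assumption: these two columns stand in for the features shown on the slide
feature_names = ["mean radius", "mean concave points"]
cols = [names.index(name) for name in feature_names]

X = data.data[:, cols]   # feature matrix restricted to two columns
y = data.target          # scikit-learn's encoding: 0 = malignant, 1 = benign

print(X.shape)
print(list(data.target_names))
```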

Decision-tree Diagram

CART-rep
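
A text rendering of a fitted tree makes the sequence of if-else questions visible without a diagram. The sketch below uses scikit-learn's export_text. The two feature names and the use of the bundled dataset are assumptions made for illustration, so the printed thresholds need not match the diagram on the slide.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
names = list(data.feature_names)

# Assumption: two illustrative features from scikit-learn's bundled dataset
feature_names = ["mean radius", "mean concave points"]
cols = [names.index(name) for name in feature_names]
X, y = data.data[:, cols], data.target

# Shallow tree so the printed rules stay short
dt = DecisionTreeClassifier(max_depth=2, random_state=1)
dt.fit(X, y)

# Each indented line is one question of the form "feature <= threshold"
rules = export_text(dt, feature_names=feature_names)
print(rules)
```

Each branch ends in a "class:" line, which is the label assigned to every instance that reaches that leaf.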

Classification-tree in scikit-learn

# Import DecisionTreeClassifier
from sklearn.tree import DecisionTreeClassifier
# Import train_test_split
from sklearn.model_selection import train_test_split
# Import accuracy_score
from sklearn.metrics import accuracy_score

# Split the dataset into 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=1)

# Instantiate dt
dt = DecisionTreeClassifier(max_depth=2, random_state=1)

Classification-tree in scikit-learn

# Fit dt to the training set
dt.fit(X_train, y_train)

# Predict the test set labels
y_pred = dt.predict(X_test)

# Evaluate the test-set accuracy
accuracy_score(y_test, y_pred)
0.90350877192982459

Decision Regions

Decision region: region in the feature space where all instances are assigned to one class label.

Decision boundary: surface separating different decision regions.

DR
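
Decision regions can be approximated numerically by predicting a label for every point on a grid over the feature space. The sketch below is one way to do this, assuming the same two illustrative features from scikit-learn's bundled dataset. The grid resolution of 200 and the names grid, regions, and boundary_cells are arbitrary choices for this example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
names = list(data.feature_names)

# Assumption: two illustrative features from scikit-learn's bundled dataset
cols = [names.index("mean radius"), names.index("mean concave points")]
X, y = data.data[:, cols], data.target

dt = DecisionTreeClassifier(max_depth=2, random_state=1)
dt.fit(X, y)

# Grid covering the observed range of both features
x0 = np.linspace(X[:, 0].min(), X[:, 0].max(), 200)
x1 = np.linspace(X[:, 1].min(), X[:, 1].max(), 200)
xx0, xx1 = np.meshgrid(x0, x1)
grid = np.c_[xx0.ravel(), xx1.ravel()]

# One predicted label per grid point; areas sharing a label are decision regions
regions = dt.predict(grid).reshape(xx0.shape)

# A label change between neighbouring grid cells marks the decision boundary
boundary_cells = int((np.diff(regions, axis=0) != 0).sum()
                     + (np.diff(regions, axis=1) != 0).sum())
print(regions.shape, boundary_cells)
```

If matplotlib is available, passing xx0, xx1, and regions to plt.contourf draws the regions as filled areas.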

Decision Regions: CART vs. Linear Model

LRvsDT
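
The difference between the two kinds of boundary can also be read from the fitted models. The sketch below is an illustration under assumptions: it uses logistic regression as the linear model and the same two illustrative features from scikit-learn's bundled dataset. The linear model yields a single straight line, while the tree yields a handful of single-feature thresholds, which is why its regions are axis-aligned rectangles.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
names = list(data.feature_names)

# Assumption: two illustrative features from scikit-learn's bundled dataset
cols = [names.index("mean radius"), names.index("mean concave points")]
X, y = data.data[:, cols], data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)

dt = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X_train, y_train)
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Linear model: one straight boundary, w[0]*x0 + w[1]*x1 + b = 0
w, b = logreg.coef_[0], logreg.intercept_[0]
print("logreg weights and intercept:", w, b)

# Tree: every internal node compares one feature with one threshold
internal = dt.tree_.feature >= 0   # leaf nodes carry a negative feature index
print("tree split features:", dt.tree_.feature[internal])
print("tree thresholds:", dt.tree_.threshold[internal])

# Test-set accuracy of both models
dt_acc = dt.score(X_test, y_test)
logreg_acc = logreg.score(X_test, y_test)
print(dt_acc, logreg_acc)
```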

Let's practice!
