Evaluating Model Performance

Marketing Analytics: Predicting Customer Churn in Python

Mark Peterson

Director of Data Science, Infoblox

Accuracy

  • One possible metric: Accuracy
    • Total Number of Correct Predictions / Total Number of Data Points
  • What data to use?
    • Accuracy measured on the training data is not representative of performance on new data
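
The accuracy formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: the helper name `accuracy` and the two label lists are hypothetical, with 1 standing for a customer who churned and 0 for one who stayed.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    # Count the positions where the prediction equals the true label
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels (made up for illustration): 1 = churned, 0 = stayed
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]

print(accuracy(y_true, y_pred))  # 8 correct out of 10 -> 0.8
```

Which data the labels come from matters as much as the formula: the same function gives an optimistic number on data the model has already seen.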

Training and Test Sets

  • Fit your classifier to the training set
  • Make predictions using the test set

Training and Test Sets using scikit-learn

from sklearn.model_selection import train_test_split

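# Hold out 20% of the data as a test set; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    telco['data'], telco['target'], test_size=0.2, random_state=42)

from sklearn.svm import SVC

svc = SVC()
svc.fit(X_train, y_train)  # fit the classifier to the training set only
svc.predict(X_test)        # predict labels for the unseen test set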

Computing Accuracy

svc.score(X_test, y_test)
0.857
  • 85.7% accuracy: Quite good for a first try!
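
For a scikit-learn classifier, `score` returns the mean accuracy on the data it is given, so it should agree with the fraction of correct predictions computed by hand and with `accuracy_score`. The sketch below checks this on a synthetic dataset from `make_classification`, which is an assumption standing in for the `telco` data; the accuracy it prints will not match the 0.857 shown above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumption: synthetic data used as a stand-in for the telco dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

svc = SVC()
svc.fit(X_train, y_train)

# Three ways to get the same number
score_from_method = svc.score(X_test, y_test)
y_pred = svc.predict(X_test)
score_by_hand = (y_pred == y_test).mean()
score_from_metrics = accuracy_score(y_test, y_pred)

print(score_from_method, score_by_hand, score_from_metrics)
```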

Improving your model

  • Overfitting: Model fits the training data too closely
  • Underfitting: Model does not capture trends in the training data
  • Need to find the right balance between overfitting and underfitting
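
One way to see the two failure modes is to compare training and test accuracy as model complexity grows. The sketch below is illustrative only and makes two assumptions: it swaps in a `DecisionTreeClassifier` instead of the SVC used earlier, because `max_depth` gives a simple complexity knob, and it uses noisy synthetic data from `make_classification` instead of the `telco` data. A very shallow tree tends to score poorly on both sets (underfitting), while an unrestricted tree memorizes the training set and typically scores noticeably lower on the test set (overfitting).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data with 10% label noise, standing in for the telco data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

results = {}
for depth in (1, 4, None):  # None lets the tree grow until it fits the training set
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    train_acc = tree.score(X_train, y_train)
    test_acc = tree.score(X_test, y_test)
    results[depth] = (train_acc, test_acc)
    print(f"max_depth={depth}: train={train_acc:.3f}, test={test_acc:.3f}")
```

A large gap between the training and test columns points toward overfitting; low values in both point toward underfitting.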

Let's practice!
