Predict churn with decision trees

Machine Learning for Marketing in Python

Karolis Urbonas

Head of Analytics & Science, Amazon

Introduction to decision trees

Decision tree rules on Titanic survival data
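To make the idea of tree rules concrete, here is a minimal sketch that fits a shallow tree and prints its learned if/else rules as text. The eight passengers and the column names `is_female` and `age` are hypothetical, made up for illustration; they are not the real Titanic records.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical passengers (illustration only, not the real Titanic data).
# Columns: is_female (1 = yes, 0 = no), age in years.
X = [[1, 29], [0, 35], [1, 8], [0, 4], [0, 50], [1, 62], [0, 22], [1, 40]]
y = [1, 0, 1, 1, 0, 1, 0, 1]  # 1 = survived, 0 = did not survive

# A shallow tree keeps the rule set short enough to read.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# export_text renders the fitted tree as nested threshold rules.
rules = export_text(tree, feature_names=['is_female', 'age'])
print(rules)
```

Each indented line in the printout is one threshold test, and each leaf line names the predicted class.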

Modeling steps

  1. Split the data into training and test sets
  2. Initialize the model
  3. Fit the model on the training data
  4. Predict on the test data
  5. Measure performance on the test data
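The five steps above can be sketched end to end as follows. This is a minimal illustration, not the course's own data pipeline: the data set is synthetic (generated with `make_classification` as a stand-in for a churn table), and the 75/25 split and `random_state` values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a churn table (assumption: roughly 25% positives).
X, Y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.75, 0.25], random_state=42)

# 1. Split the data into training and test sets
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=42)

# 2. Initialize the model
mytree = DecisionTreeClassifier(random_state=42)

# 3. Fit the model on the training data
mytree.fit(train_X, train_Y)

# 4. Predict on the test data
pred_test_Y = mytree.predict(test_X)

# 5. Measure performance on the test data
test_accuracy = accuracy_score(test_Y, pred_test_Y)
print('Test accuracy:', round(test_accuracy, 4))
```

The test set is held back until step 4 so that the score in step 5 reflects data the tree has not seen.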

Fitting the model

Import the decision tree module

from sklearn.tree import DecisionTreeClassifier

Initialize the decision tree model

mytree = DecisionTreeClassifier()

Fit the model on the training data

treemodel = mytree.fit(train_X, train_Y)

Measuring model accuracy

from sklearn.metrics import accuracy_score

pred_train_Y = mytree.predict(train_X)
pred_test_Y = mytree.predict(test_X)
train_accuracy = accuracy_score(train_Y, pred_train_Y)
test_accuracy = accuracy_score(test_Y, pred_test_Y)
print('Training accuracy:', round(train_accuracy, 4))
print('Test accuracy:', round(test_accuracy, 4))
Training accuracy: 0.9973
Test accuracy: 0.7196
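Accuracy is the share of predictions that match the true labels. The sketch below computes it by hand on hypothetical labels (made up for illustration) and then measures the gap between the training and test accuracy reported above. A gap this large suggests that the unconstrained tree fits the training data far more closely than it generalizes.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)

# Hypothetical labels, for illustration only (1 = churned, 0 = stayed).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
example_accuracy = accuracy(y_true, y_pred)
print(example_accuracy)  # 6 of the 8 predictions match

# Gap between the training and test accuracy reported above.
gap = round(0.9973 - 0.7196, 4)
print('Train/test accuracy gap:', gap)
```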

Measuring precision and recall

from sklearn.metrics import precision_score, recall_score

train_precision = round(precision_score(train_Y, pred_train_Y), 4)
test_precision = round(precision_score(test_Y, pred_test_Y), 4)
train_recall = round(recall_score(train_Y, pred_train_Y), 4)
test_recall = round(recall_score(test_Y, pred_test_Y), 4)
print('Training precision: {}, Training recall: {}'.format(train_precision, train_recall))
print('Test precision: {}, Test recall: {}'.format(test_precision, test_recall))
Training precision: 0.9993, Training recall: 0.9906
Test precision: 0.9906, Test recall: 0.4878
The 0.9906 printed as test precision above is the training recall: the original print call passed train_recall where test_precision belongs, so the true test precision does not appear in this output.
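Precision and recall both focus on the positive class (here, churned customers). Precision is the share of predicted positives that are truly positive; recall is the share of true positives that the model found. The sketch below computes both by hand on hypothetical labels; the helper `precision_recall` and its choice to return 0.0 when a denominator is zero are assumptions made for this illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (assumption: report 0.0 in that case).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical labels, for illustration only (1 = churned, 0 = stayed).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 1]

precision, recall = precision_recall(y_true, y_pred)
print('Precision:', precision)  # 3 true positives out of 4 predicted positives
print('Recall:', recall)        # 3 true positives out of 5 actual positives
```

A low test recall, as in the output above, means that many actual churners are missed even when overall accuracy looks acceptable.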

Tuning tree depth

import numpy as np
import pandas as pd

# Candidate depths and a results table: one row per depth,
# columns = max depth, accuracy, precision, recall
depth_list = list(range(2,15))
depth_tuning = np.zeros((len(depth_list), 4))
depth_tuning[:,0] = depth_list

for index in range(len(depth_list)):
    # Fit a tree capped at the current depth and score it on the test data
    mytree = DecisionTreeClassifier(max_depth=depth_list[index])
    mytree.fit(train_X, train_Y)
    pred_test_Y = mytree.predict(test_X)
    depth_tuning[index,1] = accuracy_score(test_Y, pred_test_Y)
    depth_tuning[index,2] = precision_score(test_Y, pred_test_Y)
    depth_tuning[index,3] = recall_score(test_Y, pred_test_Y)
col_names = ['Max_Depth','Accuracy','Precision','Recall']
print(pd.DataFrame(depth_tuning, columns=col_names))

Choosing the optimal depth

Tuning max depth
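One way to pick a depth from a tuning table is to select the row that maximizes the metric that matters most. The sketch below assumes that metric is test recall; which metric to favor is a business decision, and the data here is synthetic rather than the course's churn data. Selecting on the test set also makes the reported score optimistic, so a separate validation set is the safer choice in practice.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a churn table (assumption, not the course data).
X, Y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.75, 0.25], random_state=42)
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=42)

# Score one tree per candidate depth.
rows = []
for depth in range(2, 15):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(train_X, train_Y)
    pred = tree.predict(test_X)
    rows.append({
        'Max_Depth': depth,
        'Accuracy': accuracy_score(test_Y, pred),
        'Precision': precision_score(test_Y, pred),
        'Recall': recall_score(test_Y, pred),
    })
depth_table = pd.DataFrame(rows)

# Pick the depth with the highest recall (assumed selection criterion).
best_row = depth_table.loc[depth_table['Recall'].idxmax()]
best_depth = int(best_row['Max_Depth'])
print(depth_table.round(4))
print('Depth with highest test recall:', best_depth)
```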

Let's build a decision tree!
