Churn prediction with decision trees

Machine Learning for Marketing in Python

Karolis Urbonas

Head of Analytics & Science, Amazon

Introduction to decision trees

Decision tree rules on the Titanic survival dataset
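
A decision tree can be read as a set of nested if/else rules. The sketch below shows that idea only; the split features and thresholds (`sex`, `pclass`, `age < 10`) are hypothetical and are not the rules from the slide's figure, and `predict_survival` is an illustrative name.

```python
def predict_survival(sex: str, age: float, pclass: int) -> int:
    """Return 1 (survived) or 0 (did not survive) using hand-written, hypothetical rules."""
    if sex == "female":
        # Hypothetical rule: women outside third class are predicted to survive.
        return 1 if pclass < 3 else 0
    # Hypothetical rule: young boys are predicted to survive, other males are not.
    return 1 if age < 10 else 0


# Each call follows one path from the root of the tree to a leaf.
print(predict_survival("female", 30, 1))
print(predict_survival("male", 40, 3))
```

A fitted tree learns splits of this kind from the data instead of having them written by hand.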

Modeling steps

  1. Split the data into train and test sets
  2. Initialize the model
  3. Fit the model on the training data
  4. Predict values on the test data
  5. Measure model performance on the test data
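
The five steps can be sketched end to end as follows. This is a minimal illustration, assuming scikit-learn is installed; the synthetic data from `make_classification`, the 25% test size, and the `random_state` values are assumptions that stand in for the churn dataset, which is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a churn dataset (assumption: the real data is not available here).
X, Y = make_classification(n_samples=500, n_features=8, random_state=0)

# 1. Split the data into train and test sets.
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# 2. Initialize the model.
mytree = DecisionTreeClassifier(random_state=0)

# 3. Fit the model on the training data.
mytree.fit(train_X, train_Y)

# 4. Predict values on the test data.
pred_test_Y = mytree.predict(test_X)

# 5. Measure model performance on the test data.
test_accuracy = accuracy_score(test_Y, pred_test_Y)
print('Test accuracy:', round(test_accuracy, 4))
```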

Training the model

Import the decision tree module

from sklearn.tree import DecisionTreeClassifier

Initialize the decision tree model

mytree = DecisionTreeClassifier()

Fit the model on the training data

treemodel = mytree.fit(train_X, train_Y)

Measuring model accuracy

from sklearn.metrics import accuracy_score

pred_train_Y = mytree.predict(train_X)
pred_test_Y = mytree.predict(test_X)
train_accuracy = accuracy_score(train_Y, pred_train_Y)
test_accuracy = accuracy_score(test_Y, pred_test_Y)
print('Training accuracy:', round(train_accuracy, 4))
print('Test accuracy:', round(test_accuracy, 4))

Training accuracy: 0.9973
Test accuracy: 0.7196

The gap between training and test accuracy suggests that the unconstrained tree overfits the training data.

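Measuring precision and recall

from sklearn.metrics import precision_score, recall_score

train_precision = round(precision_score(train_Y, pred_train_Y), 4)
test_precision = round(precision_score(test_Y, pred_test_Y), 4)
train_recall = round(recall_score(train_Y, pred_train_Y), 4)
test_recall = round(recall_score(test_Y, pred_test_Y), 4)
print('Training precision: {}, Training recall: {}'.format(train_precision, train_recall))
# Pass test_precision here; passing train_recall by mistake would print the training recall as test precision
print('Test precision: {}, Test recall: {}'.format(test_precision, test_recall))

Training precision: 0.9993, Training recall: 0.9906
Test precision: 0.9906, Test recall: 0.4878

Note: the 0.9906 reported as test precision is identical to the training recall, which suggests this output came from a print call that passed train_recall in place of test_precision. The actual test precision is not shown.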
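
To make the three metrics concrete, here is a small sketch that computes them from raw counts instead of calling scikit-learn. The helper name `classification_metrics` and the ten labels are made up for illustration; label 1 is assumed to mean "churned".

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-churners wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-churners correctly passed
    accuracy = (tp + tn) / len(pairs)
    # Precision: of everything flagged as churn, how much really churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that really churned, how much was flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up labels: 4 churners, 6 non-churners.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

In this toy case accuracy looks reasonable while recall shows that half of the churners were missed, which is why accuracy alone can be misleading for churn.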

Tuning the tree depth parameter

import numpy as np
import pandas as pd

# Candidate depths, and a table with one row per depth: depth, accuracy, precision, recall
depth_list = list(range(2, 15))
depth_tuning = np.zeros((len(depth_list), 4))
depth_tuning[:, 0] = depth_list

for index in range(len(depth_list)):
    # Fit a tree limited to the current depth and score it on the test data
    mytree = DecisionTreeClassifier(max_depth=depth_list[index])
    mytree.fit(train_X, train_Y)
    pred_test_Y = mytree.predict(test_X)
    depth_tuning[index, 1] = accuracy_score(test_Y, pred_test_Y)
    depth_tuning[index, 2] = precision_score(test_Y, pred_test_Y)
    depth_tuning[index, 3] = recall_score(test_Y, pred_test_Y)

col_names = ['Max_Depth', 'Accuracy', 'Precision', 'Recall']
print(pd.DataFrame(depth_tuning, columns=col_names))

Choosing the optimal depth

Figure: Tuning Max Depth
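
One way to read a depth-tuning table programmatically is to pick the row that maximizes a chosen metric. The sketch below assumes rows laid out as (max depth, accuracy, precision, recall); the numbers are invented for illustration and are not results from this course, and `best_depth` is a hypothetical helper. Which metric to optimize is a judgment call that depends on the business goal.

```python
# Column positions in each row: (max_depth, accuracy, precision, recall).
ACCURACY, PRECISION, RECALL = 1, 2, 3


def best_depth(rows, metric_index):
    """Return the max depth of the row with the highest value in the chosen metric column."""
    best_row = max(rows, key=lambda row: row[metric_index])
    return int(best_row[0])


# Invented tuning results, for illustration only.
tuning_rows = [
    (2, 0.76, 0.60, 0.35),
    (3, 0.78, 0.62, 0.42),
    (4, 0.79, 0.63, 0.50),
    (5, 0.78, 0.60, 0.52),
    (6, 0.77, 0.57, 0.51),
]

depth_by_recall = best_depth(tuning_rows, RECALL)
depth_by_accuracy = best_depth(tuning_rows, ACCURACY)
print('Best depth by recall:', depth_by_recall)
print('Best depth by accuracy:', depth_by_accuracy)
```

Because the helper only iterates over rows and indexes into them, it should also work on the rows of a 2-D NumPy array with this column layout.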


Let's build a decision tree!
