Machine Learning for Marketing in Python
Karolis Urbonas
Head of Analytics & Science, Amazon

Import the Logistic Regression classifier
from sklearn.linear_model import LogisticRegression
Initialize a Logistic Regression instance
logreg = LogisticRegression()
Fit the model on the training data
logreg.fit(train_X, train_Y)
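The slides use train_X, train_Y, test_X and test_Y without showing where they come from. A minimal sketch of one way to produce them, assuming a synthetic dataset from make_classification as a stand-in (the feature count, sample size and 25% test share are illustrative choices, not the course data):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a marketing dataset (hypothetical, not the course data)
X, Y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Hold out 25% of the rows so the model can be scored on unseen data
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=42
)

# Same three steps as on the slide: import, initialize, fit
logreg = LogisticRegression()
logreg.fit(train_X, train_Y)
```

After fitting, logreg.coef_ holds one coefficient per feature, which is what the regularization slides later inspect.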
Key metrics:
from sklearn.metrics import accuracy_score

# Predict labels for both the training and the test set
pred_train_Y = logreg.predict(train_X)
pred_test_Y = logreg.predict(test_X)

# Accuracy: share of predictions that match the true labels
train_accuracy = accuracy_score(train_Y, pred_train_Y)
test_accuracy = accuracy_score(test_Y, pred_test_Y)

print('Training accuracy:', round(train_accuracy, 4))
print('Test accuracy:', round(test_accuracy, 4))
Training accuracy: 0.8108
Test accuracy: 0.8009
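Accuracy is the fraction of predictions that match the true label. A small sketch on made-up labels (the two arrays below are invented for illustration) showing that the manual calculation agrees with accuracy_score:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical true and predicted labels, 8 observations
true_Y = np.array([1, 0, 0, 1, 0, 1, 0, 0])
pred_Y = np.array([1, 0, 1, 1, 0, 0, 0, 0])

# Manual accuracy: mean of the element-wise "prediction is correct" flags
manual_accuracy = (true_Y == pred_Y).mean()

# Same number from scikit-learn
sk_accuracy = accuracy_score(true_Y, pred_Y)

print(manual_accuracy, sk_accuracy)  # 6 of 8 correct -> 0.75 0.75
```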
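from sklearn.metrics import precision_score, recall_score

# Precision: share of predicted positives that are truly positive
train_precision = round(precision_score(train_Y, pred_train_Y), 4)
test_precision = round(precision_score(test_Y, pred_test_Y), 4)

# Recall: share of true positives that the model finds
train_recall = round(recall_score(train_Y, pred_train_Y), 4)
test_recall = round(recall_score(test_Y, pred_test_Y), 4)

print('Training precision: {}, Training recall: {}'.format(train_precision, train_recall))
print('Test precision: {}, Test recall: {}'.format(test_precision, test_recall))
Training precision: 0.6725, Training recall: 0.5736
Test precision: 0.5736, Test recall: 0.4835
Note: the 0.5736 shown as test precision is the training recall; the slide's print call passed train_recall where test_precision was intended. The code above passes test_precision.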
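Precision and recall both come from the confusion matrix: precision = TP / (TP + FP) and recall = TP / (TP + FN). A short sketch on made-up labels (arrays invented for illustration) that computes both by hand and checks them against scikit-learn:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical true and predicted labels, 8 observations
true_Y = np.array([1, 0, 0, 1, 0, 1, 0, 0])
pred_Y = np.array([1, 0, 1, 1, 0, 0, 0, 1])

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(true_Y, pred_Y).ravel()

manual_precision = tp / (tp + fp)  # 2 / (2 + 2)
manual_recall = tp / (tp + fn)     # 2 / (2 + 1)

sk_precision = precision_score(true_Y, pred_Y)
sk_recall = recall_score(true_Y, pred_Y)
```

Here precision is 0.5 and recall is 2/3: the model flags four observations as positive, two of them correctly, and misses one of the three true positives.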
LogisticRegression from sklearn applies L2 regularization by default
from sklearn.linear_model import LogisticRegression
# Switch to L1; the liblinear solver supports the L1 penalty
logreg = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
logreg.fit(train_X, train_Y)
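L1 regularization can shrink some coefficients to exactly zero, which acts as feature selection, while L2 only shrinks them toward zero. A sketch comparing the two on synthetic data (the dataset and its parameters are assumptions for illustration; the exact counts depend on the data):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data: 20 features, only 3 of them informative (hypothetical setup)
X, Y = make_classification(
    n_samples=1000, n_features=20, n_informative=3,
    n_redundant=0, random_state=42
)
# Scale features so the penalty treats all coefficients comparably
X = StandardScaler().fit_transform(X)

# Default model: L2 penalty
l2_model = LogisticRegression().fit(X, Y)

# L1 penalty with the same settings as on the slide
l1_model = LogisticRegression(penalty='l1', C=0.1, solver='liblinear').fit(X, Y)

l2_nonzero = np.count_nonzero(l2_model.coef_)
l1_nonzero = np.count_nonzero(l1_model.coef_)
print('Non-zero coefficients, L2:', l2_nonzero)
print('Non-zero coefficients, L1:', l1_nonzero)
```

A lower C means stronger regularization, so with L1 a smaller C typically leaves fewer non-zero coefficients.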
C must be tuned to find the optimal value
import numpy as np
import pandas as pd

# Candidate values for C (smaller C = stronger regularization)
C = [1, .5, .25, .1, .05, .025, .01, .005, .0025]
# One row per C value: C, non-zero coefficients, accuracy, precision, recall
l1_metrics = np.zeros((len(C), 5))
l1_metrics[:,0] = C

for index in range(0, len(C)):
    logreg = LogisticRegression(penalty='l1', C=C[index], solver='liblinear')
    logreg.fit(train_X, train_Y)
    pred_test_Y = logreg.predict(test_X)
    l1_metrics[index,1] = np.count_nonzero(logreg.coef_)
    l1_metrics[index,2] = accuracy_score(test_Y, pred_test_Y)
    l1_metrics[index,3] = precision_score(test_Y, pred_test_Y)
    l1_metrics[index,4] = recall_score(test_Y, pred_test_Y)

col_names = ['C','Non-Zero Coeffs','Accuracy','Precision','Recall']
print(pd.DataFrame(l1_metrics, columns=col_names))
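The loop above scores each C on the test set. A different technique, not the one used on the slide, is to pick C by cross-validation on the training data only, so the test set stays untouched for the final check. A minimal sketch with GridSearchCV, assuming synthetic data and recall as the selection metric (both are illustrative choices):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the course data (hypothetical)
X, Y = make_classification(n_samples=1000, n_features=20, random_state=42)
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=42
)

# Same candidate values for C as in the manual loop
C = [1, .5, .25, .1, .05, .025, .01, .005, .0025]

# 5-fold cross-validation over C, selecting the value with the best mean recall
search = GridSearchCV(
    LogisticRegression(penalty='l1', solver='liblinear'),
    param_grid={'C': C},
    scoring='recall',
    cv=5,
)
search.fit(train_X, train_Y)

best_C = search.best_params_['C']
print('Best C by cross-validated recall:', best_C)
```

The choice of scoring metric matters: optimizing recall alone favors weaker regularization, so in practice the table of non-zero coefficients is still worth reading alongside the selected C.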

