Model evaluation

Predicting CTR with Machine Learning in Python

Kevin Huo

Instructor

Precision and recall

  • Precision: ROI on ad spend through clicks

    • Low precision means very little tangible ROI on clicks
  • Recall: targeting relevant audience

    • Low recall means missed opportunities for ROI
  • It may be sensible to weight the two differently

    • Companies are likely to care more about avoiding low precision than about avoiding low recall
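As a quick illustration of the two metrics, here is a minimal sketch using scikit-learn's `precision_score` and `recall_score` on hypothetical labels (1 = clicked, 0 = not clicked):

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = clicked, 0 = not clicked
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

# TP = 3, FP = 1, FN = 1
precision = precision_score(y_true, y_pred)  # 3 / (3 + 1) = 0.75
recall = recall_score(y_true, y_pred)        # 3 / (3 + 1) = 0.75
print(precision, recall)
```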

F-beta score

$$F_\beta = (1+\beta^2)\cdot\frac{\text{precision}\cdot\text{recall}}{(\beta^2 \cdot \text{precision}) + \text{recall}}$$

  • Beta coefficient: represents relative weighting of two metrics

    • Beta between 0 and 1 weights precision more heavily, whereas beta > 1 weights recall more heavily (beta = 1 recovers the standard F1 score)
  • Implementation available in sklearn via: fbeta_score(y_true, y_pred, beta)

    • y_true is true targets and y_pred the predicted targets
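The effect of beta can be sketched with `fbeta_score` on hypothetical labels chosen so that precision and recall differ:

```python
from sklearn.metrics import fbeta_score

# Hypothetical labels: TP = 2, FP = 1, FN = 2
# -> precision = 2/3, recall = 1/2
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

# beta < 1 weights precision more; beta > 1 weights recall more
f_half = fbeta_score(y_true, y_pred, beta=0.5)  # 0.625
f_two = fbeta_score(y_true, y_pred, beta=2.0)   # ~0.526
print(f_half, f_two)
```

Because precision exceeds recall here, the precision-leaning F0.5 comes out higher than the recall-leaning F2.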

AUC of ROC curve versus precision

roc_auc = roc_auc_score(y_test, y_score[:, 1])

fpr = 1 - tn / (tn + fp)
precision = tp / (tp + fp)
  • Imbalanced dataset: fpr can be low when precision is also low.
  • Suppose we have 100 TN, 10 TP, and 10 FP.
fpr = 1 - 100 / (100 + 10) = 0.091
precision = 10 / (10 + 10) = 0.5
  • A low FPR can lead to a high AUC of the ROC curve despite precision being low! It is therefore important to look at both metrics, along with the F-beta score.
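The arithmetic above can be reproduced directly from the confusion counts on the slide:

```python
# Confusion counts from the slide: 100 TN, 10 TP, 10 FP
tn, tp, fp = 100, 10, 10

fpr = fp / (fp + tn)           # same as 1 - tn / (tn + fp), ~0.091
precision = tp / (tp + fp)     # 0.5

print(round(fpr, 3), precision)
```

Even though only half of the predicted positives are true clicks, the FPR stays small because the negatives dominate the dataset.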

ROI on ad spend

  • Same idea as before: each targeted impression incurs some cost, and each true positive (click) yields a return r
total_return = tp * r
total_spent = (tp + fp) * cost
roi = total_return / total_spent 
    = (tp) / (tp + fp) * (r / cost) 
    = precision * (r / cost)
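The factorization above can be checked numerically; the values for `r` and `cost` below are hypothetical:

```python
# Hypothetical counts and economics: return r per click, cost per targeted impression
tp, fp = 10, 10
r, cost = 2.0, 0.5

total_return = tp * r
total_spent = (tp + fp) * cost
roi = total_return / total_spent

# roi factors into precision * (r / cost)
precision = tp / (tp + fp)
print(roi, precision * (r / cost))
```

Since ROI scales linearly with precision for fixed `r` and `cost`, improving precision directly improves the return on ad spend.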

Let's practice!

