Measuring performance

Introduction to Anomaly Detection in R

Alastair Rushworth

Data Scientist

Using a decision threshold

Choose a high value

high_score <- quantile(sat$score, probs = 0.99)
high_score
    99% 
0.6228078

Binarize score

sat$binary_score <- as.numeric(sat$score >= high_score)
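
The two steps above can be tried end to end on simulated data. This is a minimal sketch, not the course's data: `sat_demo` is a hypothetical stand-in for `sat`, filled with uniform random scores, so the threshold value will differ from the one shown on the slide.

```r
# Hypothetical stand-in for the course's `sat` data: 1000 simulated scores
set.seed(1)
sat_demo <- data.frame(score = runif(1000))

# Threshold at the 99th percentile of the scores
high_score <- quantile(sat_demo$score, probs = 0.99)

# 1 if the score is at or above the threshold, 0 otherwise
sat_demo$binary_score <- as.numeric(sat_demo$score >= high_score)

# Count how many points fall on each side of the threshold
table(sat_demo$binary_score)
```

With a 0.99 quantile as the threshold, about 1% of the points end up flagged as anomalous, whatever the scale of the scores.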

Tables of agreement

Comparing true label and binarized score

table(sat$label, sat$binary_score)
       0    1
  0 5729    3
  1   15   56

  • 56 out of 71 anomalies found
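
To make the rows and columns of the agreement table easier to read, the arguments to `table()` can be named. The sketch below rebuilds label and prediction vectors from the four counts shown above (the original `sat` data is not reproduced here, so the vectors `label` and `binary_score` are reconstructions for illustration only).

```r
# Rebuild vectors that reproduce the four counts in the table above
label        <- c(rep(0, 5729 + 3), rep(1, 15 + 56))
binary_score <- c(rep(0, 5729), rep(1, 3), rep(0, 15), rep(1, 56))

# Naming the arguments labels the table dimensions:
# rows are the true label, columns are the binarized score
agreement <- table(label = label, binary_score = binary_score)
agreement

# Anomalies that were found, and the total number of anomalies
found <- agreement["1", "1"]
total <- sum(agreement["1", ])
c(found = found, total = total)
```

Indexing by the character names `"0"` and `"1"` avoids confusing row and column positions with the label values themselves.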

Recall

Anomalies correctly identified $\div$ Total anomalies

  • 1 = Perfect recall; every anomaly detected by algorithm
table(sat$label, sat$binary_score)
       0    1
  0 5729    3
  1   15   56
recall <- 56 / (15 + 56)
recall
[1] 0.7887324
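
Rather than typing the counts by hand, recall can be read off the table itself. A minimal sketch, assuming the table is stored in a matrix named `agreement` (a name introduced here for illustration) with true labels in rows and binarized scores in columns:

```r
# Agreement table from the slide: rows = true label, columns = binarized score
agreement <- matrix(
  c(5729, 15, 3, 56),          # filled column by column
  nrow = 2,
  dimnames = list(label = c("0", "1"), binary_score = c("0", "1"))
)

# Recall = anomalies correctly identified / total anomalies (row "1")
recall <- agreement["1", "1"] / sum(agreement["1", ])
recall
```

The denominator is the sum of the row for true anomalies, so missed anomalies (the 15 in the lower-left cell) pull recall down.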

Precision

Anomalies correctly identified $\div$ Total scored as anomalous

  • 1 = Perfect precision; no normal instances incorrectly labeled
table(sat$label, sat$binary_score)
       0    1
  0 5729    3
  1   15   56
precision <- 56 / (56 + 3)
precision
[1] 0.9491525
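
Precision can likewise be computed from the table instead of hard-coded numbers. A minimal sketch, again assuming a matrix named `agreement` (an illustrative name) with true labels in rows and binarized scores in columns:

```r
# Agreement table from the slide: rows = true label, columns = binarized score
agreement <- matrix(
  c(5729, 15, 3, 56),          # filled column by column
  nrow = 2,
  dimnames = list(label = c("0", "1"), binary_score = c("0", "1"))
)

# Precision = anomalies correctly identified / total scored as anomalous (column "1")
precision <- agreement["1", "1"] / sum(agreement[, "1"])
precision
```

Here the denominator is the sum of the column for points scored as anomalous, so normal points flagged by mistake (the 3 in the upper-right cell) pull precision down.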

Let's practice!
