Class probabilities and predictions

Machine Learning with caret in R

Zach Mayer

Data Scientist at DataRobot and co-author of caret

Different thresholds

  • Not limited to 50% threshold
    • 10% would catch more mines with less certainty
    • 90% would catch fewer mines with more certainty
  • Balance true positive and false positive rates
  • Choosing a threshold is a cost-benefit analysis
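
To make the trade-off above concrete, here is a minimal sketch that applies 10%, 50%, and 90% thresholds to the same predictions and reports the true positive and false positive rates at each. The vectors `p` and `actual` and the helper `rates_at()` are hypothetical, made up for illustration; they are not the course's sonar data.

```r
# Hypothetical predicted probabilities of class "M" (mine) and true labels;
# these values are invented for illustration only.
p      <- c(0.95, 0.80, 0.65, 0.40, 0.30, 0.15, 0.05, 0.55)
actual <- c("M",  "M",  "R",  "M",  "R",  "R",  "R",  "M")

# Classify at a given threshold, then compute true/false positive rates
rates_at <- function(threshold) {
  predicted <- ifelse(p > threshold, "M", "R")
  c(threshold = threshold,
    flagged   = sum(predicted == "M"),
    tpr       = sum(predicted == "M" & actual == "M") / sum(actual == "M"),
    fpr       = sum(predicted == "M" & actual == "R") / sum(actual == "R"))
}

# One row per threshold: 10%, 50%, 90%
threshold_table <- t(sapply(c(0.10, 0.50, 0.90), rates_at))
print(threshold_table)
```

On this toy data the 10% threshold flags the most cases and catches every mine, but also has the highest false positive rate; the 90% threshold flags the fewest cases and has no false positives, but misses most mines.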

Confusion matrix

# Use a larger cutoff
p_class <- ifelse(p > 0.99, "M", "R")
table(p_class)
p_class
 M  R 
41 42 
# Make simple 2-way frequency table
table(p_class, test[["Class"]])
p_class  M  R
      M 13 28
      R 30 12

Confusion matrix with caret

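# Use caret to produce confusion matrix
# (recent caret versions require both arguments to be factors with the same levels)
confusionMatrix(factor(p_class, levels = levels(test[["Class"]])), test[["Class"]])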
          Reference
Prediction  M  R
         M 13 28
         R 30 12

               Accuracy : 0.3012          
                 95% CI : (0.2053, 0.4118)
    No Information Rate : 0.5181          
    P-Value [Acc > NIR] : 1.0000          

                  Kappa : -0.397          
 Mcnemar's Test P-Value : 0.8955          

            Sensitivity : 0.3023          
            Specificity : 0.3000          
         Pos Pred Value : 0.3171          
         Neg Pred Value : 0.2857
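
The statistics above can be recomputed by hand from the four cells of the confusion matrix. The sketch below rebuilds the matrix from the printed counts and treats "M" as the positive class, which is caret's default when "M" is the first factor level; the variable names are illustrative only.

```r
# Rebuild the confusion matrix shown above
# (rows = prediction, columns = reference; matrix() fills column by column)
cm <- matrix(c(13, 30, 28, 12), nrow = 2,
             dimnames = list(Prediction = c("M", "R"),
                             Reference  = c("M", "R")))

tp <- cm["M", "M"]  # predicted M, actually M
fn <- cm["R", "M"]  # predicted R, actually M
fp <- cm["M", "R"]  # predicted M, actually R
tn <- cm["R", "R"]  # predicted R, actually R

metrics <- c(
  accuracy    = (tp + tn) / sum(cm),           # 25 / 83
  no_info     = max(colSums(cm)) / sum(cm),    # 43 / 83
  sensitivity = tp / (tp + fn),                # 13 / 43
  specificity = tn / (tn + fp),                # 12 / 40
  ppv         = tp / (tp + fp),                # 13 / 41
  npv         = tn / (tn + fn)                 # 12 / 42
)
print(round(metrics, 4))
```

An accuracy of 0.3012 is well below the no-information rate of 0.5181, so at this cutoff the model does worse than always predicting the majority class.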

Let’s practice!
