Confusion matrix

Machine Learning with caret in R

Zach Mayer

Data Scientist at DataRobot and co-author of caret

Confusion matrix

A confusion matrix cross-tabulates predictions against the reference (actual) classes. When prediction and reference are both "yes", the case is a true positive. When both are "no", it is a true negative. When the prediction is "yes" and the reference is "no", it is a false positive. When the prediction is "no" and the reference is "yes", it is a false negative.
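The four cells described above can be illustrated with a minimal base R sketch. The `prediction` and `reference` vectors are made-up toy data, not taken from the course.

```r
# Toy vectors (hypothetical data, for illustration only)
prediction <- c("yes", "yes", "no", "no",  "yes", "no")
reference  <- c("yes", "no",  "no", "yes", "yes", "no")

# Label each (prediction, reference) pair with its confusion-matrix cell
cell <- ifelse(prediction == "yes" & reference == "yes", "true positive",
        ifelse(prediction == "no"  & reference == "no",  "true negative",
        ifelse(prediction == "yes" & reference == "no",  "false positive",
                                                         "false negative")))

# Count how many cases fall in each cell
counts <- table(cell)
print(counts)
```

Each observation lands in exactly one cell, so the four counts always sum to the number of observations.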

Confusion matrix

# Fit a model
model <- glm(Class ~ ., family = binomial(link = "logit"), train)
p <- predict(model, test, type = "response")
summary(p)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0000  0.0000  0.9885  0.5296  1.0000  1.0000

# Turn probabilities into classes and look at their frequencies
p_class <- ifelse(p > 0.50, "M", "R")
table(p_class)
p_class
 M  R 
44 39
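When turning probabilities into classes, it is worth checking which class the probabilities refer to. In R, a binomial `glm()` with a factor response treats the first factor level as "failure" and models the probability of the other level. The sketch below shows this on simulated data; the variables `x` and `y` and the simulated values are assumptions for illustration and are not the course's dataset.

```r
# Simulated two-class data (hypothetical, not the course data)
set.seed(42)
x <- c(rnorm(50, mean = -1), rnorm(50, mean = 1))
y <- factor(rep(c("M", "R"), each = 50))   # levels: "M" (first), "R" (second)

# Logistic regression: with a factor response, glm models P(y == second level)
fit <- glm(y ~ x, family = binomial(link = "logit"))
p <- predict(fit, type = "response")

# Average predicted probability within each true class
mean_p <- tapply(p, y, mean)
print(mean_p)   # the mean is higher for "R", the second level
```

If the predicted probability is higher on average for the second level, then values above the threshold correspond to that level.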

Confusion matrix

Create a 2x2 contingency table
Compare predicted vs. actual classes

# Make simple 2-way frequency table
table(p_class, test[["Class"]])

p_class  M  R
      M 13 31
      R 30  9

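Confusion matrix

# Use caret's helper function to calculate additional statistics
# (recent caret versions require factors with matching levels)
library(caret)
confusionMatrix(factor(p_class, levels = levels(test[["Class"]])), test[["Class"]])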

         Reference
Prediction  M  R
         M 13 31
         R 30  9

               Accuracy : 0.2651          
                 95% CI : (0.1742, 0.3734)
    No Information Rate : 0.5181          
    P-Value [Acc > NIR] : 1               

                  Kappa : -0.4731         
 Mcnemar's Test P-Value : 1               

            Sensitivity : 0.3023          
            Specificity : 0.2250          
         Pos Pred Value : 0.2955          
         Neg Pred Value : 0.2308
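The statistics above can be recomputed by hand from the four cell counts. The base R sketch below is a minimal illustration, assuming "M" is the positive class (caret uses the first factor level by default); the variable names are chosen here for illustration.

```r
# Cell counts copied from the table above (rows = prediction, columns = reference)
cm <- matrix(c(13, 30, 31, 9), nrow = 2,
             dimnames = list(Prediction = c("M", "R"),
                             Reference  = c("M", "R")))

tp <- cm["M", "M"]   # predicted M, actually M
fn <- cm["R", "M"]   # predicted R, actually M
fp <- cm["M", "R"]   # predicted M, actually R
tn <- cm["R", "R"]   # predicted R, actually R

accuracy    <- (tp + tn) / sum(cm)            # share of correct predictions
nir         <- max(colSums(cm)) / sum(cm)     # accuracy of always guessing the majority class
sensitivity <- tp / (tp + fn)                 # true positive rate
specificity <- tn / (tn + fp)                 # true negative rate
ppv         <- tp / (tp + fp)                 # positive predictive value
npv         <- tn / (tn + fn)                 # negative predictive value

print(round(c(accuracy = accuracy, nir = nir, sensitivity = sensitivity,
              specificity = specificity, ppv = ppv, npv = npv), 4))
```

Rounded to four decimals, these values match the caret output: 0.2651, 0.5181, 0.3023, 0.2250, 0.2955, and 0.2308.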

Let's practice!
