Data splitting and confusion matrices

Credit Risk Modeling in R

Lore Dirick

Manager of Data Science Curriculum at Flatiron School

Start analysis


Training and test set
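
As a rough illustration of this step, the sketch below splits a data set into a training set and a test set. The data frame `loan_data`, its columns, the seed, and the two-thirds/one-third ratio are all assumptions made for the example; they are not taken from the slide.

```r
# Hypothetical stand-in for a loan data set; names and values are assumptions.
set.seed(567)
loan_data <- data.frame(
  loan_status = rbinom(30, size = 1, prob = 0.2),
  loan_amnt   = round(runif(30, min = 1000, max = 20000))
)

# Draw two thirds of the row indices at random for the training set.
n_train     <- (2 * nrow(loan_data)) %/% 3
index_train <- sample(seq_len(nrow(loan_data)), size = n_train)

# Rows that were drawn form the training set; the rest form the test set.
training_set <- loan_data[index_train, ]
test_set     <- loan_data[-index_train, ]
```

The model would be fitted on `training_set` only, so that `test_set` remains unseen data for evaluation.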


Cross-validation
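
One possible way to set up cross-validation is sketched below: every row is assigned to one of k folds, and each fold takes a turn as the test set. The number of observations, the number of folds, and the seed are hypothetical values chosen for the example, and no model is actually fitted.

```r
set.seed(567)
n_obs   <- 30  # hypothetical number of observations
n_folds <- 5   # hypothetical number of folds

# Assign every row to exactly one fold, in random order.
fold_id <- sample(rep(seq_len(n_folds), length.out = n_obs))

for (k in seq_len(n_folds)) {
  test_rows  <- which(fold_id == k)   # fold k is held out
  train_rows <- which(fold_id != k)   # the remaining folds are used for fitting
  # A model would be fitted on train_rows and evaluated on test_rows here.
  cat("Fold", k, ":", length(train_rows), "training rows,",
      length(test_rows), "test rows\n")
}
```

Averaging the evaluation measure over the folds gives an estimate that depends less on a single random split.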


Evaluate a model

      test_set$loan_status    model_prediction
            ...                     ...
 [8066,]      1                       1
 [8067,]      0                       0
 [8068,]      0                       0
 [8069,]      0                       0
 [8070,]      0                       0
 [8071,]      0                       1
 [8072,]      1                       0
 [8073,]      1                       1
 [8074,]      0                       0
 [8075,]      0                       0
 [8076,]      0                       0
 [8077,]      1                       1
 [8078,]      0                       0
 [8079,]      0                       1
            ...                     ...

Actual loan status vs. model prediction (rows: actual, columns: predicted)

                   No default (0)   Default (1)
 No default (0)          8               2
 Default (1)             1               3
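
A minimal sketch of how such a matrix could be produced in R with `table()`. The two vectors reproduce the 14 rows listed on the slide (8066 to 8079); the variable names `actual`, `predicted`, and `conf_matrix` are assumptions made for the example.

```r
# The 14 rows shown on the slide, in order (8066 to 8079).
actual    <- c(1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0)
predicted <- c(1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1)

# Cross-tabulate: rows are the actual status, columns are the prediction.
conf_matrix <- table(actual = actual, predicted = predicted)
conf_matrix

# conf_matrix["0", "1"] is 2: two non-defaults were predicted as defaults.
# conf_matrix["1", "0"] is 1: one default was predicted as a non-default.
```

With real data, the first argument would be `test_set$loan_status` and the second the vector of model predictions.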

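Actual loan status vs. model prediction (rows: actual, columns: predicted)

                   No default (0)   Default (1)
 No default (0)          TN              FP
 Default (1)             FN              TP

TN = true negatives, FP = false positives, FN = false negatives, TP = true positives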

Some measures...

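  • Accuracy (share of all cases classified correctly) $$\frac{TP + TN}{TP + TN + FP + FN} = \frac{8+3}{14} = 78.57\%$$

  • Sensitivity (share of actual defaults classified correctly) $$\frac{TP}{TP + FN} = \frac{3}{1+3} = 75\%$$

  • Specificity (share of actual non-defaults classified correctly) $$\frac{TN}{TN + FP} = \frac{8}{8+2} = 80\%$$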
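
The same three measures can be computed from the four cells of the confusion matrix. This is a sketch that hard-codes the counts from the slide; the object and variable names are assumptions made for the example.

```r
# Counts from the slide; matrix() fills column by column.
conf_matrix <- matrix(
  c(8, 1, 2, 3), nrow = 2,
  dimnames = list(actual = c("0", "1"), predicted = c("0", "1"))
)

TN <- conf_matrix["0", "0"]  # actual 0, predicted 0
FP <- conf_matrix["0", "1"]  # actual 0, predicted 1
FN <- conf_matrix["1", "0"]  # actual 1, predicted 0
TP <- conf_matrix["1", "1"]  # actual 1, predicted 1

accuracy    <- (TP + TN) / sum(conf_matrix)
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)

# In percent: accuracy 78.57, sensitivity 75, specificity 80.
round(100 * c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity), 2)
```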


Let's practice!
