Introduction and data structure

Credit Risk Modeling in R

Lore Dirick

Manager of Data Science Curriculum at Flatiron School

What is loan default?

Components of expected loss (EL)

  • Probability of default (PD)
  • Exposure at default (EAD)
  • Loss given default (LGD)

$$\text{EL} = \text{PD} \times \text{EAD} \times \text{LGD}$$
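As a quick worked example of the formula, the sketch below multiplies the three components for a single loan. The input values are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from the course data.

```r
# Hypothetical inputs, chosen only to illustrate the formula
pd  <- 0.10    # probability of default (10%)
ead <- 10000   # exposure at default, in currency units
lgd <- 0.40    # loss given default (40% of the exposure is lost)

# Expected loss is the product of the three components
el <- pd * ead * lgd
print(el)
```

With these assumed inputs the expected loss is 400 currency units: a 10% chance of losing 40% of a 10,000 exposure.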

Information used by banks

  • Application information:
    • Income
    • Marital status
    • ...
  • Behavioral information:
    • Current account balance
    • Payment arrears in account history
    • ...
head(loan_data, 10)
   loan_status loan_amnt int_rate grade emp_length home_ownership annual_inc age
1            0      5000    10.65     B         10           RENT      24000  33
2            0      2400       NA     C         25           RENT      12252  31
3            0     10000    13.49     C         13           RENT      49200  24
4            0      5000       NA     A          3           RENT      36000  39
5            0      3000       NA     E          9           RENT      48000  24
6            0     12000    12.69     B         11            OWN      75000  28
7            1      9000    13.49     C          0           RENT      30000  22
8            0      3000     9.91     B          3           RENT      15000  22
9            1     10000    10.65     B          3           RENT     100000  28
10           0      1000    16.29     D          0           RENT      28000  22
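A first look at the structure of this data might go as follows. The sketch rebuilds only the ten rows printed above, so it runs without the full data set; the name loan_data_head is an assumption made for this example, and reading loan_status = 1 as a default is also an assumption, since the printed output does not say so explicitly.

```r
# Rebuild the ten rows printed above as a small data frame.
# The name loan_data_head is hypothetical, used only in this sketch.
loan_data_head <- data.frame(
  loan_status    = c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0),
  loan_amnt      = c(5000, 2400, 10000, 5000, 3000, 12000, 9000, 3000, 10000, 1000),
  int_rate       = c(10.65, NA, 13.49, NA, NA, 12.69, 13.49, 9.91, 10.65, 16.29),
  grade          = factor(c("B", "C", "C", "A", "E", "B", "C", "B", "B", "D")),
  emp_length     = c(10, 25, 13, 3, 9, 11, 0, 3, 3, 0),
  home_ownership = factor(c("RENT", "RENT", "RENT", "RENT", "RENT",
                            "OWN", "RENT", "RENT", "RENT", "RENT")),
  annual_inc     = c(24000, 12252, 49200, 36000, 48000, 75000, 30000, 15000, 100000, 28000),
  age            = c(33, 31, 24, 39, 24, 28, 22, 22, 28, 22)
)

# Column types and a preview of the values
str(loan_data_head)

# Number of missing values per column
na_counts <- colSums(is.na(loan_data_head))
print(na_counts)

# Share of rows with loan_status equal to 1 (assumed here to mean default)
default_share <- mean(loan_data_head$loan_status)
print(default_share)
```

In these ten rows only int_rate has missing values (rows 2, 4 and 5), and two of the ten loans have loan_status equal to 1.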
library(gmodels)
CrossTable(loan_data$home_ownership)
   Cell Contents
|-------------------------|
|                       N |
|         N / Table Total |
|-------------------------|

Total Observations in Table:  29092 

          |  MORTGAGE |     OTHER |       OWN |      RENT | 
          |-----------|-----------|-----------|-----------|
          |     12002 |        97 |      2301 |     14692 | 
          |     0.413 |     0.003 |     0.079 |     0.505 | 
          |-----------|-----------|-----------|-----------|
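The same one-way summary can be sketched in base R with table() and prop.table(). Because the full loan_data set is not available here, the example starts from the counts printed above; with the real data, the starting point would presumably be table(loan_data$home_ownership).

```r
# Counts copied from the CrossTable output above
home_counts <- c(MORTGAGE = 12002, OTHER = 97, OWN = 2301, RENT = 14692)

# Treat the named vector as a one-way frequency table
home_tab <- as.table(home_counts)

# N / Table Total, rounded to three decimals as in the CrossTable output
home_props <- round(prop.table(home_tab), 3)

print(home_tab)
print(home_props)
```

The rounded proportions match the second row of the CrossTable output: roughly half of the loans belong to renters and about 41% to borrowers with a mortgage.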
CrossTable(loan_data$home_ownership, loan_data$loan_status, prop.r = TRUE,
           prop.c = FALSE, prop.t = FALSE, prop.chisq = FALSE)
                         | loan_data$loan_status 
loan_data$home_ownership |         0 |         1 | Row Total | 
 ------------------------|-----------|-----------|-----------|
                MORTGAGE |     10821 |      1181 |     12002 | 
                         |     0.902 |     0.098 |     0.413 | 
 ------------------------|-----------|-----------|-----------|
                   OTHER |        80 |        17 |        97 | 
                         |     0.825 |     0.175 |     0.003 | 
 ------------------------|-----------|-----------|-----------|
                     OWN |      2049 |       252 |      2301 | 
                         |     0.890 |     0.110 |     0.079 | 
 ------------------------|-----------|-----------|-----------|
                    RENT |     12915 |      1777 |     14692 | 
                         |     0.879 |     0.121 |     0.505 | 
 ------------------------|-----------|-----------|-----------|
            Column Total |     25865 |      3227 |     29092 |
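To make explicit what prop.r = TRUE reports, the row proportions can be recomputed from the counts in the table above. This is a minimal base R sketch rather than a replacement for CrossTable; the object names are hypothetical, and reading loan_status = 1 as a default is an assumption.

```r
# Counts copied from the two-way table above (rows: home ownership, columns: loan status)
status_by_home <- matrix(
  c(10821, 1181,
       80,   17,
     2049,  252,
    12915, 1777),
  ncol = 2, byrow = TRUE,
  dimnames = list(home_ownership = c("MORTGAGE", "OTHER", "OWN", "RENT"),
                  loan_status    = c("0", "1"))
)

# Row proportions: each count divided by its row total (margin = 1)
row_props <- round(prop.table(status_by_home, margin = 1), 3)
print(row_props)

# Overall share of loans with loan_status equal to 1
overall_rate <- sum(status_by_home[, "1"]) / sum(status_by_home)
print(round(overall_rate, 3))
```

Under that assumption, the share of defaults is lowest for MORTGAGE (0.098) and highest for OTHER (0.175), although OTHER contains only 97 loans, so its proportion is the least stable of the four.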

Let's practice!
