Credit Risk Modeling in Python
Michael Crabtree
Data Scientist, Ford Motor Company
A threshold on the prob_default values sets the predicted loan_status of the loan:
preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.4 else 0)
| Loan | prob_default | threshold | loan_status |
|---|---|---|---|
| 1 | 0.25 | 0.4 | 0 |
| 2 | 0.42 | 0.4 | 1 |
| 3 | 0.75 | 0.4 | 1 |
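
As a minimal sketch, the thresholding in the table above can be reproduced on a small, hypothetical preds_df that holds only the three prob_default values shown. The vectorized comparison is an alternative to the apply/lambda form and should give the same labels.

```python
import pandas as pd

# Hypothetical data: the three prob_default values from the table above
preds_df = pd.DataFrame({"prob_default": [0.25, 0.42, 0.75]})

threshold = 0.4

# 1 = predicted default, 0 = predicted non-default
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

print(preds_df)
```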
Calculate the threshold as a quantile of prob_default:
import numpy as np
# Compute the threshold for 85% acceptance rate
threshold = np.quantile(prob_default, 0.85)
0.804
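
To illustrate why the 0.85 quantile corresponds to an 85% acceptance rate, here is a small sketch on synthetic probabilities. The beta-distributed sample and the seed are assumptions for illustration, not the course data, so the resulting threshold will differ from 0.804.

```python
import numpy as np

# Synthetic probabilities of default (assumption: not the course data)
rng = np.random.default_rng(seed=0)
prob_default = rng.beta(a=2, b=5, size=10_000)

# Value below which 85% of the predicted probabilities fall
threshold = np.quantile(prob_default, 0.85)

# Share of loans that would be accepted at this threshold
acceptance_rate = np.mean(prob_default <= threshold)

print(round(threshold, 3), round(acceptance_rate, 3))
```

Loans with a prob_default at or below the threshold are accepted, so the share of accepted loans should come out at roughly 85%.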
| Loan | prob_default | Threshold | Predicted loan_status | Accept or Reject |
|---|---|---|---|---|
| 1 | 0.65 | 0.804 | 0 | Accept |
| 2 | 0.85 | 0.804 | 1 | Reject |
Reassign the loan_status values using the new threshold:
# Apply the new threshold to the probabilities of default
preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.804 else 0)
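
The accept or reject decision follows directly from the predicted loan_status. A minimal sketch, assuming a hypothetical preds_df with the two prob_default values from the table above; the decision column name is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the two prob_default values from the table above
preds_df = pd.DataFrame({"prob_default": [0.65, 0.85]})
threshold = 0.804

# 1 = predicted default, 0 = predicted non-default
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

# Predicted non-defaults (0) are accepted, predicted defaults (1) are rejected
preds_df["decision"] = np.where(preds_df["loan_status"] == 0, "Accept", "Reject")

print(preds_df)
```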
Some accepted loans will still be defaults. These are loans with prob_default values around where our model is not well calibrated.
# Calculate the bad rate
np.sum(accepted_loans['true_loan_status']) / accepted_loans['true_loan_status'].count()
If non-default is 0, and default is 1, then the sum() is the count of defaults.
The .count() of a single column is the same as the row count for the data frame.
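
The bad rate calculation can be sketched end to end on a tiny, made-up test set. The data frame name test_pred_df, the pred_loan_status column, and the values are assumptions for illustration; only true_loan_status and accepted_loans appear in the slides.

```python
import pandas as pd

# Hypothetical test set: predicted probabilities and true outcomes
test_pred_df = pd.DataFrame({
    "prob_default": [0.10, 0.20, 0.30, 0.50, 0.90],
    "true_loan_status": [0, 0, 1, 0, 1],
})
threshold = 0.804

# 1 = predicted default, 0 = predicted non-default
test_pred_df["pred_loan_status"] = (
    test_pred_df["prob_default"] > threshold
).astype(int)

# Accepted loans are those predicted as non-default
accepted_loans = test_pred_df[test_pred_df["pred_loan_status"] == 0]

# Bad rate = accepted defaults / total accepted loans
bad_rate = (
    accepted_loans["true_loan_status"].sum()
    / accepted_loans["true_loan_status"].count()
)

print(bad_rate)
```

Because the labels are 0 and 1, accepted_loans["true_loan_status"].mean() gives the same value.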