Credit Risk Modeling in Python
Michael Crabtree
Data Scientist, Ford Motor Company
A threshold applied to the prob_default values sets the predicted loan_status of the loan:
preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.4 else 0)
Loan | prob_default | threshold | loan_status |
---|---|---|---|
1 | 0.25 | 0.4 | 0 |
2 | 0.42 | 0.4 | 1 |
3 | 0.75 | 0.4 | 1 |
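As a minimal sketch of the rule in the table, the three example probabilities can be placed in a small data frame and thresholded with a vectorized comparison, which gives the same result as the apply call with a lambda. The frame built here is an illustrative stand-in for preds_df, not the course data.

```python
import pandas as pd

# Illustrative stand-in for preds_df: the three loans from the table above
preds_df = pd.DataFrame({"prob_default": [0.25, 0.42, 0.75]}, index=[1, 2, 3])

threshold = 0.4

# 1 (default) when the probability exceeds the threshold, otherwise 0
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

print(preds_df)
```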
Example: accept the 85% of loans with the lowest prob_default.
import numpy as np
# Compute the threshold for 85% acceptance rate
threshold = np.quantile(prob_default, 0.85)
0.804
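To see why the 0.85 quantile corresponds to an 85% acceptance rate, the sketch below uses simulated probabilities and checks what share of loans fall at or below the threshold. The beta-distributed values are an assumption made for illustration, so the resulting threshold will not equal 0.804.

```python
import numpy as np

# Simulated probabilities of default; hypothetical data, not the course model output
rng = np.random.default_rng(seed=0)
prob_default = rng.beta(2, 5, size=1000)

# Threshold below which 85% of the predicted probabilities fall
threshold = np.quantile(prob_default, 0.85)

# Share of loans that would be accepted (probability at or below the threshold)
acceptance_rate = np.mean(prob_default <= threshold)

print(round(float(threshold), 3), float(acceptance_rate))
```

Loans above the threshold are the 15% with the highest predicted probability of default, which are the ones rejected.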
Loan | prob_default | Threshold | Predicted loan_status | Accept or Reject |
---|---|---|---|---|
1 | 0.65 | 0.804 | 0 | Accept |
2 | 0.85 | 0.804 | 1 | Reject |
Reassign loan_status values using the new threshold:
# Predict default (1) when prob_default exceeds the new threshold
preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.804 else 0)
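The accept or reject decision in the table follows directly from the predicted loan_status. A minimal sketch, using only the two example loans and assuming a hypothetical column named decision to hold the outcome:

```python
import numpy as np
import pandas as pd

# The two example loans from the table; "decision" is a hypothetical column name
preds_df = pd.DataFrame({"prob_default": [0.65, 0.85]}, index=[1, 2])
threshold = 0.804

# Predicted default (1) when the probability exceeds the threshold
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

# Loans predicted as non-default are accepted, the rest are rejected
preds_df["decision"] = np.where(preds_df["loan_status"] == 0, "Accept", "Reject")

print(preds_df)
```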
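Some accepted loans will still be defaults, often those with prob_default values around where our model is not well calibrated. The bad rate is the share of accepted loans that are defaults.
# Calculate the bad rate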
np.sum(accepted_loans['true_loan_status']) / accepted_loans['true_loan_status'].count()
If non-default is 0 and default is 1, then the sum() is the count of defaults.
The .count() of a single column is the same as the row count for the data frame, provided the column has no missing values.
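A small worked example of the bad rate, using made-up predictions and outcomes (all values below are hypothetical). Accepted loans are taken to be those with a predicted loan_status of 0.

```python
import numpy as np
import pandas as pd

# Hypothetical predictions and true outcomes for six loans
preds_df = pd.DataFrame({
    "prob_default": [0.10, 0.30, 0.55, 0.70, 0.85, 0.95],
    "true_loan_status": [0, 0, 1, 0, 1, 1],
})
threshold = 0.804
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

# Keep only the loans the threshold would accept
accepted_loans = preds_df[preds_df["loan_status"] == 0]

# Defaults among accepted loans divided by the number of accepted loans
bad_rate = np.sum(accepted_loans["true_loan_status"]) / accepted_loans["true_loan_status"].count()

print(bad_rate)
```

Four loans are accepted here and one of them is a true default, so the bad rate is 1 / 4. With 0/1 labels and no missing values, the mean of the column gives the same number.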