Credit acceptance rates

Credit Risk Modeling in Python

Michael Crabtree

Data Scientist, Ford Motor Company

Thresholds and loan status

Previously we set a threshold for a range of prob_default values
- This was used to change the predicted loan_status of the loan

preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.4 else 0)

Loan	prob_default	threshold	loan_status
1	0.25	0.4	0
2	0.42	0.4	1
3	0.75	0.4	1
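
As a minimal sketch of the thresholding step, the same result can be obtained with a vectorized comparison instead of apply and a lambda. The three-row preds_df below is toy data built to mirror the table, not the course data set.

```python
import pandas as pd

# Toy data mirroring the three loans in the table (assumed, not the real data)
preds_df = pd.DataFrame({"prob_default": [0.25, 0.42, 0.75]})

threshold = 0.4

# Comparison yields True/False per row; astype(int) maps True -> 1 (default), False -> 0
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

print(preds_df)
```

On large data frames the vectorized form is usually faster than a row-wise lambda, and it keeps the threshold in one named variable.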

Use model predictions to set better thresholds
- Can also be used to approve or deny new loans
For all new loans, we want to deny probable defaults
- Use the test data as an example of new loans
Acceptance rate: the percentage of new loans we accept, chosen to keep the number of defaults in the portfolio low
- Accepted loans which are defaults have an impact similar to false negatives

Histogram of distribution of predicted probabilities

import numpy as np
# Compute the threshold for 85% acceptance rate
threshold = np.quantile(preds_df['prob_default'], 0.85)

0.804
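
To see why the 0.85 quantile corresponds to an 85% acceptance rate, the sketch below uses twenty evenly spaced, made-up probabilities (an assumption for illustration only) and checks what share of loans falls at or below the computed threshold.

```python
import numpy as np
import pandas as pd

# Hypothetical predicted probabilities of default for twenty new loans
preds_df = pd.DataFrame({"prob_default": np.linspace(0.05, 1.0, 20)})

accept_rate = 0.85

# The 0.85 quantile is the value that 85% of the predictions fall at or below
threshold = np.quantile(preds_df["prob_default"], accept_rate)

# Share of loans that would be accepted (prob_default not above the threshold)
accepted_share = (preds_df["prob_default"] <= threshold).mean()

print(threshold, accepted_share)
```

With real predictions the accepted share is only approximately equal to the target rate, since ties and small samples can shift it slightly.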

Loan	`prob_default`	Threshold	Predicted `loan_status`	Accept or Reject
1	0.65	0.804	0	Accept
2	0.85	0.804	1	Reject

# Apply the calculated threshold to set the predicted loan_status
preds_df['loan_status'] = preds_df['prob_default'].apply(lambda x: 1 if x > 0.804 else 0)
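
The accept-or-reject decision follows directly from the predicted loan_status. The sketch below reproduces the two loans from the table; the decision column and the acceptance_rate variable are illustrative names, not part of the original code.

```python
import pandas as pd

# The two example loans from the table (toy data)
preds_df = pd.DataFrame({"prob_default": [0.65, 0.85]})
threshold = 0.804

# Predicted default (1) when the probability exceeds the threshold
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

# Hypothetical helper column: predicted non-defaults are accepted, the rest rejected
preds_df["decision"] = preds_df["loan_status"].map({0: "Accept", 1: "Reject"})

# Realized acceptance rate: share of loans predicted as non-default
acceptance_rate = (preds_df["loan_status"] == 0).mean()

print(preds_df)
print(acceptance_rate)
```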

Even with a calculated threshold, some of the accepted loans will be defaults
These are loans whose prob_default values fall in ranges where our model is not well calibrated

Bar of accepted loans with bad rate highlighted

Formula for bad rate: bad rate = number of accepted defaults / total number of accepted loans

# Calculate the bad rate
np.sum(accepted_loans['true_loan_status']) / accepted_loans['true_loan_status'].count()

If non-default is 0 and default is 1, then the sum() is the count of defaults
The .count() of a single column is the same as the row count for the data frame, provided the column has no missing values
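
Putting the pieces together, here is a minimal sketch of the bad rate calculation on a small made-up test set. The five rows and the pred_loan_status column name are assumptions for illustration; true_loan_status and accepted_loans follow the names used above.

```python
import pandas as pd

# Hypothetical test-set predictions with the true outcomes attached
preds_df = pd.DataFrame({
    "prob_default":     [0.10, 0.30, 0.50, 0.70, 0.90],
    "true_loan_status": [0,    0,    1,    0,    1],
})

threshold = 0.804
preds_df["pred_loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

# Accepted loans are those predicted as non-default
accepted_loans = preds_df[preds_df["pred_loan_status"] == 0]

# Bad rate: true defaults among accepted loans divided by all accepted loans
bad_rate = (accepted_loans["true_loan_status"].sum()
            / accepted_loans["true_loan_status"].count())

print(bad_rate)
```

Because the column holds only 0s and 1s, accepted_loans["true_loan_status"].mean() gives the same value in a single call.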
