Credit Risk Modeling in Python
Michael Crabtree
Data Scientist, Ford Motor Company
import numpy as np
import pandas as pd

# Define every acceptance rate to test
accept_rates = [1.0, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55,
                0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1, 0.05]
# Lists to store the thresholds and bad rates
thresholds = []
bad_rates = []
# Assumption: the predictions are all read from test_pred_df, the frame used
# on the later slides (the slide mixed preds_df, preds_gbt and test_pred_df)
for rate in accept_rates:
    # Calculate the threshold: the quantile of predicted PD at this acceptance rate
    thresh = np.quantile(test_pred_df['prob_default'], rate).round(3)
    # Store the threshold in the list
    thresholds.append(thresh)
    # Apply the threshold to reassign loan_status
    test_pred_df['pred_loan_status'] = \
        test_pred_df['prob_default'].apply(lambda x: 1 if x > thresh else 0)
    # Set of accepted loans (predicted non-default)
    accepted_loans = test_pred_df[test_pred_df['pred_loan_status'] == 0]
    # Calculate and store the bad rate: actual defaults among accepted loans
    bad_rates.append((np.sum(accepted_loans['true_loan_status'])
                      / accepted_loans['true_loan_status'].count()).round(3))
# One row per acceptance rate
strat_df = pd.DataFrame(list(zip(accept_rates, thresholds, bad_rates)),
                        columns=['Acceptance Rate', 'Threshold', 'Bad Rate'])
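The loop above can be tried end to end on synthetic data. The following is a minimal sketch, not the course's own code: the helper `build_strategy_table`, the `demo_df` frame, and the extra `Num Accepted Loans` column are hypothetical names introduced for illustration, and the threshold is rounded only for display rather than before filtering.

```python
import numpy as np
import pandas as pd

def build_strategy_table(pred_df, accept_rates):
    """Hypothetical helper: one row per acceptance rate."""
    rows = []
    for rate in accept_rates:
        # Threshold = quantile of predicted PD at this acceptance rate
        thresh = np.quantile(pred_df['prob_default'], rate)
        # Accept loans whose predicted PD does not exceed the threshold
        accepted = pred_df[pred_df['prob_default'] <= thresh]
        # Bad rate = share of accepted loans that actually defaulted
        bad_rate = accepted['true_loan_status'].mean()
        rows.append((rate, round(float(thresh), 3),
                     round(float(bad_rate), 3), len(accepted)))
    return pd.DataFrame(rows, columns=['Acceptance Rate', 'Threshold',
                                       'Bad Rate', 'Num Accepted Loans'])

# Synthetic predictions and outcomes (illustration only, not real loan data)
rng = np.random.default_rng(0)
prob_default = rng.uniform(0, 1, size=1000)
true_loan_status = (rng.uniform(0, 1, size=1000) < prob_default).astype(int)
demo_df = pd.DataFrame({'prob_default': prob_default,
                        'true_loan_status': true_loan_status})

demo_strat = build_strategy_table(demo_df, [1.0, 0.75, 0.5, 0.25])
print(demo_strat)
```

At an acceptance rate of 1.0 every loan is accepted, so the bad rate in the first row equals the overall default rate of the data; lower acceptance rates give lower thresholds.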
len() or .count()
loan_amnt in the test set
loan_amnt
# Probability of default (PD)
test_pred_df['prob_default']
# Exposure at default = loan amount (EAD)
test_pred_df['loan_amnt']
# Loss given default = 1.0 for total loss (LGD)
test_pred_df['loss_given_default']
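These three columns are the inputs to the standard expected loss calculation, where the expected loss of each loan is PD × EAD × LGD and the portfolio total is the sum over loans. A minimal sketch, assuming a small hypothetical frame `demo_loans` with made-up values in place of `test_pred_df`:

```python
import pandas as pd

# Hypothetical three-loan portfolio (illustration only)
demo_loans = pd.DataFrame({
    'prob_default': [0.10, 0.50, 0.02],        # PD
    'loan_amnt': [10000, 4000, 25000],         # EAD
    'loss_given_default': [1.0, 1.0, 1.0],     # LGD, 1.0 = total loss
})

# Expected loss per loan = PD * EAD * LGD
demo_loans['expected_loss'] = (demo_loans['prob_default']
                               * demo_loans['loan_amnt']
                               * demo_loans['loss_given_default'])

# Total expected loss for the portfolio
total_expected_loss = demo_loans['expected_loss'].sum()
print(round(total_expected_loss, 2))
```

With LGD fixed at 1.0, each loan contributes its predicted probability of default times its full loan amount.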