Credit Risk Modeling in Python
Michael Crabtree
Data Scientist, Ford Motor Company
| Missing Data Type | Possible Result |
|---|---|
| NULL in numeric column | Error |
| NULL in string column | Error |

| Missing Data | Interpretation | Action |
|---|---|---|
| NULL in loan_status | Loan recently approved | Remove from prediction data |
| NULL in person_age | Age not recorded or disclosed | Replace with median |
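
The two actions in the table can be sketched on a small example. This is a minimal illustration, not the course's own code: the `toy` DataFrame and its values are made up, and only the column names `loan_status` and `person_age` come from the table.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the table above.
toy = pd.DataFrame({
    "person_age": [22.0, np.nan, 35.0, 41.0],
    "loan_status": [0.0, 1.0, np.nan, 0.0],
})

# NULL in loan_status: there is no outcome to learn from, so remove the row.
labeled = toy.dropna(subset=["loan_status"]).copy()

# NULL in person_age: replace with the median of the ages that were recorded.
age_median = labeled["person_age"].median()
labeled["person_age"] = labeled["person_age"].fillna(age_median)

print(labeled)
```

The median is computed after the unlabeled row is removed, so it reflects only the rows that remain in the data.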
Null values are found with the isnull() function and counted with the sum() function; the .any() method checks all columns.
null_columns = cr_loan.columns[cr_loan.isnull().any()]
cr_loan[null_columns].isnull().sum()
# Total number of null values per column
person_home_ownership 25
person_emp_length 895
loan_intent 25
loan_int_rate 3140
cb_person_default_on_file 15
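
To see what each step of that expression returns, here is a small sketch on made-up data. The `toy` DataFrame and its values are hypothetical stand-ins for `cr_loan`; the chain of `isnull()`, `.any()`, and `sum()` is the same as above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan with a few missing values.
toy = pd.DataFrame({
    "person_emp_length": [1.0, np.nan, 3.0, np.nan],
    "loan_int_rate": [11.5, 9.9, np.nan, 7.2],
    "loan_amnt": [5000, 12000, 8000, 3000],
})

null_mask = toy.isnull()              # True wherever a cell is null
has_null = null_mask.any()            # one True/False per column
null_columns = toy.columns[has_null]  # names of columns with at least one null

# Count the nulls in just those columns.
null_counts = toy[null_columns].isnull().sum()
print(null_counts)
```

Columns without any nulls, such as `loan_amnt` here, are filtered out before counting, which keeps the summary short on wide data sets.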
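Replace missing values using .fillna() with aggregate functions and methods.
# Assign the result back; inplace=True on a column selection may not update cr_loan in newer pandas
cr_loan['loan_int_rate'] = cr_loan['loan_int_rate'].fillna(cr_loan['loan_int_rate'].mean())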
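
The choice of aggregate matters when a column is skewed. The following sketch compares a mean fill with a median fill on made-up interest rates; the `rates` Series and its values are hypothetical, and the result is assigned to new variables rather than modified in place.

```python
import numpy as np
import pandas as pd

# Hypothetical interest rates with one missing value and one large outlier.
rates = pd.Series([5.0, 6.0, 7.0, np.nan, 22.0], name="loan_int_rate")

# Aggregates skip nulls by default, so they describe only observed values.
mean_value = rates.mean()
median_value = rates.median()

# fillna returns a new Series; the original is left unchanged.
filled_mean = rates.fillna(mean_value)
filled_median = rates.fillna(median_value)

print(filled_mean.tolist())
print(filled_median.tolist())
```

Here the outlier pulls the mean well above the median, so the two fills put noticeably different values into the missing slot.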
Rows with missing data can be removed with the .drop() method.
indices = cr_loan[cr_loan['person_emp_length'].isnull()].index
cr_loan.drop(indices, inplace=True)
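
As a rough check of the index-based drop, the sketch below runs it on made-up data and compares it with `dropna(subset=...)`, which is an alternative not shown in the slides. The `toy` DataFrame and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan.
toy = pd.DataFrame({
    "person_emp_length": [2.0, np.nan, 7.0, np.nan],
    "loan_amnt": [5000, 12000, 8000, 3000],
})

# Index labels of the rows where employment length is missing.
indices = toy[toy["person_emp_length"].isnull()].index

# drop returns a new DataFrame without those rows.
dropped = toy.drop(indices)

# Alternative (an assumption, not from the slides): dropna on one column.
via_dropna = toy.dropna(subset=["person_emp_length"])

print(dropped)
```

Both approaches keep the original index labels, so the remaining rows are still labeled 0 and 2 unless the index is reset.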