Credit Risk Modeling in Python
Michael Crabtree
Data Scientist, Ford Motor Company
Missing Data Type | Possible Result |
---|---|
NULL in numeric column | Error |
NULL in string column | Error |
Missing Data | Interpretation | Action |
---|---|---|
NULL in loan_status | Loan recently approved | Remove from prediction data |
NULL in person_age | Age not recorded or disclosed | Replace with median |
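
The two actions in the table can be sketched on a small frame. The data below is hypothetical; only the column names `loan_status` and `person_age` come from the table, and the variable names `loans` and `loans_clean` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the table.
loans = pd.DataFrame({
    "person_age": [22.0, np.nan, 35.0, 41.0],
    "loan_status": [0.0, 1.0, np.nan, 0.0],
})

# NULL in loan_status: remove the row from the prediction data.
loans_clean = loans[loans["loan_status"].notnull()].copy()

# NULL in person_age: replace with the median of the recorded ages.
age_median = loans_clean["person_age"].median()
loans_clean["person_age"] = loans_clean["person_age"].fillna(age_median)

print(loans_clean)
```

Filtering first means the median is computed only over rows that stay in the data; computing it before the filter would be an equally reasonable choice.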
isnull() function
sum() function
.any() method checks all columns
null_columns = cr_loan.columns[cr_loan.isnull().any()]
cr_loan[null_columns].isnull().sum()
# Total number of null values per column
person_home_ownership 25
person_emp_length 895
loan_intent 25
loan_int_rate 3140
cb_person_default_on_file 15
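
A minimal, self-contained sketch of the same null check, run on a toy frame. The values are hypothetical stand-ins for `cr_loan`, and `null_counts` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan; values are made up.
cr_loan = pd.DataFrame({
    "person_age": [22, 25, 31, 40],
    "person_emp_length": [1.0, np.nan, 4.0, np.nan],
    "loan_int_rate": [11.5, np.nan, 9.9, 13.2],
})

# isnull() marks each cell True/False; .any() reduces each column to a
# single flag, so this keeps only the columns with at least one null.
null_columns = cr_loan.columns[cr_loan.isnull().any()]

# sum() counts the True values, i.e. the nulls in each of those columns.
null_counts = cr_loan[null_columns].isnull().sum()
print(null_counts)
```

Columns without any nulls (here `person_age`) are left out of the result, which keeps the summary short on wide data sets.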
.fillna() with aggregate functions and methods
cr_loan['loan_int_rate'] = cr_loan['loan_int_rate'].fillna(cr_loan['loan_int_rate'].mean())
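
As a rough sketch of mean imputation on toy data (the values are hypothetical, and `rate_mean` is an illustrative name), the example below assigns the filled column back to the frame. This avoids relying on `inplace=True` applied to a column selection, which recent pandas versions warn about.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan; values are made up.
cr_loan = pd.DataFrame({"loan_int_rate": [10.0, np.nan, 14.0, np.nan, 12.0]})

# mean() skips nulls by default, so this is the mean of the recorded rates.
rate_mean = cr_loan["loan_int_rate"].mean()

# Fill the nulls and assign the result back to the column.
cr_loan["loan_int_rate"] = cr_loan["loan_int_rate"].fillna(rate_mean)

print(cr_loan["loan_int_rate"].isnull().sum())
```

Filling with the mean leaves the column mean unchanged, though it does shrink the column's variance.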
.drop() method
indices = cr_loan[cr_loan['person_emp_length'].isnull()].index
cr_loan.drop(indices, inplace=True)
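
The row removal can be sketched on toy data as follows. The values and the `loan_amnt` column are hypothetical, and `dropped` / `dropped_alt` are illustrative names. The second approach, `dropna(subset=...)`, is shown as an alternative and is not taken from the slides.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan; values and loan_amnt are made up.
cr_loan = pd.DataFrame({
    "person_emp_length": [1.0, np.nan, 4.0, np.nan, 7.0],
    "loan_amnt": [5000, 12000, 8000, 3000, 20000],
})

# Collect the index labels of rows where person_emp_length is null,
# then drop those rows (returns a new frame; cr_loan is unchanged).
indices = cr_loan[cr_loan["person_emp_length"].isnull()].index
dropped = cr_loan.drop(indices)

# Alternative: dropna restricted to one column removes the same rows.
dropped_alt = cr_loan.dropna(subset=["person_emp_length"])

print(dropped)
```

Both results keep the original index labels, so a `reset_index(drop=True)` may be useful if later steps assume consecutive row numbers.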