Dealing with Missing Data in Python
Suraj Donthi
Deep Learning & Computer Vision Consultant
Note: Deletion is used when the values are missing completely at random (MCAR).
diabetes DataFrame
768 rows × 9 columns
diabetes['Glucose'].mean()
121.687
diabetes['Glucose'].count()
763
diabetes['Glucose'].sum() / diabetes['Glucose'].count()
121.687
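The matching results above follow from pandas skipping missing values by default in mean(), sum() and count(). A minimal sketch of that behavior, using a hypothetical four-row toy column rather than the real diabetes data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (assumption): one of four Glucose values is missing
toy = pd.DataFrame({"Glucose": [100.0, np.nan, 140.0, 120.0]})

n_rows = len(toy)                    # all rows, including the missing one
n_valid = toy["Glucose"].count()     # count() ignores NaN

mean_skipna = toy["Glucose"].mean()  # skipna=True is the default
manual_mean = toy["Glucose"].sum() / n_valid

# With skipna=False, a single NaN makes the result NaN
mean_with_nan = toy["Glucose"].mean(skipna=False)

print(n_rows, n_valid, mean_skipna, manual_mean, mean_with_nan)
```

Here the mean is taken over the three observed values only, which is why dividing sum() by count() reproduces it.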
diabetes['Glucose'].isnull().sum()
5

diabetes.dropna(subset=["Glucose"], how='any', inplace=True)
msno.matrix(diabetes)

diabetes['BMI'].isnull().sum()
11
diabetes.dropna(subset=["BMI"], how='any', inplace=True)
msno.matrix(diabetes)
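The deletion steps above can be sketched on a small example. This is an illustration under assumptions: the toy DataFrame and its values are hypothetical, and dropna is called without inplace=True so the original frame stays available for comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (assumption), not the real diabetes DataFrame
toy = pd.DataFrame({
    "Glucose": [100.0, np.nan, 140.0, 120.0, 95.0],
    "BMI": [30.1, 25.4, np.nan, 28.0, np.nan],
    "Outcome": [1, 0, 1, 0, 0],
})

# Missing values per column before deleting anything
missing_before = toy[["Glucose", "BMI"]].isnull().sum()

# how='any': drop a row if Glucose OR BMI is missing
cleaned = toy.dropna(subset=["Glucose", "BMI"], how="any")

# how='all': drop a row only if Glucose AND BMI are both missing
only_all_missing = toy.dropna(subset=["Glucose", "BMI"], how="all")

print(missing_before)
print(len(toy), len(cleaned), len(only_all_missing))
```

Returning a new DataFrame instead of modifying in place makes it easy to check how many rows a deletion costs before committing to it.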
