Dealing with Missing Data in Python
Suraj Donthi
Deep Learning & Computer Vision Consultant
Note: Deleting missing values is appropriate when the values are missing completely at random (MCAR).
diabetes: DataFrame, 768 rows × 9 columns
diabetes['Glucose'].mean()
121.687
diabetes['Glucose'].count()
763
diabetes['Glucose'].sum() / diabetes['Glucose'].count()
121.687
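The numbers above show that `mean()` divides by the count of observed values, not by the total number of rows. A minimal sketch of the same check follows; `toy` is a hypothetical stand-in for the `diabetes` data, not the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column standing in for diabetes['Glucose']
toy = pd.DataFrame({"Glucose": [100.0, np.nan, 140.0, 120.0, np.nan]})

# mean() skips NaN by default (skipna=True)
mean_skipna = toy["Glucose"].mean()

# count() returns the number of non-null values only
n_observed = toy["Glucose"].count()

# sum() also skips NaN, so this reproduces mean()
mean_manual = toy["Glucose"].sum() / n_observed

# Dividing by the total row count instead would understate the mean
mean_naive = toy["Glucose"].sum() / len(toy)

print(n_observed, mean_skipna, mean_manual, mean_naive)
```

Here the mean is computed over the 3 observed values only; the 2 rows with missing values are left out of this one calculation but stay in the DataFrame.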
diabetes.dropna(subset=['Glucose'],
                how='any',
                inplace=True)
msno.matrix(diabetes)  # msno is the missingno package (import missingno as msno)
diabetes['Glucose'].isnull().sum()
5
diabetes.dropna(subset=["Glucose"], how='any', inplace=True)
msno.matrix(diabetes)
diabetes['BMI'].isnull().sum()
11
diabetes.dropna(subset=["BMI"], how='any', inplace=True)
msno.matrix(diabetes)
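The row-dropping steps above can be sketched on a small example. This is only an illustration under assumptions: `toy` and its values are hypothetical, and the sketch returns new DataFrames instead of using `inplace=True`, so the original data stays available for comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with missing values in two columns
toy = pd.DataFrame({
    "Glucose": [100.0, np.nan, 140.0, 120.0],
    "BMI": [25.0, 30.0, np.nan, 28.0],
    "Age": [31, 45, 52, 23],
})

# Count missing values per column before deleting anything
missing_per_column = toy.isnull().sum()

# Drop rows where Glucose is missing; other columns are not checked
glucose_complete = toy.dropna(subset=["Glucose"], how="any")

# Drop rows where either Glucose or BMI is missing
both_complete = toy.dropna(subset=["Glucose", "BMI"], how="any")

print(missing_per_column)
print(len(toy), len(glucose_complete), len(both_complete))
```

Without `inplace=True`, `dropna` leaves `toy` unchanged, which makes it easy to see how many rows each deletion costs.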