Dealing with Missing Data in Python
Suraj Donthi
Deep Learning & Computer Vision Consultant
Note: "variable" here means a data field or column in a DataFrame.
Definition (Missing Completely at Random, MCAR):
"Missingness has no relationship between any values, observed or missing"
import missingno as msno
msno.matrix(diabetes)
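To make the definition concrete, here is a minimal sketch of MCAR on synthetic data. The DataFrame, the column names `Age` and `Glucose`, and the 20% drop rate are all assumptions made for illustration; they are not taken from the diabetes dataset. Because every value has the same chance of being dropped, the missing rate should be roughly equal in any subgroup.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000

# Synthetic data; column names and distributions are illustrative only
df = pd.DataFrame({
    "Age": rng.integers(21, 80, size=n),
    "Glucose": rng.normal(120, 30, size=n),
})

# MCAR: every Glucose value has the same 20% chance of being dropped,
# independent of Age and of the Glucose value itself
mcar_mask = rng.random(n) < 0.2
df.loc[mcar_mask, "Glucose"] = np.nan

# Compare the missing rate in two Age groups; under MCAR they should be close
older = df["Age"] >= 50
rate_young = df.loc[~older, "Glucose"].isna().mean()
rate_old = df.loc[older, "Glucose"].isna().mean()
print(f"missing rate, Age < 50: {rate_young:.3f}")
print(f"missing rate, Age >= 50: {rate_old:.3f}")
```

Passing a frame like this to `msno.matrix` would be expected to show gaps scattered evenly, with no visible pattern.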
Definition (Missing at Random, MAR):
"There is a systematic relationship between missingness and other observed data, but not the missing data"
msno.matrix(diabetes)
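A comparable sketch for MAR, again on synthetic data with hypothetical columns `Age` and `Glucose`. The assumption here is that the chance of a missing `Glucose` value depends only on the observed `Age` (40% for Age 50 and over, 5% otherwise), not on the `Glucose` value itself.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 2000

# Synthetic data; column names and distributions are illustrative only
df = pd.DataFrame({
    "Age": rng.integers(21, 80, size=n),
    "Glucose": rng.normal(120, 30, size=n),
})

# MAR: the probability that Glucose is missing depends on the observed Age,
# not on the Glucose value being hidden
older = df["Age"] >= 50
p_missing = np.where(older, 0.4, 0.05)
df.loc[rng.random(n) < p_missing, "Glucose"] = np.nan

# The missing rate now differs clearly between the two Age groups
rate_young = df.loc[~older, "Glucose"].isna().mean()
rate_old = df.loc[older, "Glucose"].isna().mean()
print(f"missing rate, Age < 50: {rate_young:.3f}")
print(f"missing rate, Age >= 50: {rate_old:.3f}")
```

Sorting such a frame by `Age` before plotting it with `msno.matrix` would be expected to concentrate the gaps at one end of the `Glucose` column.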
Definition (Missing Not at Random, MNAR):
"There is a relationship between missingness and its values, missing or non-missing"
Nullity matrix of diabetes, sorted by Serum_Insulin:
# Avoid shadowing the built-in sorted(); NaN values are placed last by default
sorted_diabetes = diabetes.sort_values('Serum_Insulin')
msno.matrix(sorted_diabetes)
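MNAR is hard to confirm from real data because the values that drive the missingness are the ones that are hidden. A minimal sketch on synthetic data, where the true values are known, shows the effect. The distribution, the threshold of 180, and the drop rates are assumptions for illustration only and say nothing about the actual diabetes dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 2000

# Synthetic "true" values; in real data these would be unknown where missing
true_insulin = rng.normal(150, 50, size=n)

# MNAR: high values are far more likely to go missing than low ones,
# so missingness depends on the value that is being hidden
p_missing = np.where(true_insulin > 180, 0.6, 0.05)
observed = pd.Series(true_insulin, name="Serum_Insulin").mask(rng.random(n) < p_missing)

# The mean of what was observed underestimates the true mean
true_mean = true_insulin.mean()
observed_mean = observed.mean()  # NaN values are skipped
print(f"true mean: {true_mean:.1f}")
print(f"observed mean: {observed_mean:.1f}")
```

The gap between the two means is why MNAR data calls for care: summary statistics computed on the observed values alone are biased.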