Dealing With Missing Data in R
Nicholas Tierney
Statistician
MCAR: Missing Completely At Random
MAR: Missing At Random
MNAR: Missing Not At Random
MCAR: Missingness has no association with any data, whether observed or unobserved.
test | vacation |
---|---|
NA | TRUE |
11.533340 | FALSE |
10.126115 | TRUE |
NA | FALSE |
NA | TRUE |
8.551881 | FALSE |
NA | FALSE |
NA | TRUE |
10.608264 | TRUE |
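As a rough illustration of this mechanism, the base R sketch below simulates a `test` score and deletes values purely at random. The sample size, the 30% missingness rate, and the distributions are assumptions chosen for illustration, not values from the course data.

```r
# Sketch: simulate MCAR missingness (all numbers here are illustrative assumptions).
set.seed(2024)
n <- 5000
dat <- data.frame(
  test     = rnorm(n, mean = 10, sd = 1),
  vacation = sample(c(TRUE, FALSE), n, replace = TRUE)
)

# Every row has the same 30% chance of losing its test score,
# regardless of the values of `test` or `vacation`.
is_dropped <- runif(n) < 0.3
dat$test[is_dropped] <- NA

# Under MCAR, the proportion missing should be roughly equal across groups.
miss_by_vacation <- tapply(is.na(dat$test), dat$vacation, mean)
print(miss_by_vacation)
```

If the two proportions differed sharply, that would be a hint that missingness is related to `vacation` and the MCAR assumption may not hold.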
Implications:
MAR: Missingness depends on data observed, but not on data unobserved.
Implications:
test | vacation | depression |
---|---|---|
NA | TRUE | 87.93109 |
11.533340 | FALSE | 40.02708 |
10.126115 | TRUE | 48.62883 |
NA | FALSE | 88.21743 |
NA | TRUE | 90.29282 |
8.551881 | FALSE | 44.77343 |
NA | FALSE | 89.48865 |
NA | TRUE | 89.99209 |
10.608264 | TRUE | 45.56832 |
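The pattern in the table, where `test` tends to be missing when `depression` is high, can be mimicked with a small base R simulation. This is only a sketch: the logistic link, its centre (65) and scale (5), and the simulated ranges are assumptions made for illustration.

```r
# Sketch: simulate MAR missingness (parameters are illustrative assumptions).
set.seed(2024)
n <- 5000
dat <- data.frame(
  test       = rnorm(n, mean = 10, sd = 1),
  depression = runif(n, min = 40, max = 90)
)

# The chance that `test` is missing rises with the fully observed
# `depression` score; it does not depend on the `test` value itself.
p_miss <- plogis((dat$depression - 65) / 5)
dat$test[runif(n) < p_miss] <- NA

# Because `depression` is always observed, it can be used to explain
# which rows are missing `test`.
mean_dep_missing  <- mean(dat$depression[is.na(dat$test)])
mean_dep_observed <- mean(dat$depression[!is.na(dat$test)])
print(c(missing = mean_dep_missing, observed = mean_dep_observed))
```

Rows with a missing `test` should show a noticeably higher average `depression`, which is the kind of relationship an analysis can condition on under MAR.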
MNAR: Missingness of the response is related to an unobserved value relevant to the assessment of interest.
Implications:
test | vacation | depression |
---|---|---|
NA | TRUE | NA |
11.533340 | FALSE | 11.533340 |
10.126115 | TRUE | 10.126115 |
NA | FALSE | NA |
NA | TRUE | NA |
8.551881 | FALSE | 8.551881 |
NA | FALSE | NA |
NA | TRUE | NA |
10.608264 | TRUE | 10.608264 |
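One way to see why this mechanism is troublesome is to simulate it. In the sketch below, the chance that a score is missing depends on the score itself, which is one possible form of MNAR. The names `true_test` and `observed_test`, the logistic link, and its parameters are all assumptions for illustration.

```r
# Sketch: simulate MNAR missingness (names and parameters are illustrative assumptions).
set.seed(2024)
n <- 5000
true_test <- rnorm(n, mean = 10, sd = 1)

# Low scores are more likely to go unrecorded, so missingness depends
# on the very value that ends up unobserved.
p_miss <- plogis((10 - true_test) * 2)
observed_test <- ifelse(runif(n) < p_miss, NA, true_test)

# The mean of the observed scores overstates the true mean, and nothing
# in the observed data alone reveals by how much.
print(c(true = mean(true_test), observed = mean(observed_test, na.rm = TRUE)))
```

In real data `true_test` would not be available, so this bias could not be checked directly.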
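library(naniar)  # provides vis_miss() and the oceanbuoys data
library(dplyr)   # provides arrange() and %>%
vis_miss(mt_cars, cluster = TRUE)
oceanbuoys %>% arrange(year) %>% vis_miss()
vis_miss(ocean, cluster = TRUE)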