Machine Learning with caret in R
Zach Mayer
Data Scientist at DataRobot and co-author of caret
library(caret)

# Generate data with missing values
mtcars[mtcars$disp < 140, "hp"] <- NA
Y <- mtcars$mpg
X <- mtcars[, 2:4]
# Use median imputation
model <- train(X, Y, method = "glm", preProcess = "medianImpute")
print(min(model$results$RMSE))
3.612713
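What median imputation does can be sketched by hand in base R. This is a minimal illustration, not caret's implementation: `impute_median` and `X_demo` are hypothetical names introduced here, and the sketch uses one median per column over the whole data set, whereas `train()` applies its preprocessing inside each resample.

```r
# Hypothetical helper (not part of caret): fill NAs with the column median
impute_median <- function(df) {
  for (col in names(df)) {
    missing <- is.na(df[[col]])
    df[[col]][missing] <- median(df[[col]], na.rm = TRUE)
  }
  df
}

# Rebuild the slide's setup on a fresh copy of mtcars
cars_demo <- datasets::mtcars
cars_demo[cars_demo$disp < 140, "hp"] <- NA
X_demo <- cars_demo[, 2:4]

X_median <- impute_median(X_demo)

sum(is.na(X_demo$hp))    # missing values before imputation
sum(is.na(X_median$hp))  # missing values after imputation
```

Every missing `hp` value gets the same number, regardless of the car's `cyl` or `disp`.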
# Use KNN imputation
set.seed(42)
model <- train(
  X, Y, method = "glm", preProcess = "knnImpute"
)
print(min(model$results$RMSE))
3.558881
Compare to 3.61 for median imputation: KNN imputation gives a slightly lower RMSE here
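The idea behind KNN imputation can also be sketched in base R. This is a rough illustration under stated assumptions, not caret's code: `impute_knn` and `X_demo` are hypothetical names, the sketch uses k = 5 neighbors and Euclidean distance on centered and scaled predictors, and it returns values in the original units, whereas caret's `"knnImpute"` leaves the predictors centered and scaled.

```r
# Hypothetical helper (not part of caret): fill NAs in `target` with the
# mean of `target` over the k nearest rows that have it observed
impute_knn <- function(df, target, k = 5) {
  predictors <- setdiff(names(df), target)
  # Center and scale so that no predictor dominates the distance
  scaled <- scale(df[, predictors, drop = FALSE])
  missing <- which(is.na(df[[target]]))
  complete <- which(!is.na(df[[target]]))
  for (i in missing) {
    # Euclidean distance from row i to every row with an observed target
    diffs <- sweep(scaled[complete, , drop = FALSE], 2, scaled[i, ])
    dists <- sqrt(rowSums(diffs^2))
    nearest <- complete[order(dists)[seq_len(k)]]
    df[[target]][i] <- mean(df[[target]][nearest])
  }
  df
}

# Rebuild the slide's setup on a fresh copy of mtcars
cars_demo <- datasets::mtcars
cars_demo[cars_demo$disp < 140, "hp"] <- NA
X_demo <- cars_demo[, 2:4]

X_knn <- impute_knn(X_demo, target = "hp")
head(X_knn[is.na(X_demo$hp), ])
```

Unlike median imputation, each missing `hp` is filled from cars with similar `cyl` and `disp`, so the imputed values can differ from row to row.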