Welcome to the course

Machine Learning with caret in R

Max Kuhn

Software Engineer at RStudio and creator of caret

Supervised Learning

  • R caret package
  • Automates supervised learning (a.k.a. predictive modeling)
  • Target variable

A picture of a flower and an invoice marked "past due".

Supervised Learning

  • Two types of predictive models
    • Classification ⇒ Qualitative
    • Regression ⇒ Quantitative
  • Use metrics to evaluate models
    • Quantifiable
    • Objective
  • Root Mean Squared Error (RMSE) for regression
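For reference, RMSE is conventionally defined as below. The symbols are not from the slides: n is assumed to be the number of observations, y_i the actual value of the target, and \hat{y}_i the model's prediction for observation i.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}
```

RMSE is expressed in the same units as the target variable, so a lower value means predictions sit closer to the actual values on average.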

Evaluating Model Performance

  • Common to calculate in-sample RMSE
    • Too optimistic
    • Leads to overfitting
  • Better to calculate out-of-sample error (a la caret)
    • Simulates real-world usage
    • Helps avoid overfitting

In-sample error

# Fit a model to the mtcars data
data(mtcars)
model <- lm(mpg ~ hp, mtcars[1:20, ])
# Predict in-sample
predicted <- predict(
  model, mtcars[1:20, ], type = "response"
)
# Calculate RMSE
actual <- mtcars[1:20, "mpg"]
sqrt(mean((predicted - actual) ^ 2))
3.172132
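
For contrast with the in-sample figure, here is a minimal sketch of an out-of-sample RMSE using the same model formula. It assumes the remaining rows of mtcars (21 to 32) can serve as a hold-out set; that split and the names train_rows, test_rows, predicted_out, actual_out, and rmse_out are illustrative and not from the slides. mtcars is not randomly ordered, so a random split would normally be preferable.

```r
# Fit on the first 20 rows of mtcars (same training rows as the in-sample example)
data(mtcars)
train_rows <- 1:20
test_rows <- 21:32  # assumption: the remaining 12 rows act as a hold-out set

model <- lm(mpg ~ hp, data = mtcars[train_rows, ])

# Predict on rows the model has not seen during fitting
predicted_out <- predict(model, newdata = mtcars[test_rows, ])

# Calculate out-of-sample RMSE
actual_out <- mtcars[test_rows, "mpg"]
rmse_out <- sqrt(mean((predicted_out - actual_out) ^ 2))
rmse_out
```

Because the hold-out rows played no part in fitting, this number is usually a more honest estimate of how the model would perform on new data than the in-sample RMSE.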

Let's practice!
