Supervised Learning in R: Regression
Nina Zumel and John Mount
Win-Vector LLC


Which is best?
anx ~ I(hassles^2)
anx ~ I(hassles^3)
anx ~ I(hassles^2) + I(hassles^3)
anx ~ exp(hassles)
I(): treat an expression literally (not as an interaction)
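A small sketch of what I() changes in practice. Inside a formula, ^ is a formula operator rather than arithmetic, so a bare x^2 reduces to x. The toy data frame d and its columns x and y are made up for illustration and are not part of the course data.

```r
# Hypothetical toy data; the names d, x, and y are illustrative only.
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(1.2, 3.9, 9.1, 15.8, 25.3))

# Without I(), ^ is treated as a formula operator (crossing),
# so x^2 collapses to the plain term x.
cols_plain <- colnames(model.matrix(y ~ x^2, data = d))

# With I(), x^2 is evaluated arithmetically and gets its own column.
cols_literal <- colnames(model.matrix(y ~ I(x^2), data = d))

print(cols_plain)    # "(Intercept)" "x"
print(cols_literal)  # "(Intercept)" "I(x^2)"
```

Comparing the design-matrix column names is a quick way to check that a transformed term actually entered the model as intended.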
Linear, Quadratic, and Cubic models
mod_lin <- lm(anx ~ hassles, hassleframe)
summary(mod_lin)$r.squared
0.5334847
mod_quad <- lm(anx ~ I(hassles^2), hassleframe)
summary(mod_quad)$r.squared
0.6241029
mod_cubic <- lm(anx ~ I(hassles^3), hassleframe)
summary(mod_cubic)$r.squared
0.6474421
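The r.squared values above are computed on the same data the models were fit to. As a minimal sketch of what that number measures, the block below rebuilds it by hand as 1 - RSS/TSS for a model with an intercept. The data frame d is a made-up example, not hassleframe.

```r
# Hypothetical toy data; d, x, and y are illustrative only.
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2))

model <- lm(y ~ x, data = d)

# Residual sum of squares: squared error of the fitted model.
rss <- sum(residuals(model)^2)

# Total sum of squares: squared error of always predicting the mean.
tss <- sum((d$y - mean(d$y))^2)

r2_manual  <- 1 - rss / tss
r2_summary <- summary(model)$r.squared

# The two values should agree up to floating-point tolerance.
print(all.equal(r2_manual, r2_summary))
```

Because both sums are taken over the training rows, this figure says how well the model fits the data it has already seen, not how well it predicts new data.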
Use cross-validation to evaluate the models
| Model | RMSE | 
|---|---|
| Linear ($hassles$) | 7.69 | 
| Quadratic ($hassles^2$) | 6.89 | 
| Cubic ($hassles^3$) | 6.70 | 
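One way such a table could be produced is k-fold cross-validation: each row is predicted by a model that never saw it, and RMSE is computed over those out-of-sample predictions. The sketch below uses base R only and synthetic data, so it will not reproduce the numbers above. The helper cv_rmse, the data frame toy, and the choice of 5 folds are assumptions made for illustration, not necessarily the authors' procedure.

```r
set.seed(1)  # assumption: fixed seed so the fold assignment is reproducible

# Synthetic stand-in for the real data; NOT the hassleframe values.
n <- 40
toy <- data.frame(hassles = runif(n, 0, 100))
toy$anx <- 10 + 0.00005 * toy$hassles^3 + rnorm(n, sd = 5)

# Assign every row to one of k folds, shared by all models so that
# they are compared on identical splits.
k <- 5
folds <- sample(rep(seq_len(k), length.out = nrow(toy)))

# Hypothetical helper: cross-validated RMSE for one formula.
cv_rmse <- function(fmla, data, folds) {
  outcome <- all.vars(fmla)[1]          # name of the response column
  pred <- numeric(nrow(data))
  for (i in sort(unique(folds))) {
    held_out <- folds == i
    # Fit on all rows outside fold i, predict the rows inside it.
    model <- lm(fmla, data = data[!held_out, , drop = FALSE])
    pred[held_out] <- predict(model, newdata = data[held_out, , drop = FALSE])
  }
  sqrt(mean((data[[outcome]] - pred)^2))
}

rmse <- c(
  linear    = cv_rmse(anx ~ hassles, toy, folds),
  quadratic = cv_rmse(anx ~ I(hassles^2), toy, folds),
  cubic     = cv_rmse(anx ~ I(hassles^3), toy, folds)
)
print(rmse)
```

Sharing one fold assignment across the three formulas keeps the comparison fair, since differences in RMSE then come from the model form rather than from different splits.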