Supervised Learning in R: Regression
Nina Zumel and John Mount
Win-Vector, LLC
Multiple diverse decision trees averaged together
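To make the averaging concrete, the sketch below fits a small forest and checks that the forest's prediction is the mean of the individual trees' predictions. It assumes the ranger package is installed. The built-in mtcars data is only a stand-in for the bike rental data, and the variable names, tree count, and seed are arbitrary choices for illustration.

```r
library(ranger)

# Stand-in data: built-in mtcars, not the bike rental data from the slides
forest <- ranger(mpg ~ wt + hp + disp, data = mtcars,
                 num.trees = 200, seed = 1)

# Aggregated forest prediction: one value per row
agg <- predict(forest, mtcars)$predictions

# Per-tree predictions: a rows x trees matrix
per_tree <- predict(forest, mtcars, predict.all = TRUE)$predictions

# For regression, the forest prediction is the average over the trees
same <- isTRUE(all.equal(agg, rowMeans(per_tree)))
same
```

Each tree sees a different bootstrap sample and a different random subset of variables at each split, which is what makes the trees diverse; averaging them reduces the variance of any single tree.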
fmla <- cnt ~ hr + holiday + workingday + 
  weathersit + temp + atemp + hum + windspeed

library(ranger)

# Fit a random forest to the January data
model <- ranger(fmla, bikesJan, 
                num.trees = 500, 
                respect.unordered.factors = "order")
- formula, data
- num.trees (default 500): use at least 200
- mtry: number of variables to try at each node
- respect.unordered.factors: recommended setting is "order"

model
Ranger result
...
OOB prediction error (MSE):       3103.623 
R squared (OOB):                  0.7837386
The random forest algorithm returns out-of-bag (OOB) estimates of out-of-sample performance.
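The OOB figures in the printout can also be read off the fitted object. The sketch below is a minimal illustration, assuming the ranger package is installed and again using mtcars as a stand-in dataset; the object name fit is hypothetical.

```r
library(ranger)

# Stand-in model on built-in data (not the bike rental model)
fit <- ranger(mpg ~ ., data = mtcars, num.trees = 500, seed = 1)

oob_mse  <- fit$prediction.error  # OOB mean squared error for regression
oob_rmse <- sqrt(oob_mse)         # back on the scale of the outcome
oob_r2   <- fit$r.squared         # OOB R squared

c(mse = oob_mse, rmse = oob_rmse, r2 = oob_r2)
```

Taking the square root puts the OOB error in the same units as the outcome. For the bike model above, sqrt(3103.623) is about 55.7, which can be compared with the RMSE measured on held-out data.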
bikesFeb$pred <- predict(model, bikesFeb)$predictions
predict() inputs: the fitted model and the data to predict on (here, bikesFeb).
Predictions are stored in the element predictions of the returned object.
Calculate RMSE:
library(dplyr)

bikesFeb %>% 
  mutate(residual = cnt - pred) %>%          # prediction error per row
  summarize(rmse = sqrt(mean(residual^2)))   # root mean squared error
      rmse
1 67.15169
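The same calculation can be written without dplyr. The helper below is a hypothetical convenience function, not part of the original slides, shown on a tiny hand-checkable example rather than the bike data.

```r
# Hypothetical helper: root mean squared error of predictions
rmse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

# Toy values: residuals are 1, 0, -2, so the mean squared error is 5/3
actual    <- c(3, 5, 7)
predicted <- c(2, 5, 9)
result <- rmse(actual, predicted)
result  # equals sqrt(5/3)
```

With the slide's data, rmse(bikesFeb$cnt, bikesFeb$pred) would be the equivalent call.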
| Model | RMSE | 
|---|---|
| Quasipoisson | 69.3 | 
| Random forests | 67.15 | 

