Machine Learning with Tree-Based Models in R
Sandro Raabe
Data Scientist
$\Rightarrow$ Measure how far predictions are from the true values
MAE = average length of the red bars (the absolute differences between actual and predicted values)
$$MAE = \frac{1}{n} \sum_{i=1}^n\left| actual_i - predicted_i \right|$$
$$MSE = \frac{1}{n} \sum_{i=1}^n\left( actual_i - predicted_i \right)^2$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^n\left( actual_i - predicted_i \right)^2}$$
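As a rough base-R sketch of the formulas above, the snippet below computes MAE, MSE, and RMSE by hand. The two vectors reuse the first four rows of the prediction output shown later in this section; the variable names (`actual`, `predicted`, `errors`, and the `*_value` results) are illustrative assumptions, not part of the course code.

```r
# First four rows of the chocolate predictions (illustrative subset only)
actual    <- c(2.75, 3.25, 3.5, 3.5)   # final_grade column
predicted <- c(2.5, 3.64, 3.3, 3.25)   # .pred column

# Signed prediction errors, one per observation
errors <- actual - predicted

mae_value  <- mean(abs(errors))  # mean absolute error
mse_value  <- mean(errors^2)     # mean squared error
rmse_value <- sqrt(mse_value)    # root mean squared error

print(c(MAE = mae_value, MSE = mse_value, RMSE = rmse_value))
```

Because squaring weights large errors more heavily, RMSE is never smaller than MAE on the same data.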
# parsnip and yardstick are included in tidymodels
library(tidymodels)
# Make predictions and add to test data
predictions <- predict(model, new_data = chocolate_test) %>%
  bind_cols(chocolate_test)
# A tibble: 358 x 7
  .pred final_grade review_date cocoa_percent company_location
  <dbl>       <dbl>       <int>         <dbl> <fct>
1  2.5         2.75        2013          0.7  France
2  3.64        3.25        2014          0.8  France
3  3.3         3.5         2012          0.7  France
4  3.25        3.5         2011          0.72 Fiji
# ... with 354 more rows, and 2 more variables: bean_type <fct>, broad_bean_origin <fct>
# Evaluate using mae()
mae(predictions,
    estimate = .pred,
    truth = final_grade)
# A tibble: 1 x 2
  .metric .estimate
  <chr>       <dbl>
1 mae         0.363
# Evaluate using rmse()
rmse(predictions,
     estimate = .pred,
     truth = final_grade)
# A tibble: 1 x 2
  .metric .estimate
  <chr>       <dbl>
1 rmse        0.457
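Both metrics can also be requested in a single call. The sketch below assumes the tibble and yardstick packages are installed and uses yardstick's `metric_set()`; `sample_predictions` and `regression_metrics` are hypothetical names. The four rows are copied from the prediction output above rather than the full test set, so the resulting values differ from 0.363 and 0.457.

```r
library(tibble)
library(yardstick)

# Hypothetical four-row subset of the predictions tibble shown above
sample_predictions <- tibble(
  .pred       = c(2.5, 3.64, 3.3, 3.25),
  final_grade = c(2.75, 3.25, 3.5, 3.5)
)

# Bundle MAE and RMSE into one metric function
regression_metrics <- metric_set(mae, rmse)

# Returns one row per metric
metric_results <- regression_metrics(sample_predictions,
                                     truth = final_grade,
                                     estimate = .pred)
print(metric_results)
```

Bundling the metrics keeps the `truth` and `estimate` arguments in one place, which helps when the same evaluation is repeated for several models.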