Evaluating model performance

Modeling with tidymodels in R

David Svancer

Data Scientist

Input to yardstick functions

All yardstick functions require a tibble with model results

Column with the true outcome variable values
- hwy for mpg data
Column with model predictions
- .pred

mpg_test_results

# A tibble: 57 x 3
     hwy   cty .pred
   <int> <int> <dbl>
 1    29    18  25.0
 2    31    20  27.7
 3    27    18  25.0
 4    26    18  25.0
 5    25    16  22.3
# ... with 52 more rows

Root mean squared error (RMSE)

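RMSE estimates the typical size of the prediction error, in the units of the outcome variable
- Square root of the mean squared difference between true and predicted values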

Calculated with the rmse() function from yardstick
- Takes a tibble with model results
- truth is the column with true outcome values
- estimate is the column with predicted outcome values

mpg_test_results %>% 
  rmse(truth = hwy, estimate = .pred)

# A tibble: 1 x 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 rmse    standard        1.93
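
Checking RMSE by hand

A minimal base R sketch of the RMSE calculation, using only the five rows printed on the earlier slide. The vector names truth and estimate are chosen here for illustration. The result differs from 1.93 because the full test set has 57 rows.

```r
# First five rows of mpg_test_results, copied from the printed tibble
truth    <- c(29, 31, 27, 26, 25)          # hwy
estimate <- c(25.0, 27.7, 25.0, 25.0, 22.3) # .pred

# RMSE: square root of the mean squared prediction error
errors <- truth - estimate
rmse_by_hand <- sqrt(mean(errors^2))
rmse_by_hand  # about 2.80 for these five rows
```

Large errors are squared before averaging, so a few poor predictions raise RMSE more than many small ones.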

R squared metric

Measures the squared correlation between actual and predicted values

Also called the coefficient of determination
Ranges from 0 to 1
- When all predictions equal the true outcome values, R squared is 1
Calculated with the rsq() function from yardstick

mpg_test_results %>% 
  rsq(truth = hwy, estimate = .pred)

# A tibble: 1 x 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 rsq     standard       0.904
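
Checking R squared by hand

A small base R sketch of the squared correlation, again using only the five printed rows (the vector names are illustrative). It assumes the rsq() value is the squared Pearson correlation between the two columns, as described above.

```r
# First five rows of mpg_test_results, copied from the printed tibble
truth    <- c(29, 31, 27, 26, 25)          # hwy
estimate <- c(25.0, 27.7, 25.0, 25.0, 22.3) # .pred

# R squared as the squared correlation between actual and predicted values
rsq_by_hand <- cor(truth, estimate)^2
rsq_by_hand  # about 0.776 for these five rows

# Predictions that equal the true values give an R squared of 1
rsq_perfect <- cor(truth, truth)^2
```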

R squared plots

Visualization of the R squared metric

Model predictions versus the true outcome
The line y = x
- Points on this line are predicted exactly
- If every point falls on the line, R squared is 1
Used to find potential problems with model performance
- Non-linear patterns
- Regions where model is predicting poorly

[Figure: R squared plot for the mpg model, predicted versus actual highway MPG]

Plotting R squared plots

Making R squared plots with ggplot2

Tibble of model results
geom_point()
geom_abline()
coord_obs_pred()

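# Assumes library(tidymodels) is loaded: ggplot2 supplies the layers, tune supplies coord_obs_pred()
ggplot(mpg_test_results, aes(x = hwy, y = .pred)) +
  geom_point() +                               # one point per test observation
  geom_abline(color = 'blue', linetype = 2) +  # dashed y = x reference line
  coord_obs_pred() +                           # same scale and range on both axes
  labs(title = 'R-Squared Plot',
       y = 'Predicted Highway MPG',
       x = 'Actual Highway MPG')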

[Figure: the resulting R squared plot for the mpg model]

Streamlining model fitting

The last_fit() function

Takes a model specification, model formula, and data split object
Performs the following:
1. Creates training and test datasets
2. Fits the model to the training data
3. Calculates metrics and predictions on the test data
4. Returns an object with all results

lm_last_fit <- lm_model %>% 
  last_fit(hwy ~ cty, 
           split = mpg_split)
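
Unpacking the steps by hand

A rough sketch of what the listed steps look like when written out manually. The slides use lm_model and mpg_split without defining them, so the seed, the 75/25 proportion, the stratification, and the "lm" engine below are assumptions for illustration, not the course's exact setup.

```r
library(tidymodels)  # attaches rsample, parsnip, yardstick, dplyr, ggplot2 (mpg data)

# Assumed setup: seed, proportion, and strata are illustrative choices
set.seed(123)
mpg_split <- initial_split(mpg, prop = 0.75, strata = hwy)
lm_model <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

# Step 1: create training and test datasets
mpg_training <- training(mpg_split)
mpg_test <- testing(mpg_split)

# Step 2: fit the model to the training data
lm_fit <- lm_model %>%
  fit(hwy ~ cty, data = mpg_training)

# Step 3: predictions and metrics on the test data
mpg_test_results <- mpg_test %>%
  select(hwy, cty) %>%
  bind_cols(predict(lm_fit, new_data = mpg_test))

mpg_test_results %>% rmse(truth = hwy, estimate = .pred)
mpg_test_results %>% rsq(truth = hwy, estimate = .pred)
```

last_fit() bundles these steps into one call and stores the results in a single object.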

Collecting metrics

The collect_metrics() function

Takes the results of last_fit()
- Returns a tibble with performance metrics obtained on the test dataset
Default regression model metrics
- RMSE
- R squared

lm_last_fit %>% 
  collect_metrics()

# A tibble: 2 x 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 rmse    standard       1.93 
2 rsq     standard       0.904
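
Requesting other metrics

If the defaults are not enough, last_fit() accepts a metrics argument built with yardstick's metric_set(). The sketch below adds mean absolute error (mae). The split and model definitions are assumed for illustration, since the slides do not show them.

```r
library(tidymodels)

# Assumed setup: seed, proportion, and strata are illustrative choices
set.seed(123)
mpg_split <- initial_split(mpg, prop = 0.75, strata = hwy)
lm_model <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

# Ask for RMSE, R squared, and MAE instead of the default metrics
custom_metrics <- metric_set(rmse, rsq, mae)

lm_last_fit_custom <- lm_model %>%
  last_fit(hwy ~ cty,
           split = mpg_split,
           metrics = custom_metrics)

custom_results <- lm_last_fit_custom %>%
  collect_metrics()
custom_results
```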

Collecting predictions

The collect_predictions() function

Takes the results of last_fit()
- Returns a tibble with test dataset predictions
- Predictions column is named .pred
- Also includes the outcome variable (hwy) and row identifier columns (id, .row)

lm_last_fit %>% 
  collect_predictions()

# A tibble: 57 x 4
   id               .pred  .row   hwy
   <chr>            <dbl> <int> <int>
 1 train/test split  25.0     1    29
 2 train/test split  27.7     3    31
 3 train/test split  25.0     7    27
 4 train/test split  25.0     8    26
 5 train/test split  22.3     9    25
# ... with 52 more rows
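
Reusing collected predictions

Because the collected predictions contain both hwy and .pred, they can be passed straight to yardstick functions. The sketch below recomputes RMSE from the predictions and compares it with the value from collect_metrics(). The split and model definitions are assumed for illustration.

```r
library(tidymodels)

# Assumed setup: seed, proportion, and strata are illustrative choices
set.seed(123)
mpg_split <- initial_split(mpg, prop = 0.75, strata = hwy)
lm_model <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

lm_last_fit <- lm_model %>%
  last_fit(hwy ~ cty, split = mpg_split)

# RMSE recomputed from the collected predictions
rmse_from_preds <- lm_last_fit %>%
  collect_predictions() %>%
  rmse(truth = hwy, estimate = .pred)

# RMSE as reported by collect_metrics()
rmse_from_metrics <- lm_last_fit %>%
  collect_metrics() %>%
  filter(.metric == "rmse")

rmse_from_preds$.estimate
rmse_from_metrics$.estimate
```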

Let's evaluate some models!
