Modeling with tidymodels in R
David Svancer
Data Scientist
The last_fit() function
Similar to using fit(), the first steps include:
Creating a data split object with rsample:
leads_split <- initial_split(leads_df, strata = purchased)
Specifying a model with parsnip:
logistic_model <- logistic_reg() %>%
  set_engine('glm') %>%
  set_mode('classification')
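The last_fit() function takes a parsnip model object, a model formula, and a data split object. It trains the model on the training data and generates predictions and metrics on the test data.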
The collect_metrics() function calculates metrics using the test dataset
logistic_last_fit <- logistic_model %>%
  last_fit(purchased ~ total_visits + total_time,
           split = leads_split)
logistic_last_fit %>%
  collect_metrics()
# A tibble: 2 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy binary 0.759
2 roc_auc binary 0.763
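To make concrete what last_fit() automates, the sketch below performs the same steps by hand and compares the resulting accuracy. The leads_df data is not included here, so the example uses a simulated stand-in called sim_leads (a hypothetical name, with arbitrary coefficients); only the column names follow the slides.

```r
library(tidymodels)

# ASSUMPTION: simulated stand-in for leads_df, which is not available here
set.seed(123)
n <- 500
sim_leads <- tibble(
  total_visits = rpois(n, lambda = 5),
  total_time   = round(runif(n, min = 0, max = 1500))
) %>%
  mutate(
    # Purchase probability rises with time on site (arbitrary coefficients)
    prob_yes  = plogis(-2 + 0.003 * total_time),
    purchased = factor(if_else(runif(n) < prob_yes, "yes", "no"),
                       levels = c("yes", "no"))
  ) %>%
  select(-prob_yes)

leads_split <- initial_split(sim_leads, strata = purchased)

logistic_model <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")

# Manual route: fit on the training data only
manual_fit <- logistic_model %>%
  fit(purchased ~ total_visits + total_time,
      data = training(leads_split))

# Predict classes on the test data and attach the true outcome
manual_preds <- predict(manual_fit,
                        new_data = testing(leads_split),
                        type = "class") %>%
  bind_cols(testing(leads_split) %>% select(purchased))

manual_accuracy <- accuracy(manual_preds,
                            truth = purchased,
                            estimate = .pred_class)

# Streamlined route: last_fit() does the fitting and test-set evaluation
auto_accuracy <- logistic_model %>%
  last_fit(purchased ~ total_visits + total_time,
           split = leads_split) %>%
  collect_metrics() %>%
  filter(.metric == "accuracy")

# Both routes should report the same test-set accuracy
all.equal(manual_accuracy$.estimate, auto_accuracy$.estimate)
```

The manual route needs separate calls to training(), testing(), fit(), predict(), and a yardstick function, while last_fit() only needs the model, the formula, and the split.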
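collect_predictions()
The collect_predictions() function returns the test dataset predictions as a tibble with the columns needed by yardstick functions
last_fit_results <- logistic_last_fit %>%
  collect_predictions()
last_fit_results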
# A tibble: 332 x 6
id .pred_yes .pred_no .row .pred_class purchased
<chr> <dbl> <dbl> <int> <fct> <fct>
1 train/test split 0.134 0.866 2 no no
2 train/test split 0.729 0.271 17 yes yes
3 train/test split 0.133 0.867 21 no no
4 train/test split 0.0916 0.908 22 no no
5 train/test split 0.598 0.402 24 yes yes
# ... with 327 more rows
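The columns in this tibble relate to each other in a simple way, which the sketch below checks: the two probability columns sum to one, and .pred_class is the class with the larger estimated probability. The data is a simulated stand-in named sim_leads (a hypothetical name, since leads_df is not available here), so the numbers will differ from the output above.

```r
library(tidymodels)

# ASSUMPTION: simulated stand-in for leads_df, which is not available here
set.seed(123)
n <- 500
sim_leads <- tibble(
  total_visits = rpois(n, lambda = 5),
  total_time   = round(runif(n, min = 0, max = 1500))
) %>%
  mutate(
    prob_yes  = plogis(-2 + 0.003 * total_time),
    purchased = factor(if_else(runif(n) < prob_yes, "yes", "no"),
                       levels = c("yes", "no"))
  ) %>%
  select(-prob_yes)

leads_split <- initial_split(sim_leads, strata = purchased)

last_fit_results <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification") %>%
  last_fit(purchased ~ total_visits + total_time,
           split = leads_split) %>%
  collect_predictions()

# One row per test-set observation
nrow(last_fit_results) == nrow(testing(leads_split))

# The two class probabilities sum to 1 in every row
prob_sums <- last_fit_results$.pred_yes + last_fit_results$.pred_no

# .pred_class is "yes" when .pred_yes exceeds 0.5, otherwise "no"
implied_class <- if_else(last_fit_results$.pred_yes > 0.5, "yes", "no")
agreement <- mean(implied_class == as.character(last_fit_results$.pred_class))
agreement
```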
The metric_set() function
accuracy(), sens(), and spec(): truth and estimate arguments
roc_auc(): truth and column of estimated probabilities
The custom_metrics() function created by metric_set() needs all three: truth, estimate, and the column of estimated probabilities, with .pred_yes passed as the last argument
custom_metrics <- metric_set(accuracy, sens,
                             spec, roc_auc)
custom_metrics(last_fit_results,
               truth = purchased,
               estimate = .pred_class,
               .pred_yes)
# A tibble: 4 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy binary 0.759
2 sens binary 0.617
3 spec binary 0.840
4 roc_auc binary 0.763
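As a rough check on how the arguments are routed, the sketch below calls a class metric and roc_auc() individually and compares them with the combined metric set. It assumes a simulated stand-in dataset named sim_leads (hypothetical, since leads_df is not available here) and that only tidymodels is attached, so spec refers to the yardstick function.

```r
library(tidymodels)

# ASSUMPTION: simulated stand-in for leads_df, which is not available here
set.seed(123)
n <- 500
sim_leads <- tibble(
  total_visits = rpois(n, lambda = 5),
  total_time   = round(runif(n, min = 0, max = 1500))
) %>%
  mutate(
    prob_yes  = plogis(-2 + 0.003 * total_time),
    purchased = factor(if_else(runif(n) < prob_yes, "yes", "no"),
                       levels = c("yes", "no"))
  ) %>%
  select(-prob_yes)

leads_split <- initial_split(sim_leads, strata = purchased)

last_fit_results <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification") %>%
  last_fit(purchased ~ total_visits + total_time,
           split = leads_split) %>%
  collect_predictions()

# Class metrics use truth and estimate
sens_alone <- sens(last_fit_results,
                   truth = purchased,
                   estimate = .pred_class)

# roc_auc() uses truth and the probability column of the positive class
auc_alone <- roc_auc(last_fit_results,
                     truth = purchased,
                     .pred_yes)

# The metric set takes all three and passes each metric what it needs
custom_metrics <- metric_set(accuracy, sens, spec, roc_auc)
combined <- custom_metrics(last_fit_results,
                           truth = purchased,
                           estimate = .pred_class,
                           .pred_yes)
combined
```

The sens and roc_auc rows of combined should match sens_alone and auc_alone, since the metric set only dispatches the same inputs to each function.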