Modeling with tidymodels in R
David Svancer
Data Scientist
The last_fit() function
As with fit(), the first steps are:
Split the data with rsample
Specify the model with parsnip
leads_split <- initial_split(leads_df, strata = purchased)
logistic_model <- logistic_reg() %>%
  set_engine('glm') %>%
  set_mode('classification')
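To make the splitting step concrete, here is a minimal sketch on synthetic data. The `toy_leads` tibble and the coefficients used to simulate it are assumptions made up for illustration; they are not the course's `leads_df`. The sketch shows that `initial_split()` returns one object from which `training()` and `testing()` extract the two partitions, and that `strata = purchased` keeps the class proportions similar in both.

```r
library(tidymodels)

set.seed(123)

# Hypothetical stand-in for leads_df: two numeric predictors and a yes/no outcome
n_rows <- 400
toy_leads <- tibble(
  total_visits = rpois(n_rows, 5),
  total_time   = runif(n_rows, 0, 1500)
) %>%
  mutate(
    purchased = factor(
      if_else(runif(n_rows) < plogis(-3 + 0.2 * total_visits + 0.002 * total_time),
              "yes", "no"),
      levels = c("yes", "no")
    )
  )

# Stratified split; the default proportion puts about 75% of rows in training
toy_split <- initial_split(toy_leads, strata = purchased)
toy_train <- training(toy_split)
toy_test  <- testing(toy_split)

# Class proportions should be close in the two partitions
toy_train %>% count(purchased) %>% mutate(prop = n / sum(n))
toy_test  %>% count(purchased) %>% mutate(prop = n / sum(n))
```

The split object itself stores only row indices, which is why it can be handed to last_fit() later without extracting the partitions first.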
The last_fit() function
Takes a parsnip model specification, a model formula, and a data split object
The collect_metrics() function returns the metrics calculated on the test set
logistic_last_fit <- logistic_model %>%
  last_fit(purchased ~ total_visits + total_time,
           split = leads_split)
logistic_last_fit %>%
  collect_metrics()
# A tibble: 2 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy binary 0.759
2 roc_auc binary 0.763
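One way to see what last_fit() does is to compare it with the manual route: fit on the training partition, predict on the test partition, then score the predictions. The sketch below does both on a synthetic `toy_leads` tibble (a made-up stand-in, not the course data) and stores the two accuracy values for comparison. It is an illustration of the idea under these assumptions, not a description of the function's internals.

```r
library(tidymodels)

set.seed(42)

# Hypothetical stand-in for leads_df
n_rows <- 400
toy_leads <- tibble(
  total_visits = rpois(n_rows, 5),
  total_time   = runif(n_rows, 0, 1500)
) %>%
  mutate(
    purchased = factor(
      if_else(runif(n_rows) < plogis(-3 + 0.2 * total_visits + 0.002 * total_time),
              "yes", "no"),
      levels = c("yes", "no")
    )
  )

toy_split <- initial_split(toy_leads, strata = purchased)

logistic_model <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")

# Manual route: fit on training data, predict on test data, compute accuracy
manual_fit <- logistic_model %>%
  fit(purchased ~ total_visits + total_time, data = training(toy_split))

manual_preds <- predict(manual_fit, new_data = testing(toy_split)) %>%
  bind_cols(testing(toy_split) %>% select(purchased))

manual_accuracy <- accuracy(manual_preds, truth = purchased, estimate = .pred_class)

# Streamlined route: last_fit() takes the split and handles the same steps
auto_accuracy <- logistic_model %>%
  last_fit(purchased ~ total_visits + total_time, split = toy_split) %>%
  collect_metrics() %>%
  filter(.metric == "accuracy")

manual_accuracy$.estimate
auto_accuracy$.estimate
```

Because both routes use the same split object and a deterministic glm fit, the two accuracy values are expected to agree.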
collect_predictions()
Returns the test set predictions as a tibble that can be passed to yardstick functions
last_fit_results <- logistic_last_fit %>%
  collect_predictions()
last_fit_results
# A tibble: 332 x 6
id .pred_yes .pred_no .row .pred_class purchased
<chr> <dbl> <dbl> <int> <fct> <fct>
1 train/test split 0.134 0.866 2 no no
2 train/test split 0.729 0.271 17 yes yes
3 train/test split 0.133 0.867 21 no no
4 train/test split 0.0916 0.908 22 no no
5 train/test split 0.598 0.402 24 yes yes
# ... with 327 more rows
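Because the predictions tibble already holds the true outcome and the predicted class side by side, it can be passed straight to yardstick. The sketch below builds a small hypothetical tibble with the same column names as the output above (the eight rows are invented for illustration and the 0.5 cutoff for `.pred_class` is an assumption of this example) and passes it to `conf_mat()`.

```r
library(tidymodels)

# Hypothetical predictions shaped like the output of collect_predictions()
toy_results <- tibble(
  .pred_yes = c(0.134, 0.729, 0.133, 0.0916, 0.598, 0.45, 0.81, 0.30),
  purchased = factor(c("no", "yes", "no", "no", "yes", "yes", "no", "no"),
                     levels = c("yes", "no"))
) %>%
  mutate(
    .pred_no    = 1 - .pred_yes,
    # Assumed rule for this sketch: predict "yes" when .pred_yes is at least 0.5
    .pred_class = factor(if_else(.pred_yes >= 0.5, "yes", "no"),
                         levels = c("yes", "no"))
  )

# Confusion matrix: rows are predicted classes, columns are true classes
toy_conf <- conf_mat(toy_results, truth = purchased, estimate = .pred_class)
toy_conf
```

In this toy example the matrix holds 2 true positives, 1 false negative, 1 false positive, and 4 true negatives.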
The metric_set() function
accuracy(), sens(), and spec() require the truth and estimate columns
roc_auc() requires truth and a column of estimated probabilities
The custom_metrics() function will require all three, with .pred_yes as the last argument
custom_metrics <- metric_set(accuracy, sens,
                             spec, roc_auc)
custom_metrics(last_fit_results,
               truth = purchased,
               estimate = .pred_class,
               .pred_yes)
# A tibble: 4 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy binary 0.759
2 sens binary 0.617
3 spec binary 0.840
4 roc_auc binary 0.763
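As a check on what these numbers mean, the sketch below applies the same metric set to a small hypothetical predictions tibble (eight invented rows, with "yes" as the first factor level and therefore the positive class under yardstick's default) and compares sensitivity and specificity with values computed by hand from the counts.

```r
library(tidymodels)

# Hypothetical predictions shaped like the output of collect_predictions()
toy_results <- tibble(
  .pred_yes = c(0.134, 0.729, 0.133, 0.0916, 0.598, 0.45, 0.81, 0.30),
  purchased = factor(c("no", "yes", "no", "no", "yes", "yes", "no", "no"),
                     levels = c("yes", "no"))
) %>%
  mutate(
    # Assumed rule for this sketch: predict "yes" when .pred_yes is at least 0.5
    .pred_class = factor(if_else(.pred_yes >= 0.5, "yes", "no"),
                         levels = c("yes", "no"))
  )

custom_metrics <- metric_set(accuracy, sens, spec, roc_auc)

toy_metrics <- custom_metrics(toy_results,
                              truth = purchased,
                              estimate = .pred_class,
                              .pred_yes)
toy_metrics

# Hand computation from the counts: 2 TP, 1 FN, 1 FP, 4 TN
hand_sens <- 2 / (2 + 1)  # true positives / all actual "yes"
hand_spec <- 4 / (4 + 1)  # true negatives / all actual "no"
```

The class metrics use only `truth` and `estimate`, while roc_auc ignores `.pred_class` and ranks rows by `.pred_yes`, which is why the metric set needs all three inputs.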