Modeling with tidymodels in R
David Svancer
Data Scientist
All yardstick functions require a tibble of model results
Column of actual outcome values: hwy for the mpg data
Column of model predictions: .pred
mpg_test_results
# A tibble: 57 x 3
hwy cty .pred
<int> <int> <dbl>
1 29 18 25.0
2 31 20 27.7
3 27 18 25.0
4 26 18 25.0
5 25 16 22.3
# ... with 47 more rows
RMSE estimates the average prediction error
rmse() from yardstick
truth is the column of actual outcome values
estimate is the column of predicted outcome values
mpg_test_results %>%
  rmse(truth = hwy, estimate = .pred)
# A tibble: 1 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 rmse standard 1.93
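To make the metric concrete, here is a minimal base R sketch of the RMSE calculation, the square root of the mean squared difference between actual and predicted values. It uses only the five rows printed in the tibble above, so its result will differ from the 1.93 reported for all 57 rows. The names hwy_shown, pred_shown and rmse_manual are hypothetical, chosen for illustration.

```r
# Assumption: only the five rows shown on the slide, not the full test set
hwy_shown  <- c(29, 31, 27, 26, 25)            # actual outcome values
pred_shown <- c(25.0, 27.7, 25.0, 25.0, 22.3)  # model predictions (.pred)

# RMSE: square root of the mean squared prediction error
errors <- hwy_shown - pred_shown
rmse_manual <- sqrt(mean(errors^2))

print(rmse_manual)
```

Because RMSE is expressed in the units of the outcome, it can be read here as a typical prediction error in highway miles per gallon.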
Measures the squared correlation between actual and predicted values
rsq() from yardstick
mpg_test_results %>%
  rsq(truth = hwy, estimate = .pred)
# A tibble: 1 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 rsq standard 0.904
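The squared-correlation definition can be checked by hand. The sketch below, in base R, squares the Pearson correlation between actual and predicted values for the five rows printed earlier on the slides. It is an illustration on a subset, so it will not reproduce the 0.904 reported for the full test set. The names hwy_shown, pred_shown and rsq_manual are hypothetical.

```r
# Assumption: only the five rows shown on the slide, not the full test set
hwy_shown  <- c(29, 31, 27, 26, 25)            # actual outcome values
pred_shown <- c(25.0, 27.7, 25.0, 25.0, 22.3)  # model predictions (.pred)

# R squared as the squared Pearson correlation
rsq_manual <- cor(hwy_shown, pred_shown)^2

print(rsq_manual)
```

A value close to 1 means the predictions move almost in lockstep with the actual values. Being correlation-based, this metric does not by itself show whether the predictions are systematically too high or too low.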
Visualizing the R squared metric
Making an R squared plot with ggplot2
geom_point()
geom_abline()
coord_obs_pred()
ggplot(mpg_test_results, aes(x = hwy, y = .pred)) +
  geom_point() +
  geom_abline(color = 'blue', linetype = 2) +
  coord_obs_pred() +
  labs(title = 'R-Squared Plot',
       y = 'Predicted Highway MPG',
       x = 'Actual Highway MPG')
The last_fit() function
lm_last_fit <- lm_model %>%
  last_fit(hwy ~ cty,
           split = mpg_split)
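last_fit() trains the model on the training portion of the split and evaluates it on the test portion. The base R sketch below walks through those same steps by hand. It is only an illustration under stated assumptions: it uses the built-in mtcars data as a stand-in (the slides use mpg), a hypothetical 75/25 random split, and the formula mpg ~ wt. None of these come from the slides.

```r
# Stand-in data: built-in mtcars (assumption; the slides use the mpg data)
set.seed(123)  # arbitrary seed, only for reproducibility

# Step 1: split rows into training and test sets (assumed 75/25)
n <- nrow(mtcars)
train_rows <- sample(n, size = floor(0.75 * n))
train <- mtcars[train_rows, ]
test  <- mtcars[-train_rows, ]

# Step 2: fit the linear model on the training data only
fit <- lm(mpg ~ wt, data = train)

# Step 3: predict on the held-out test data
test_pred <- predict(fit, newdata = test)

# Step 4: compute the two metrics on the test set
metrics <- data.frame(
  .metric   = c("rmse", "rsq"),
  .estimate = c(sqrt(mean((test$mpg - test_pred)^2)),
                cor(test$mpg, test_pred)^2)
)
print(metrics)
```

The point of the sketch is the order of operations: the test rows play no part in fitting and are used only once, for the final evaluation.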
The collect_metrics() function
Takes the result of last_fit()
lm_last_fit %>%
  collect_metrics()
# A tibble: 2 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 rmse standard 1.93
2 rsq standard 0.904
The collect_predictions() function
Takes the result of last_fit(); predictions are in the .pred column
lm_last_fit %>%
  collect_predictions()
# A tibble: 57 x 4
id .pred .row hwy
<chr> <dbl> <int> <int>
1 train/test split 25.0 1 29
2 train/test split 27.7 3 31
3 train/test split 25.0 7 27
4 train/test split 25.0 8 26
5 train/test split 22.3 9 25
# ... with 47 more rows