Modeling with tidymodels in R
David Svancer
Data Scientist
Predicting hwy using cty as a predictor
$$hwy = \beta_{0} + \beta_{1} cty$$
Model parameters
Estimated parameters from training data
$$hwy = 0.77 + 1.35(cty)$$
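As a quick check of how the estimated equation is used, the sketch below plugs a few city mileage values into it by hand. The coefficients are the rounded values from the slide, and the `cty` values are chosen only for illustration, so the results can differ slightly from what a fitted model reports.

```r
# Rounded parameter estimates taken from the equation above
b0 <- 0.77  # intercept
b1 <- 1.35  # slope for cty

# Illustrative city mileage values (assumed, not from a specific dataset row)
cty <- c(16, 18, 20)

# Apply hwy = b0 + b1 * cty to each value
hwy_hat <- b0 + b1 * cty
hwy_hat
```

Each additional mile per gallon in the city adds about 1.35 to the predicted highway mileage.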
Model formulas in parsnip
General form
outcome ~ predictor_1 + predictor_2 + ...
Shorthand notation
outcome ~ .
Predicting hwy using cty as a predictor variable
hwy ~ cty
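The two formula styles can be inspected with base R before any model is fit. The sketch below uses a small hypothetical data frame named `toy` (not part of the course data) to show which variables each formula refers to.

```r
# Hypothetical toy data frame, used only to show how formulas are read
toy <- data.frame(
  hwy   = c(29, 31, 27),
  cty   = c(18, 20, 18),
  displ = c(1.8, 2.0, 2.8)
)

# General form: outcome on the left, predictors on the right
f_single <- hwy ~ cty
all.vars(f_single)

# Shorthand: the dot stands for every column except the outcome
f_all <- hwy ~ .
predictors <- attr(terms(f_all, data = toy), "term.labels")
predictors
```

With the shorthand, the set of predictors depends on the columns of the data passed in, so adding a column to the data changes the model.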
Unified syntax for model specification in R
Specify the model type
Specify the engine
Specify the mode
Define the model specification with parsnip
Model type: linear_reg()
Pass lm_model to the fit() function
Model formula: hwy ~ cty
data: the dataset to use for model fitting
lm_model <- linear_reg() %>%
  set_engine('lm') %>%
  set_mode('regression')
lm_fit <- lm_model %>%
  fit(hwy ~ cty, data = mpg_training)
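The slides do not show how `mpg_training` was created. A minimal end-to-end sketch follows, assuming the data is `ggplot2::mpg` and that it was split with rsample; the seed, the 75% proportion, and the stratification are assumptions made for illustration only.

```r
library(tidymodels)

# Assumption: the course data is the mpg dataset shipped with ggplot2
mpg <- ggplot2::mpg

# Assumed split settings; the seed and proportion are arbitrary choices
set.seed(123)
mpg_split <- initial_split(mpg, prop = 0.75, strata = hwy)
mpg_training <- training(mpg_split)
mpg_test <- testing(mpg_split)

# Model specification: type, engine, mode
lm_model <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

# Train the model on the training data only
lm_fit <- lm_model %>%
  fit(hwy ~ cty, data = mpg_training)
```

Because the split here is a guess, the fitted coefficients will generally not match the slide values exactly.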
The tidy() function
Takes a trained parsnip model object and returns a model summary as a tibble
The term and estimate columns provide the estimated parameters
tidy(lm_fit)
# A tibble: 2 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)    0.769    0.528       1.46 1.47e- 1
2 cty            1.35     0.0305     44.2  6.32e-97
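For a linear model fit with the lm engine, the statistic column is the estimate divided by its standard error. The sketch below re-enters the printed values by hand in a data frame named `coef_tbl` (a hypothetical name) to confirm that relationship; because the printed values are rounded, the ratios only match approximately.

```r
# Values copied from the tidy() output above (rounded as printed)
coef_tbl <- data.frame(
  term      = c("(Intercept)", "cty"),
  estimate  = c(0.769, 1.35),
  std.error = c(0.528, 0.0305)
)

# t statistic = estimate / standard error
coef_tbl$ratio <- coef_tbl$estimate / coef_tbl$std.error
coef_tbl
```

The large ratio for cty is what drives its very small p-value, while the intercept is not clearly distinguishable from zero.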
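Pass the trained parsnip model to the predict() function
new_data specifies the dataset on which to predict new values
Standardized output from predict()
Returns a tibble with one row per row of the new_data input, in the same order
Predictions are stored in the .pred column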
hwy_predictions <- lm_fit %>% predict(new_data = mpg_test)
hwy_predictions
# A tibble: 57 x 1
.pred
<dbl>
1 25.0
2 27.7
3 25.0
4 25.0
5 22.3
# ... with 47 more rows
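The same predict() call works on any data frame that contains the predictor columns. The sketch below is self-contained and makes two assumptions: the model is fit on all of `ggplot2::mpg` as a stand-in for the training set, and `new_cars` is a hypothetical tibble of unseen cars.

```r
library(tidymodels)

# Stand-in fit on the full mpg data (assumption; the slides use mpg_training)
lm_fit <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression") %>%
  fit(hwy ~ cty, data = ggplot2::mpg)

# Hypothetical new observations; only the predictor column is required
new_cars <- tibble(cty = c(16, 18, 20))

# One prediction per input row, returned in a .pred column
new_preds <- lm_fit %>% predict(new_data = new_cars)
new_preds
```

The outcome column does not need to be present in new_data, which is what makes the trained model usable on future observations.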
The bind_cols() function
Steps
Select hwy and cty from mpg_test
Pass the result to bind_cols() and add the predictions column
mpg_test_results <- mpg_test %>%
  select(hwy, cty) %>%
  bind_cols(hwy_predictions)
mpg_test_results
# A tibble: 57 x 3
    hwy   cty .pred
  <int> <int> <dbl>
1    29    18  25.0
2    31    20  27.7
3    27    18  25.0
4    26    18  25.0
5    25    16  22.3
# ... with 47 more rows
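bind_cols() matches rows purely by position, which is why the standardized row order of predict() matters. The sketch below rebuilds the first five rows shown above by hand (the names `actual` and `preds` are hypothetical) and adds a residual column to show what the combined table makes possible.

```r
library(dplyr)

# First five rows of the results above, re-entered by hand
actual <- tibble(
  hwy = c(29L, 31L, 27L, 26L, 25L),
  cty = c(18L, 20L, 18L, 18L, 16L)
)
preds <- tibble(.pred = c(25.0, 27.7, 25.0, 25.0, 22.3))

# Rows are paired by position, not by any key column
results <- actual %>%
  bind_cols(preds) %>%
  mutate(residual = hwy - .pred)
results
```

If either table were sorted or filtered before binding, the actual and predicted values would be silently misaligned, so bind the predictions before any reordering.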