Modeling with Data in the Tidyverse
Albert Y. Kim
Assistant Professor of Statistical and Data Sciences
Making predictions using the values in the estimate column of the regression table below:
# Fit regression model and get regression table
model_price_3 <- lm(log10_price ~ log10_size + condition,
                    data = house_prices)
get_regression_table(model_price_3)
# A tibble: 6 x 7
  term       estimate std_error statistic p_value lower_ci...
  <chr>         <dbl>     <dbl>     <dbl>   <dbl>    <dbl>...
1 intercept     2.88      0.036      80.0       0    2.81...
2 log10_size    0.837     0.006     134.        0    0.825...
...
# Create data frame of "new" houses
# tibble() replaces the deprecated data_frame()
new_houses <- tibble(
  log10_size = c(2.9, 3.6),
  condition = factor(c(3, 4))
)
new_houses
# A tibble: 2 x 2
  log10_size condition
       <dbl> <fct>
1        2.9 3
2        3.6 4
# Make predictions on new data
get_regression_points(model_price_3,
                      newdata = new_houses)
# A tibble: 2 x 4
     ID log10_size condition log10_price_hat
  <int>      <dbl> <fct>               <dbl>
1     1        2.9 3                    5.34
2     2        3.6 4                    5.94
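To see how a fitted value follows from the estimate column, the sketch below rebuilds one prediction by hand and compares it with base R's predict(). It is an illustration only: the data are made up (toy, toy_model and new_house are hypothetical names, not objects from the course), and predict() is used here in place of get_regression_points() so that the example needs no extra packages.

```r
# Hypothetical toy data standing in for house_prices (all values are made up)
set.seed(1)
toy <- data.frame(
  log10_size = runif(60, min = 2.5, max = 4),
  condition  = factor(rep(1:5, times = 12))
)
toy$log10_price <- 2.9 + 0.8 * toy$log10_size +
  0.03 * as.integer(toy$condition) + rnorm(60, sd = 0.05)

# Same model form as model_price_3, fitted to the toy data
toy_model <- lm(log10_price ~ log10_size + condition, data = toy)
b <- coef(toy_model)

# One "new" house: log10_size of 2.9, condition 3
new_house <- data.frame(
  log10_size = 2.9,
  condition  = factor(3, levels = levels(toy$condition))
)

# By hand: intercept + slope * log10_size + offset for condition 3
by_hand <- unname(b["(Intercept)"] + b["log10_size"] * 2.9 + b["condition3"])

# Same prediction from predict()
from_predict <- unname(predict(toy_model, newdata = new_house))

all.equal(by_hand, from_predict)
```

Condition 1 is the baseline level, so a house in condition 1 would use the intercept and slope only, with no offset added.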
# Make predictions in original units by undoing log10()
get_regression_points(model_price_3,
                      newdata = new_houses) %>%
  mutate(price_hat = 10^log10_price_hat)
# A tibble: 2 x 5
     ID log10_size condition log10_price_hat price_hat
  <int>      <dbl> <fct>               <dbl>     <dbl>
1     1        2.9 3                    5.34   219786.
2     2        3.6 4                    5.94   870964.
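The printout shows log10_price_hat to only three significant digits, so raising 10 to the displayed 5.34 will not reproduce the first price_hat exactly. The arithmetic check below assumes the stored fitted value for the first house is about 5.342; that figure is inferred from the printed price_hat and is not stated in the slides.

```r
# Back-transform using the value as displayed (5.34)
from_displayed <- 10^5.34    # roughly 218776, a little below the printed 219786

# Back-transform using the assumed stored value (5.342)
from_assumed <- 10^5.342     # roughly 219786, in line with the printed price_hat

# The second house's displayed value needs no adjustment
second_house <- 10^5.94      # roughly 870964

c(from_displayed, from_assumed, second_house)
```

A difference of 0.002 on the log10 scale moves the price by about 1,000 dollars here, so back-transform the stored value rather than a figure read off the printout.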