Effect of an outlier

Inference for Linear Regression in R

Jo Hardin

Professor, Pomona College

Different regression lines

Different regression models

library(dplyr)  # provides %>% and filter()
library(broom)  # provides tidy()

# Keep only items with Fiber < 15, which drops the high-fiber outlier
starbucks_lowFib <- starbucks %>% filter(Fiber < 15)

# Model fit to the full dataset
lm(Protein ~ Fiber, data = starbucks) %>% tidy()
         term estimate std.error statistic      p.value
1 (Intercept) 7.526138 0.9924180  7.583637 1.101756e-11
2       Fiber 1.383684 0.2451395  5.644476 1.286752e-07

# Model fit to the low fiber dataset
lm(Protein ~ Fiber, data = starbucks_lowFib) %>% tidy()
         term estimate std.error statistic      p.value
1 (Intercept) 6.537053 1.0633640  6.147521 1.292803e-08
2       Fiber 1.796844 0.2995901  5.997675 2.600224e-08
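
The shift in slope between the two fits can be reproduced on made-up data. The sketch below is only an illustration: it assumes a simulated stand-in for the starbucks data (the values, the seed, and the names `sim` and `sim_lowFib` are hypothetical), adds one high-fiber point that sits well below the trend, and compares the fitted slopes with and without it.

```r
# Simulated stand-in for the starbucks data (hypothetical values, not real menu items)
set.seed(47)
n <- 100
fiber <- runif(n, 0, 7)
protein <- 6.5 + 1.8 * fiber + rnorm(n, sd = 4)

# Append a single high-fiber point that lies far below the trend
sim <- data.frame(Fiber = c(fiber, 21), Protein = c(protein, 20))

# Same filtering rule as on the slide: keep rows with Fiber < 15
sim_lowFib <- subset(sim, Fiber < 15)

# Fit both models and extract the Fiber slope from each
slope_full <- coef(lm(Protein ~ Fiber, data = sim))[["Fiber"]]
slope_lowFib <- coef(lm(Protein ~ Fiber, data = sim_lowFib))[["Fiber"]]

c(full = slope_full, lowFib = slope_lowFib)
```

In this simulation the single extreme point pulls the full-data slope toward zero, the same direction of change seen in the starbucks output above.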

Different regression randomization tests

Full dataset
perm_slope %>%
  mutate(abs_perm_slope = abs(stat)) %>%
  summarize(p_value = mean(abs_perm_slope > abs(obs_slope)))
# A tibble: 1 x 1
  p_value
    <dbl>
1       0
Low fiber dataset
perm_slope_lowFib %>%
  mutate(abs_perm_slope = abs(stat)) %>%
  summarize(p_value = mean(abs_perm_slope > abs(obs_slope_lowFib)))
# A tibble: 1 x 1
  p_value
    <dbl>
1       0
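
These slides do not show how `perm_slope` and `obs_slope` were built. The sketch below is one possible construction in base R on simulated data, not necessarily the approach used for the slides: the data, the seed, and the number of permutations (`reps`) are all assumptions. It shuffles the response to break any association with the predictor, recomputes the slope each time, and then applies the same two-sided p-value calculation as above.

```r
# Simulated stand-in data (hypothetical values, not the starbucks data)
set.seed(4747)
n <- 100
sim <- data.frame(Fiber = runif(n, 0, 7))
sim$Protein <- 6.5 + 1.8 * sim$Fiber + rnorm(n, sd = 4)

# Observed slope from the fitted model
obs_slope <- coef(lm(Protein ~ Fiber, data = sim))[["Fiber"]]

# Permutation distribution: shuffle Protein, then recompute the slope.
# For simple linear regression the least-squares slope is cov(x, y) / var(x).
reps <- 1000  # assumed number of permutations
perm_stat <- replicate(reps, {
  shuffled <- sample(sim$Protein)
  cov(sim$Fiber, shuffled) / var(sim$Fiber)
})
perm_slope <- data.frame(replicate = seq_len(reps), stat = perm_stat)

# Two-sided p-value: share of permuted slopes more extreme than the observed one
p_value <- mean(abs(perm_slope$stat) > abs(obs_slope))
p_value
```

A reported p-value of 0 means that none of the permuted slopes was as extreme as the observed slope, so the p-value is best read as smaller than 1 divided by the number of permutations rather than as exactly zero.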

Let's practice!
