Linear regression - the fundamental method

Supervised Learning in R: Regression

Nina Zumel and John Mount

Win-Vector LLC

Linear Regression

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ...$$

  • $y$ is linearly related to each $x_i$
  • Each $x_i$ contributes additively to $y$
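
As a minimal sketch of what the additive form means in practice, the snippet below simulates data from a known linear model and checks that `lm()` recovers the coefficients. The data frame `sim`, the variable names, and the coefficient values (2, 3, -1.5) are made up for illustration and do not come from the course data.

```r
# Simulate data from a known additive linear model.
# All names and coefficient values here are illustrative assumptions.
set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- 2 + 3 * x1 - 1.5 * x2 + rnorm(n, sd = 0.1)  # beta_0 = 2, beta_1 = 3, beta_2 = -1.5
sim <- data.frame(y, x1, x2)

# Fit y as a sum of separate contributions from x1 and x2.
fit <- lm(y ~ x1 + x2, data = sim)
coef(fit)  # estimates should land close to 2, 3, and -1.5
```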

Linear Regression in R: lm()

cmodel <- lm(temperature ~ chirps_per_sec, data = cricket)
  • formula: temperature ~ chirps_per_sec
  • data frame: cricket
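
The `cricket` data frame is not defined in these slides, so here is a hedged, runnable sketch of the same call pattern using `cars`, a data set that ships with base R. The model name `dmodel` and the choice of `dist ~ speed` are assumptions made only for illustration.

```r
# `cars` comes with base R (datasets package): columns `speed` and `dist`.
# This stands in for `cricket`, which is not available here.
dmodel <- lm(dist ~ speed, data = cars)

# The fitted object has class "lm" and can be used for prediction.
class(dmodel)
predict(dmodel, newdata = data.frame(speed = 15))
```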

Formulas

fmla_1 <- temperature ~ chirps_per_sec
fmla_2 <- blood_pressure ~ age + weight
  • LHS: outcome
  • RHS: inputs
    • use + for multiple inputs
  • a formula can also be built from a string with as.formula():
fmla_1 <- as.formula("temperature ~ chirps_per_sec")
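
Building a formula from strings is mostly useful when the variable names are held in character vectors. The sketch below shows two base R ways to do that; the variables `outcome` and `inputs` are hypothetical helpers introduced for illustration.

```r
# Variable names held as strings (hypothetical helper variables).
outcome <- "blood_pressure"
inputs  <- c("age", "weight")

# Option 1: paste the pieces together and convert with as.formula().
fmla_2 <- as.formula(paste(outcome, "~", paste(inputs, collapse = " + ")))

# Option 2: reformulate() from the stats package builds the same structure.
fmla_2b <- reformulate(inputs, response = outcome)

# Both formulas refer to the same variables, outcome first.
all.vars(fmla_2)
all.vars(fmla_2b)
```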

Looking at the Model

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... $$

cmodel
Call:
lm(formula = temperature ~ chirps_per_sec, data = cricket)

Coefficients:
   (Intercept)  chirps_per_sec  
        25.232           3.291
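
To connect the printed coefficients back to the equation, the sketch below plugs them in by hand: the intercept plays the role of $\beta_0$ and the `chirps_per_sec` coefficient plays the role of $\beta_1$. The function name `predict_temperature` and the input value 16.5 are illustrative assumptions; with a fitted model in hand, `predict()` would normally do this.

```r
# Coefficients copied from the printed model above.
b0 <- 25.232  # (Intercept)
b1 <- 3.291   # chirps_per_sec

# Hypothetical helper: apply y = b0 + b1 * x by hand.
predict_temperature <- function(chirps_per_sec) {
  b0 + b1 * chirps_per_sec
}

# Predicted temperature at an illustrative rate of 16.5 chirps per second.
predict_temperature(16.5)
```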

More Information about the Model

summary(cmodel)
Call:
lm(formula = temperature ~ chirps_per_sec, data = cricket)

Residuals:
   Min     1Q Median     3Q    Max 
-6.515 -1.971  0.490  2.807  5.001 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     25.2323    10.0601   2.508 0.026183 *  
chirps_per_sec   3.2911     0.6012   5.475 0.000107 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.829 on 13 degrees of freedom
Multiple R-squared:  0.6975, Adjusted R-squared:  0.6742 
F-statistic: 29.97 on 1 and 13 DF,  p-value: 0.0001067
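
Several numbers in this summary can be reproduced from each other, which helps when reading the table. The sketch below redoes the arithmetic from the printed (rounded) values, so results agree with the output only up to rounding. The variable names are illustrative.

```r
# Values copied from the printed summary above (already rounded).
estimate  <- 3.2911   # chirps_per_sec estimate
std_error <- 0.6012   # its standard error
r_squared <- 0.6975
df_resid  <- 13       # residual degrees of freedom
n_coef    <- 2        # intercept + one input
n_obs     <- df_resid + n_coef  # 15 observations

# t value = estimate / standard error.
t_value <- estimate / std_error

# Adjusted R-squared penalizes R-squared for the number of inputs.
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# With a single input, the F-statistic is the square of the t value.
f_stat <- t_value^2

# p-value of the printed F-statistic, from the upper tail of the F distribution.
p_value <- pf(29.97, df1 = 1, df2 = df_resid, lower.tail = FALSE)

c(t_value = t_value, adj_r_squared = adj_r_squared,
  f_stat = f_stat, p_value = p_value)
```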

More Information about the Model

broom::glance(cmodel)     # model-level statistics as a one-row data frame

sigr::wrapFTest(cmodel)   # F-test summary of the fit (R-squared, p-value)
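
For readers without `broom` or `sigr` installed, a rough base R sketch of the same model-level statistics is shown below. It uses the built-in `cars` data in place of `cricket`, and the names `m`, `s`, and `stats_row` are illustrative assumptions; the column names only loosely mirror what `broom::glance()` reports.

```r
# Stand-in model on a built-in data set (cricket is not available here).
m <- lm(dist ~ speed, data = cars)
s <- summary(m)

# summary()$fstatistic is a named vector: value, numdf, dendf.
f <- s$fstatistic
p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# Collect model-level statistics in a one-row data frame.
stats_row <- data.frame(
  r.squared     = s$r.squared,
  adj.r.squared = s$adj.r.squared,
  sigma         = s$sigma,          # residual standard error
  statistic     = f[["value"]],     # F-statistic
  p.value       = p_value
)
stats_row

# If broom happens to be installed, compare with its output.
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::glance(m))
}
```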

Let's practice!
