Working with model objects

Introduction to Regression in R

Richie Cotton

Data Evangelist at DataCamp

coefficients()

mdl_mass_vs_length <- lm(mass_g ~ length_cm, data = bream)
mdl_mass_vs_length

Call:
lm(formula = mass_g ~ length_cm, data = bream)

Coefficients:
(Intercept)    length_cm  
   -1035.35        54.55

coefficients(mdl_mass_vs_length)

(Intercept)   length_cm 
-1035.34757    54.54998
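The coefficients are the intercept and slope of the fitted line, so a prediction can be rebuilt by hand as intercept + slope * length. The sketch below is illustrative only: `bream_subset` and `mdl_subset` are names assumed for this example, and the data are just the first ten rows shown in the augment() output later in these slides, so the estimates will not match the full-data values above.

```r
# First ten rows of bream, copied from the augment() output in these slides.
# The full dataset has 35 rows, so these estimates differ from the slide values.
bream_subset <- data.frame(
  mass_g    = c(242, 290, 340, 363, 430, 450, 500, 390, 450, 500),
  length_cm = c(23.2, 24, 23.9, 26.3, 26.5, 26.8, 26.8, 27.6, 27.6, 28.5)
)
mdl_subset <- lm(mass_g ~ length_cm, data = bream_subset)

# Pull the intercept and slope out of the named coefficient vector
coeffs <- coefficients(mdl_subset)
intercept <- coeffs[["(Intercept)"]]
slope <- coeffs[["length_cm"]]

# Prediction by hand for a 25 cm fish
manual_prediction <- intercept + slope * 25

# The same prediction via predict()
model_prediction <- predict(mdl_subset, data.frame(length_cm = 25))

# Compare values only, ignoring the name that predict() attaches
print(all.equal(manual_prediction, unname(model_prediction)))
```

As a check against the slide itself: with the full-data coefficients above, -1035.34757 + 54.54998 * 23.2 is approximately 230.212, which is the first fitted value reported for the bream data.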

fitted()

Fitted values: predictions on the original dataset

fitted(mdl_mass_vs_length)

or equivalently

library(dplyr)

explanatory_data <- bream %>% 
  select(length_cm)

predict(mdl_mass_vs_length, explanatory_data)

        1         2         3         4         5 
 230.2120  273.8520  268.3970  399.3169  410.2269 
        6         7         8         9        10 
 426.5919  426.5919  470.2319  470.2319  519.3269 
       11        12        13        14        15 
 513.8719  530.2369  552.0569  573.8769  568.4219 
       16        17        18        19        20 
 568.4219  622.9719  622.9719  650.2468  655.7018 
       21        22        23        24        25 
 672.0668  677.5218  682.9768  699.3418  704.7968 
       26        27        28        29        30 
 699.3418  710.2518  748.4368  753.8918  792.0768 
       31        32        33        34        35 
 873.9018  873.9018  939.3617 1004.8217 1037.5517
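The equivalence between fitted() and predict() on the original explanatory data can be checked directly. This is a minimal sketch under the assumption that only the first ten bream rows (taken from the augment() output in these slides) are available; `bream_subset` and `mdl_subset` are hypothetical names, and base R subsetting stands in for dplyr's select().

```r
# First ten rows of bream, copied from the augment() output in these slides
bream_subset <- data.frame(
  mass_g    = c(242, 290, 340, 363, 430, 450, 500, 390, 450, 500),
  length_cm = c(23.2, 24, 23.9, 26.3, 26.5, 26.8, 26.8, 27.6, 27.6, 28.5)
)
mdl_subset <- lm(mass_g ~ length_cm, data = bream_subset)

# Predictions on the data the model was fitted to
fitted_values <- fitted(mdl_subset)

# The same predictions, passing the explanatory variable explicitly
explanatory_data <- bream_subset["length_cm"]
predicted_values <- predict(mdl_subset, explanatory_data)

# Compare the numeric values only, ignoring element names
values_match <- all.equal(unname(fitted_values), unname(predicted_values))
print(values_match)
```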

residuals()

Residuals: actual response values minus predicted response values

residuals(mdl_mass_vs_length)

or equivalently

bream$mass_g - fitted(mdl_mass_vs_length)

       1        2        3        4        5 
  11.788   16.148   71.603  -36.317   19.773 
       6        7        8        9       10 
  23.408   73.408  -80.232  -20.232  -19.327 
      11       12       13       14       15 
 -38.872  -30.237  -52.057 -233.877   31.578 
      16       17       18       19       20 
  31.578   77.028   77.028  -40.247   -5.702 
      21       22       23       24       25 
 -97.067    7.478  -62.977  -19.342   -4.797 
      26       27       28       29       30 
  25.658    9.748  -34.437   96.108  207.923 
      31       32       33       34       35 
  46.098   81.098  -14.362  -29.822  -87.552
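The definition of a residual (actual minus fitted) can be verified in a few lines. The sketch assumes a ten-row subset of bream copied from the augment() output in these slides; `bream_subset`, `mdl_subset`, and `manual_residuals` are names introduced for illustration.

```r
# First ten rows of bream, copied from the augment() output in these slides
bream_subset <- data.frame(
  mass_g    = c(242, 290, 340, 363, 430, 450, 500, 390, 450, 500),
  length_cm = c(23.2, 24, 23.9, 26.3, 26.5, 26.8, 26.8, 27.6, 27.6, 28.5)
)
mdl_subset <- lm(mass_g ~ length_cm, data = bream_subset)

# Residuals by hand: actual response minus fitted response
manual_residuals <- bream_subset$mass_g - fitted(mdl_subset)

# Compare with the residuals stored in the model object
residuals_match <- all.equal(unname(residuals(mdl_subset)), unname(manual_residuals))
print(residuals_match)

# For a least-squares fit with an intercept, the residuals sum to zero
# (up to floating point error)
residual_sum <- sum(residuals(mdl_subset))
print(residual_sum)
```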

summary()

summary(mdl_mass_vs_length)

Call:
lm(formula = mass_g ~ length_cm, data = bream)

Residuals:
   Min     1Q Median     3Q    Max 
-233.9  -35.4   -4.8   31.6  207.9 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1035.35     107.97   -9.59  4.6e-11 ***
length_cm      54.55       3.54   15.42  < 2e-16 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 74.2 on 33 degrees of freedom
Multiple R-squared:  0.878,    Adjusted R-squared:  0.874 
F-statistic:  238 on 1 and 33 DF,  p-value: <2e-16

summary(): call

Call:
lm(formula = mass_g ~ length_cm, data = bream)

summary(): residuals

Residuals:
   Min     1Q Median     3Q    Max 
-233.9  -35.4   -4.8   31.6  207.9

summary(): coefficients

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1035.35     107.97   -9.59  4.6e-11 ***
length_cm      54.55       3.54   15.42  < 2e-16 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(): model metrics

Residual standard error: 74.2 on 33 degrees of freedom
Multiple R-squared:  0.878,    Adjusted R-squared:  0.874 
F-statistic:  238 on 1 and 33 DF,  p-value: <2e-16
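Each part of the printed summary is also stored as an element of the object that summary() returns, so it can be extracted programmatically. A minimal sketch, again assuming the ten-row subset from the augment() output in these slides (`bream_subset`, `mdl_subset`, and `mdl_summary` are illustrative names):

```r
# First ten rows of bream, copied from the augment() output in these slides
bream_subset <- data.frame(
  mass_g    = c(242, 290, 340, 363, 430, 450, 500, 390, 450, 500),
  length_cm = c(23.2, 24, 23.9, 26.3, 26.5, 26.8, 26.8, 27.6, 27.6, 28.5)
)
mdl_subset <- lm(mass_g ~ length_cm, data = bream_subset)
mdl_summary <- summary(mdl_subset)

# Call: the code used to create the model
print(mdl_summary$call)

# Residuals: the raw values behind the five-number summary
print(quantile(mdl_summary$residuals))

# Coefficients: a matrix with estimate, standard error, t value and p-value
print(mdl_summary$coefficients)

# Model metrics
print(mdl_summary$sigma)          # residual standard error
print(mdl_summary$r.squared)      # multiple R-squared
print(mdl_summary$adj.r.squared)  # adjusted R-squared
print(mdl_summary$fstatistic)     # F-statistic with its degrees of freedom
```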

tidy()

library(broom)

tidy(mdl_mass_vs_length)

# A tibble: 2 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)  -1035.     108.       -9.59 4.58e-11
2 length_cm       54.5      3.54     15.4  1.22e-16

augment()

augment(mdl_mass_vs_length)

# A tibble: 35 × 8
   mass_g length_cm .fitted .resid   .hat .sigma .cooksd .std.resid
    <dbl>     <dbl>   <dbl>  <dbl>  <dbl>  <dbl>   <dbl>      <dbl>
 1    242      23.2    230.   11.8 0.144    75.3 0.00247      0.172
 2    290      24      274.   16.1 0.119    75.2 0.00364      0.232
 3    340      23.9    268.   71.6 0.122    74.1 0.0738       1.03 
 4    363      26.3    399.  -36.3 0.0651   75.0 0.00894     -0.507
 5    430      26.5    410.   19.8 0.0616   75.2 0.00248      0.275
 6    450      26.8    427.   23.4 0.0566   75.2 0.00317      0.325
 7    500      26.8    427.   73.4 0.0566   74.1 0.0311       1.02 
 8    390      27.6    470.  -80.2 0.0452   73.9 0.0291      -1.11 
 9    450      27.6    470.  -20.2 0.0452   75.2 0.00185     -0.279
10    500      28.5    519.  -19.3 0.0360   75.2 0.00132     -0.265
# ... with 25 more rows

glance()

glance(mdl_mass_vs_length)

# A tibble: 1 × 12
  r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl>  <dbl> <dbl> <dbl>
1     0.878         0.874  74.2      238. 1.22e-16     1  -199.  405.  409.
# ... with 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
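The three broom functions return the same quantities as the base R accessors, just arranged as data frames: tidy() for coefficient-level results, augment() for observation-level results, and glance() for model-level results. The sketch below checks that correspondence, assuming the broom package is installed and using the ten-row subset from the augment() output above; `bream_subset` and `mdl_subset` are illustrative names.

```r
library(broom)

# First ten rows of bream, copied from the augment() output in these slides
bream_subset <- data.frame(
  mass_g    = c(242, 290, 340, 363, 430, 450, 500, 390, 450, 500),
  length_cm = c(23.2, 24, 23.9, 26.3, 26.5, 26.8, 26.8, 27.6, 27.6, 28.5)
)
mdl_subset <- lm(mass_g ~ length_cm, data = bream_subset)

# Coefficient level: tidy() estimates match coefficients()
coef_match <- all.equal(
  tidy(mdl_subset)$estimate,
  unname(coefficients(mdl_subset))
)

# Observation level: augment() fitted values match fitted()
fitted_match <- all.equal(
  unname(augment(mdl_subset)$.fitted),
  unname(fitted(mdl_subset))
)

# Model level: glance() metrics match summary()
rsq_match <- all.equal(
  glance(mdl_subset)$r.squared,
  summary(mdl_subset)$r.squared
)

print(c(coef_match, fitted_match, rsq_match))
```

Because each result is a data frame, it can be filtered, joined, or plotted with the usual data manipulation tools, which is the main reason to prefer these functions over parsing the printed summary.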

Let's practice!
