Formulas in R

Generalized Linear Models in R

Richard Erickson

Instructor

Why care about formulas for multiple logistic regression?

  • Formulas are the backbone of regression
  • Can be tricky to figure out
  • Understanding model.matrix() is key

Slopes

  • Estimates coefficient for continuous variable
    • e.g., height = c(72.3, 21.1, 3.7, 1.0)
  • Formula also includes a global intercept by default
  • Multiple slopes: Slope for each predictor
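
As a minimal sketch of the slope case, the design matrix below has one column per continuous predictor plus the global intercept. The `height` values come from the slide; `weight` is a hypothetical second predictor added only for illustration.

```r
# Continuous predictors: each one contributes a single slope column.
height <- c(72.3, 21.1, 3.7, 1.0)   # values from the slide
weight <- c(180, 40, 2.5, 0.4)      # hypothetical second predictor

X <- model.matrix(~ height + weight)
colnames(X)   # "(Intercept)" "height" "weight"
```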

Intercepts

  • Discrete groups used to predict
  • factor or character in R: fish = c("red", "blue")
  • A single grouping variable has two options:
    • Reference intercept + contrast: y ~ x
    • Intercept for each group: y ~ x - 1
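
The two options can be compared by looking at the columns that each formula produces. This is a small sketch that reuses the `fish` vector from the slide; the object names `X_ref` and `X_grp` are made up for the example.

```r
fish <- c("red", "blue")

X_ref <- model.matrix(~ fish)       # reference intercept + contrast
X_grp <- model.matrix(~ fish - 1)   # one intercept per group

colnames(X_ref)   # "(Intercept)" "fishred"  (blue is the reference group)
colnames(X_grp)   # "fishblue" "fishred"
```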

Multiple intercepts

  • Estimates effect of each group compared to reference group
  • Reference group is the first level of the factor (alphabetical by default)
  • Default has one reference group per variable
    • y ~ x1 + x2
  • Can specify one variable to have an intercept estimated for each of its groups
    • y ~ x1 + x2 - 1
  • With - 1, the first variable in the formula has an intercept estimated for each group; later variables keep a reference group
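
A short sketch of how this plays out with two grouping variables. The factors `x1` and `x2` and their levels are hypothetical, chosen only to show which columns appear.

```r
x1 <- factor(c("a", "a", "b", "b"))   # hypothetical grouping variable
x2 <- factor(c("u", "v", "u", "v"))   # hypothetical grouping variable

X_default <- model.matrix(~ x1 + x2)       # one reference group per variable
X_no_int  <- model.matrix(~ x1 + x2 - 1)   # x1 gets an intercept per group

colnames(X_default)   # "(Intercept)" "x1b" "x2v"
colnames(X_no_int)    # "x1a" "x1b" "x2v"
```

Only the first variable in the formula is expanded to one column per group; `x2` still drops its reference level `u`.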

Dummy variables

  • Codes group membership
  • Used under the hood (i.e., model.matrix())
  • 0s and 1s for each group
  • Example input: colors = c("red", "blue")
  • Dummy variables for y ~ colors:
    • intercept = c(1, 1)
    • red = c(1, 0)
  • Dummy variables for y ~ colors - 1 :
    • red = c(1, 0)
    • blue = c(0, 1)
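
To make the 0/1 coding concrete, the sketch below builds the dummy variables for the no-intercept case by hand and checks them against what `model.matrix()` returns. The hand-built vectors `red` and `blue` are illustrative only; R does this internally.

```r
colors <- c("red", "blue")

# Hand-coded group membership: 1 if the observation is in the group, else 0.
red  <- as.numeric(colors == "red")    # c(1, 0)
blue <- as.numeric(colors == "blue")   # c(0, 1)

X <- model.matrix(~ colors - 1)

all(X[, "colorsred"]  == red)    # TRUE
all(X[, "colorsblue"] == blue)   # TRUE
```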

model.matrix()

  • model.matrix() does legwork for us
  • Foundation for formulas in R
colors <- c("red", "blue")
model.matrix( ~ colors)
  (Intercept) colorsred
1           1         1
2           1         0
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$colors
[1] "contr.treatment"
  • Order determined by factor order
  • Change the order with the Tidyverse or factor()
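
One way to change the reference group in base R is to set the level order explicitly, or to use `relevel()`. This is a sketch under the assumption that "red" should become the reference; the names `colors_f` and `colors_r` are made up for the example.

```r
colors <- c("red", "blue")

# Option 1: state the level order directly; the first level is the reference.
colors_f <- factor(colors, levels = c("red", "blue"))
colnames(model.matrix(~ colors_f))   # "(Intercept)" "colors_fblue"

# Option 2: move one level to the front of an existing factor.
colors_r <- relevel(factor(colors), ref = "red")
levels(colors_r)                     # "red" "blue"
```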

Factor vs numeric caveat

  • R thinks variable is numeric
    • e.g., month = c(1, 2, 3)
month <- c( 1, 2, 3)
model.matrix( ~ month)
  (Intercept) month
1           1     1
2           1     2
3           1     3
attr(,"assign")
[1] 0 1
  • Need to specify factor or character
    • e.g., month = factor(c( 1, 2, 3))
month <- factor(c( 1, 2, 3))
model.matrix( ~ month)
  (Intercept) month2 month3
1           1      0      0
2           1      1      0
3           1      0      1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$month
[1] "contr.treatment"
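
The conversion can also be done inside the formula, which leaves the original numeric vector untouched. A minimal sketch, with `X_num` and `X_fac` as made-up object names:

```r
month <- c(1, 2, 3)   # stored as numeric

X_num <- model.matrix(~ month)           # treated as a slope: one column
X_fac <- model.matrix(~ factor(month))   # treated as groups: one column per non-reference level

colnames(X_num)   # "(Intercept)" "month"
colnames(X_fac)   # "(Intercept)" "factor(month)2" "factor(month)3"
```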

Let's practice!
