Assumptions of multiple logistic regression

Generalized Linear Models in R

Richard Erickson

Instructor

Assumptions

  • Limitations also apply to Poisson and other GLMs
  • Important assumptions and pitfalls:
    • Simpson's paradox (a pitfall caused by a missing predictor)
    • Linear, monotonic relationship on the link scale
    • Independence
    • Overdispersion

Example Simpson's paradox

Example plot: a single line fitted to all the data, showing what happens when the relevant grouping is ignored.

Example plot: two lines, one per group, demonstrating Simpson's paradox.

Simpson's paradox

Key points

  • An important predictor is missing from the model
  • Including it changes the conclusion (the trend can even reverse)
  • Easy to visualize with lm()
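
The key points above can be illustrated with a small simulation. This is only a sketch on made-up data: the variable names (`x`, `y`, `group`) and all numeric values are assumptions chosen for illustration, not data from the course. Within each group `y` falls as `x` rises, yet a single `lm()` fit that ignores `group` reports a rising trend.

```r
# Simulated data (hypothetical): two groups with the same negative
# within-group slope, but group "b" sits higher and further right.
set.seed(42)
n <- 100
dat <- data.frame(
  group = rep(c("a", "b"), each = n),
  x = c(rnorm(n, mean = 2), rnorm(n, mean = 6))
)
intercept <- ifelse(dat$group == "a", 2, 14)
dat$y <- intercept - 1 * dat$x + rnorm(2 * n, sd = 0.5)

# Model that ignores the grouping variable
fit_pooled <- lm(y ~ x, data = dat)

# Model that includes the grouping variable
fit_grouped <- lm(y ~ x + group, data = dat)

# Compare the slope on x in the two models
coef(fit_pooled)["x"]   # positive: the pooled trend rises
coef(fit_grouped)["x"]  # negative: the within-group trend falls
```

Plotting `y` against `x` with one line for the pooled fit and one line per group would give a picture similar in spirit to the example slides.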

Simpson's paradox and admission data

Admissions data

  • University of California, Berkeley
  • Graduate admissions
  • Rate of admission by department and gender
  • Does bias exist?
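
A possible way to explore this question in R is sketched below. It assumes the built-in `UCBAdmissions` table from the `datasets` package is a suitable stand-in for the admissions data described above; the slides do not name the exact data object, so treat that choice, and the helper column `admitted`, as assumptions. The sketch fits one logistic regression with gender only and one with gender and department, then compares the gender coefficient.

```r
# UCBAdmissions ships with R as a 3-way table; convert it to a data frame
# with columns Admit, Gender, Dept and Freq (the count in each cell).
adm <- as.data.frame(UCBAdmissions)

# 1 = admitted, 0 = rejected (helper column added for this sketch)
adm$admitted <- as.integer(adm$Admit == "Admitted")

# Logistic regression ignoring department; Freq acts as a frequency weight
fit_gender <- glm(admitted ~ Gender, family = binomial,
                  weights = Freq, data = adm)

# Logistic regression that also includes department
fit_both <- glm(admitted ~ Gender + Dept, family = binomial,
                weights = Freq, data = adm)

# The sign of the gender effect changes once department is included
coef(fit_gender)["GenderFemale"]
coef(fit_both)["GenderFemale"]
```

In this sketch the coefficient for `GenderFemale` is negative when department is ignored and slightly positive once department is in the model, which is the pattern Simpson's paradox describes.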

Example plots of linear, monotonic data; non-linear, monotonic data; and non-linear, non-monotonic data.
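
To make the non-monotonic case concrete, the following sketch simulates a binary response whose probability peaks in the middle of the predictor range. The data and the names `fit_linear` and `fit_quad` are hypothetical. A logistic regression with only a linear term assumes the log-odds change monotonically with `x`, so it cannot capture this shape; adding a squared term is one simple (not the only) remedy.

```r
# Simulated data (hypothetical): probability of success is highest near
# x = 0 and falls off on both sides, so the relationship is non-monotonic.
set.seed(3)
n <- 1000
x <- runif(n, min = -3, max = 3)
y <- rbinom(n, size = 1, prob = plogis(2 - x^2))

# Linear term only: log-odds are forced to be a straight line in x
fit_linear <- glm(y ~ x, family = binomial)

# Linear plus squared term: log-odds may curve
fit_quad <- glm(y ~ x + I(x^2), family = binomial)

# Lower AIC indicates the better-supported model
AIC(fit_linear, fit_quad)
```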

Independence

Predictors

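  • If all predictors are independent, neither their order nor which others are included has much effect on estimates
  • If predictors are non-independent (correlated), each estimate depends on which other predictors are included, and sequential tests such as anova() depend on the order of terms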
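
The sketch below, on simulated data with hypothetical predictors `x1` and `x2`, is one way to check what changes when predictors are correlated. In this example the coefficients fitted by `glm()` come out the same whichever predictor is listed first; what differs is the sequential analysis-of-deviance table from `anova()`, and the estimate for `x1` when `x2` is left out of the model.

```r
# Simulated data (hypothetical): x2 is strongly correlated with x1
set.seed(1)
n <- 500
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
y <- rbinom(n, size = 1, prob = plogis(0.5 * x1 + 0.5 * x2))

# Same model, predictors entered in a different order
fit_12 <- glm(y ~ x1 + x2, family = binomial)
fit_21 <- glm(y ~ x2 + x1, family = binomial)

# Coefficients agree once matched by name
all.equal(coef(fit_12)[c("x1", "x2")], coef(fit_21)[c("x1", "x2")],
          tolerance = 1e-6)

# Sequential deviance credited to x1 depends on whether it enters first
dev_x1_first  <- anova(fit_12)["x1", "Deviance"]
dev_x1_second <- anova(fit_21)["x1", "Deviance"]
c(first = dev_x1_first, second = dev_x1_second)

# Dropping x2 changes the estimate for x1
fit_1 <- glm(y ~ x1, family = binomial)
c(with_x2 = coef(fit_12)[["x1"]], without_x2 = coef(fit_1)[["x1"]])
```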

Response

  • What is the unit of focus?
  • Individuals, groups, or groups of groups?
  • Test scores
    • Individual student?
    • Teacher? School? District?

Overdispersion

  • Too many zeros or ones (Binomial)
  • Too many zeros, or variance that is too large (Poisson)
  • Variance differs from what the assumed distribution implies
  • Beyond scope of this course
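
Although a full treatment is outside the course, a quick informal check for Poisson overdispersion can be sketched as follows. The data are simulated from a negative binomial distribution purely for illustration, and the names `fit_pois` and `dispersion` are assumptions of this sketch. The Pearson dispersion statistic should be close to 1 when the Poisson variance assumption holds; values well above 1 suggest overdispersion.

```r
# Simulated counts (hypothetical) with more variance than a Poisson allows
set.seed(7)
n <- 500
x <- rnorm(n)
mu <- exp(0.5 + 0.3 * x)
y <- rnbinom(n, mu = mu, size = 1)

# Fit a Poisson GLM even though the data are overdispersed
fit_pois <- glm(y ~ x, family = poisson)

# Pearson dispersion statistic: sum of squared Pearson residuals
# divided by the residual degrees of freedom
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)
dispersion
```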

Let's practice!
