Overview of logistic regression

Generalized Linear Models in R

Richard Erickson

Instructor

Scientist in laboratory

Sports analytics image

Online seller using logistic regression

Chapter overview

  • Overview of logistic regression
  • Inputs for logistic regression in R
  • Link functions

Why use logistic regression?

  • Binary data: (0/1)
  • Survival data: Alive/dead
  • Choices or behavior: Yes/No, Coke/Pepsi, etc.
  • Result: Pass/fail, Heads/tails, Win/lose, etc.

What is logistic regression?

GLM for the binomial family with its default (logit) link

Model of binary data

$Y \sim \text{Binomial}(p)$

Linked to linear equation

$\text{logit}(p) = \beta_0 + \beta_1 x$
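
To make the two pieces of the model concrete, the sketch below simulates data from it. The sample size, the coefficient values, and the range of `x` are illustrative assumptions, not values from the course. The linear predictor is mapped to a probability with base R's `plogis()` (the inverse logit), and the randomness in `y` comes from the binomial draw.

```r
# Simulate binary data from a logistic model (all numbers are illustrative)
set.seed(1)
n     <- 1000
beta0 <- -1     # assumed intercept
beta1 <- 0.5    # assumed slope
x     <- runif(n, min = -3, max = 3)

# Linear predictor on the logit scale
eta <- beta0 + beta1 * x

# Inverse logit turns the linear predictor into a probability
p <- plogis(eta)

# One binomial trial per observation gives a 0/1 response
y <- rbinom(n, size = 1, prob = p)

head(data.frame(x = x, p = p, y = y))
```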

Logit function

Logit defined as

$\text{logit}(p) = \text{log}\left(\frac{p}{1-p}\right)$

Inverse logit defined as

$\text{logit}^{-1}(x) = \frac{1}{1 + \text{exp}(-x)}$
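
As a quick check of these two definitions, here is a minimal sketch that writes them as R functions. The names `logit` and `inv_logit` are chosen for illustration; base R already provides the same transformations as `qlogis()` and `plogis()`.

```r
# Logit: maps a probability in (0, 1) to the whole real line
logit <- function(p) log(p / (1 - p))

# Inverse logit: maps any real number back to a probability
inv_logit <- function(x) 1 / (1 + exp(-x))

p <- c(0.1, 0.5, 0.9)

logit(p)              # log-odds for each probability
inv_logit(logit(p))   # round trip recovers p

# The hand-written versions agree with the base R equivalents
all.equal(logit(p), qlogis(p))
all.equal(inv_logit(c(-2, 0, 2)), plogis(c(-2, 0, 2)))
```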

How to run logistic regression

Function:

glm(y ~ x, data = dat, family = 'binomial')

Inputs:

y = c(0, 1, 0, 0, 1...)
y = c("yes", "no"...)
y = c("win", "lose"...)
# Or any 2-level factor
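
A minimal sketch of the call above, using simulated data (the data frame `dat`, its column names, and the coefficients used to simulate it are assumptions for illustration). It fits the same outcome twice, once coded as 0/1 and once as a 2-level factor. With a factor response, `glm()` treats the first level as failure and the other level as success, so the levels are set explicitly here.

```r
set.seed(42)

# Simulated predictor and 0/1 response (illustrative values only)
dat <- data.frame(x = rnorm(200))
dat$y_num <- rbinom(200, size = 1, prob = plogis(0.3 + 0.8 * dat$x))

# Same outcome as a 2-level factor; "no" is the first level, i.e. failure
dat$y_fac <- factor(ifelse(dat$y_num == 1, "yes", "no"),
                    levels = c("no", "yes"))

fit_num <- glm(y_num ~ x, data = dat, family = "binomial")
fit_fac <- glm(y_fac ~ x, data = dat, family = "binomial")

# Both codings describe the same model, so the estimates match
coef(fit_num)
all.equal(coef(fit_num), coef(fit_fac))
```

If the factor levels were ordered the other way, the fitted coefficients would flip sign, because the model would then describe the probability of "no".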

Riding the bus?

  • What makes people more likely to commute using a bus?
  • Response: rides the bus (Yes) or does not ride the bus (No)
  • Does the number of commuting days change the chance of riding the bus?
  • 2015 commuter data from Pittsburgh, PA, USA
  CommuteDays Bus
1           5 Yes
2           2  No
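
The question above could be approached as sketched below. The rows in `bus` are made up to mimic the layout of the two columns shown and are not the Pittsburgh data; only the column names `CommuteDays` and `Bus` come from the slide.

```r
# Made-up rows that copy the layout above; NOT the real commuter data
bus <- data.frame(
  CommuteDays = c(5, 2, 5, 3, 1, 4, 5, 2, 4, 3, 5, 1),
  Bus = factor(c("Yes", "No", "Yes", "No", "No", "Yes",
                 "No", "No", "No", "Yes", "Yes", "No"),
               levels = c("No", "Yes"))   # "No" = failure, "Yes" = success
)

# Logistic regression of bus use on number of commuting days
fit <- glm(Bus ~ CommuteDays, data = bus, family = "binomial")

# Coefficients are on the logit (log-odds) scale
summary(fit)$coefficients

# Predicted probability of riding the bus for 1 to 5 commuting days
new_days <- data.frame(CommuteDays = 1:5)
pred <- predict(fit, newdata = new_days, type = "response")
cbind(new_days, prob_bus = pred)
```

Using `type = "response"` returns probabilities; the default `type = "link"` would return values on the logit scale instead.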

Let's practice!
