How to build a GLM?

Generalized Linear Models in Python

Ita Cirovic Donev

Data Science Consultant

Components of the GLM

Formula representation for the random component

Components of the GLM

Formula representation for the random and systematic component

Components of the GLM

Formula representation for the random and systematic component with the illustration of the interaction effect.

Components of the GLM

Formula representation for the random and systematic component with the illustration of the interaction and curvilinear effect.

Components of the GLM

Formula representation for the random and systematic component and the link function
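In standard GLM notation, the three components can be written as follows. This is a sketch under assumed notation: the symbols $x_1, \dots, x_p$ (explanatory variables) and $\beta_0, \dots, \beta_p$ (coefficients) are conventional choices, not taken from the slides.

$$
\begin{aligned}
\text{Random component:} \quad & y \sim \text{exponential-family distribution with } E(y) = \mu \\
\text{Systematic component:} \quad & \eta = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p \\
\text{Link function:} \quad & g(\mu) = \eta
\end{aligned}
$$

With an interaction and a curvilinear effect in two variables, the systematic component would become, for example, $\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2$.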

Continuous $\rightarrow$ Linear Regression

Distribution plot of the continuous random variable.

Data type: continuous
Domain: $(-\infty,\infty)$
Examples: house price, salary, person's height

Family: Gaussian()
Link: identity
$g(\mu) = \mu = E(y)$

Model = Linear regression

Binary $\rightarrow$ Logistic regression

Distribution plot of the binary random variable.

Data type: binary
Domain: $\{0, 1\}$
Examples: True/False

Family: Binomial()
Link: logit

Model = Logistic regression

Count $\rightarrow$ Poisson regression

Distribution plot of the count data.

Data type: count
Domain: $\{0, 1, 2, \ldots\}$
Examples: number of votes, number of hurricanes

Family: Poisson()
Link: logarithm

Model = Poisson regression

Link functions

| Density | Link: $\eta = g(\mu)$ | Default link | glm(family=...) |
|---|---|---|---|
| Normal | $\eta = \mu$ | identity | Gaussian() |
| Poisson | $\eta = \log(\mu)$ | logarithm | Poisson() |
| Binomial | $\eta = \log[p/(1-p)]$ | logit | Binomial() |
| Gamma | $\eta = 1/\mu$ | inverse | Gamma() |
| Inverse Gaussian | $\eta = 1/\mu^2$ | inverse squared | InverseGaussian() |
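To make the table concrete, each link can be written as a plain function together with its inverse. This is an illustrative sketch in NumPy only; the dictionary `links` and the variable names are hypothetical and are not part of any library API.

```python
import numpy as np

# Each entry maps a link name to (g, g_inverse), following the table above.
links = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logarithm": (np.log, np.exp),
    "logit": (lambda p: np.log(p / (1 - p)), lambda eta: 1 / (1 + np.exp(-eta))),
    "inverse": (lambda mu: 1 / mu, lambda eta: 1 / eta),
    "inverse squared": (lambda mu: 1 / mu**2, lambda eta: 1 / np.sqrt(eta)),
}

# Means in (0, 1) are valid inputs for every link listed here.
mu = np.array([0.1, 0.5, 0.9])

# Applying g and then its inverse should recover the original mean.
round_trip_ok = {}
for name, (g, g_inv) in links.items():
    eta = g(mu)
    round_trip_ok[name] = bool(np.allclose(g_inv(eta), mu))
    print(name, eta)
```

The round trip shows the role of the link: the model is linear in $\eta$, and the inverse link maps predictions back to the scale of the mean.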

Benefits of GLMs

  • A unified framework for many different data distributions
    • Exponential family of distributions
  • Link function
    • Transforms the expected value of y
    • Lets the transformed mean be modeled as a linear combination of the explanatory variables
    • Many techniques from linear models apply to GLMs as well

Let's practice
