Generalized Linear Models in Python
Ita Cirovic Donev
Data Science Consultant
Examples:
UNGROUPED
GROUPED
Test outcome: $PASS=1$ or $FAIL=0$
Want to model
$P(y=1)=\beta_0 + \beta_1x_1$
$P(\text{Pass})=\beta_0 + \beta_1 \times \text{Hours of study}$
$f(z) = \frac{1}{(1+\exp(-z))}$
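A minimal sketch of why the logistic function helps here: a straight line in hours of study can produce values below 0 or above 1, while $f(z)$ always stays strictly between 0 and 1. The coefficients `beta0` and `beta1` are hypothetical values chosen only for illustration; they do not come from any fitted model.

```python
import math


def logistic(z):
    """Logistic function f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -0.5, 0.2

for hours in (0, 5, 10):
    linear = beta0 + beta1 * hours   # can fall outside [0, 1]
    squashed = logistic(linear)      # always strictly between 0 and 1
    print(hours, round(linear, 3), round(squashed, 3))
```

With these assumed coefficients the linear value is negative at 0 hours and larger than 1 at 10 hours, so it cannot be read as a probability, whereas the logistic output can.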
$$ \text{ODDS} = \frac{\text{event occurring}}{\text{event NOT occurring}} $$
$$ \text{ODDS RATIO} = \frac{\text{odds}_1}{\text{odds}_2} $$
4 games: the event occurs in 3 and does not occur in 1
Odds are 3 to 1
$$ \text{odds} \neq \text{probability} $$
$$ \text{odds} = \frac{\text{probability}}{1-\text{probability}} $$
$$ \text{probability} = \frac{\text{odds}}{1+\text{odds}} $$
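The two conversion formulas can be checked with a short sketch. The function names are illustrative, and the second odds value used for the odds ratio is a made-up number.

```python
def odds_from_probability(p):
    """odds = probability / (1 - probability)."""
    return p / (1.0 - p)


def probability_from_odds(odds):
    """probability = odds / (1 + odds)."""
    return odds / (1.0 + odds)


# Odds of 3 to 1, as in the 4-games example.
odds_1 = 3 / 1
prob_1 = probability_from_odds(odds_1)
print(prob_1)  # 0.75, which is not the same number as the odds

# Converting back recovers the original odds.
print(odds_from_probability(prob_1))  # 3.0

# Odds ratio against a second, made-up odds value.
odds_2 = 1.5
print(odds_1 / odds_2)  # 2.0
```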
Step 1. Probability model
$E(y)=\mu=P(y=1)=\beta_0 + \beta_1x_1$
Step 2. Logistic function
$f(z) = \large{\frac{1}{(1+\exp(-z))}}$
Step 3. Apply logistic function $\rightarrow$ INVERSE-LOGIT
$\mu = \large{\frac{1}{1+\exp(-(\beta_0+\beta_1x_1))}} = \large{\frac{\exp(\beta_0+\beta_1x_1)}{1+\exp(\beta_0+\beta_1x_1)}}$
$1-\mu = \large{\frac{1}{1+\exp(\beta_0+\beta_1x_1)}}$
$$ \text{LOGIT}(\mu)=\log\left(\frac{\mu}{1-\mu}\right) = \beta_0+\beta_1x_1 $$
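The three steps can be verified numerically: applying the inverse-logit to the linear predictor gives $\mu$, and taking the logit of $\mu$ returns the linear predictor. This is a sketch with hypothetical values for `beta0`, `beta1`, and `x1`.

```python
import math


def inverse_logit(eta):
    """mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(mu):
    """LOGIT(mu) = log(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))


# Hypothetical coefficients and predictor value, for illustration only.
beta0, beta1, x1 = -2.0, 0.5, 6.0
eta = beta0 + beta1 * x1          # linear predictor

mu = inverse_logit(eta)           # Step 3: probability of y = 1
odds = mu / (1.0 - mu)            # equals exp(eta)

print(mu, odds, logit(mu))
```

The check that `odds` equals `exp(eta)` is the same statement as the LOGIT equation: the model is linear on the log-odds scale.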
Function - glm()
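import statsmodels.api as sm
from statsmodels.formula.api import glm

# Fit a logistic regression: Binomial family (default link is the logit)
model_GLM = glm(formula = 'y ~ x',
                data = my_data,
                family = sm.families.Binomial()).fit()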
Input
y = [0,1,1,0,...]
y = ['No','Yes','Yes',...]
y = ['Fail','Pass','Pass',...]
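When the response is stored as text labels, one option is to make the 0/1 coding explicit before fitting, so there is no doubt about which label counts as the event. The helper `encode_binary` is a hypothetical name introduced here for illustration; it is not part of any library.

```python
def encode_binary(labels, positive):
    """Return 1 where the label equals `positive`, otherwise 0."""
    return [1 if label == positive else 0 for label in labels]


# Only the values shown on the slide are used here.
y_yes_no = encode_binary(['No', 'Yes', 'Yes'], positive='Yes')
y_pass_fail = encode_binary(['Fail', 'Pass', 'Pass'], positive='Pass')

print(y_yes_no)     # [0, 1, 1]
print(y_pass_fail)  # [0, 1, 1]
```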