Generalized Linear Models in Python
Ita Cirovic Donev
Data Science Consultant
Importing statsmodels
import statsmodels.api as sm
Formula support
import statsmodels.formula.api as smf
Use glm() directly
from statsmodels.formula.api import glm
FORMULA-based
from statsmodels.formula.api import glm
model = glm(formula, data, family)
ARRAY-based
import statsmodels.api as sm
X = sm.add_constant(X)
model = sm.GLM(y, X, family)
$$\texttt{\color{#00A388}{response}} \sim \texttt{\color{#FF6138}{explanatory variable(s)}}$$ $$\texttt{\color{#00A388}{output}} \sim \texttt{\color{#FF6138}{input(s)}}$$
formula = 'y ~ x1 + x2'
Treat x1 as categorical
x1 and x2
family = sm.families.____()
The family functions:
Other families are listed on the statsmodels website.
print(model_GLM.summary())
                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:                      y   No. Observations:                  173
Model:                            GLM   Df Residuals:                      171
Model Family:                Binomial   Df Model:                            1
Link Function:                  logit   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -97.226
Date:                Mon, 21 Jan 2019   Deviance:                       194.45
Time:                        11:30:01   Pearson chi2:                     165.
No. Iterations:                     4   Covariance Type:             nonrobust
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept    -12.3508      2.629     -4.698      0.000     -17.503      -7.199
width          0.4972      0.102      4.887      0.000       0.298       0.697
==============================================================================
$\texttt{\color{#007AFF}{.params}}$ shows the coefficients
model_GLM.params
Intercept -12.350818
width 0.497231
dtype: float64
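Because the model above uses the Binomial family with a logit link, the coefficients live on the log-odds scale. The sketch below, using only the standard library, shows one way to turn them into an odds ratio and a predicted probability. The coefficient values are copied from the output above; the helper `predicted_probability` and the value `example_width` are hypothetical.

```python
import math

# Coefficients copied from the .params output above
intercept = -12.350818
slope_width = 0.497231

def predicted_probability(width):
    """Inverse logit of the linear predictor intercept + slope * width."""
    linear_predictor = intercept + slope_width * width
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Multiplicative change in the odds for a one-unit increase in width
odds_ratio = math.exp(slope_width)

example_width = 25.0  # hypothetical input value, not from the source
print(odds_ratio)
print(predicted_probability(example_width))
```

An odds ratio above 1 means the odds of the event increase as `width` grows, which matches the positive sign of the coefficient.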
$\texttt{\color{#007AFF}{.conf\_int(alpha=0.05, cols=None)}}$ shows the confidence intervals
model_GLM.conf_int()
0 1
Intercept -17.503010 -7.198625
width 0.297833 0.696629
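The interval bounds above can be approximately reproduced by hand. The sketch below assumes normal-based (Wald) intervals, which is consistent with the z statistics in the summary; `wald_interval` is a hypothetical helper, and the inputs are the rounded coefficient and standard error from the summary table, so the last digits differ slightly.

```python
from statistics import NormalDist

def wald_interval(coef, std_err, alpha=0.05):
    """Return (lower, upper) as coef -/+ z * std_err for a two-sided interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return coef - z * std_err, coef + z * std_err

# Rounded values taken from the summary table
intercept_ci = wald_interval(-12.3508, 2.629)
width_ci = wald_interval(0.4972, 0.102)

# A larger alpha gives a narrower interval (here 90% instead of 95%)
width_ci_90 = wald_interval(0.4972, 0.102, alpha=0.10)

print(intercept_ci)
print(width_ci)
print(width_ci_90)
```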
model_GLM.predict(test_data)
0 0.029309
1 0.470299
2 0.834983
3 0.972363
4 0.987941