Comparing models

Generalized Linear Models in Python

Ita Cirovic Donev

Data Science Consultant

Deviance

  • Formula $$ D = -2LL(\beta) $$

  • Measure of error

  • Lower deviance $\rightarrow$ better model fit
  • Benchmark for comparison is the null deviance $\rightarrow$ intercept-only model
  • Evaluate
    • Adding a random noise variable would, on average, decrease deviance by 1
    • Adding $p$ informative predictors to the model should decrease deviance by more than $p$
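
To make the formula concrete, the sketch below computes the deviance of a binary (Bernoulli) model by hand. The data `y`, the fitted probabilities `p_better`, and the helper names `log_likelihood` and `deviance` are made up for illustration and are not part of the course material. It assumes 0/1 responses, for which the saturated model's log-likelihood is zero, so $D = -2LL(\beta)$ holds exactly.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

def deviance(y, p):
    """Deviance D = -2 * LL for binary outcomes."""
    return -2 * log_likelihood(y, p)

# Hypothetical binary responses (illustration only)
y = [1, 0, 0, 1, 1, 0, 1, 1]

# Intercept-only (null) model: every observation gets the sample mean
p_null = sum(y) / len(y)
null_dev = deviance(y, [p_null] * len(y))

# Hypothetical fitted probabilities from a model that tracks y more closely
p_better = [0.8, 0.3, 0.2, 0.7, 0.9, 0.4, 0.6, 0.8]
model_dev = deviance(y, p_better)

print(f"Null deviance:  {null_dev:.4f}")
print(f"Model deviance: {model_dev:.4f}")
```

The model whose probabilities sit closer to the observed outcomes has the lower deviance, which is the sense in which lower deviance means better fit.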

Deviance in Python

Summary output of the model 'y ~ distance100' with highlight on the log-likelihood and the deviance statistic.

Compute deviance

  • Extract null deviance and deviance
    # Extract null deviance
    print(model.null_deviance)

    4118.0992

    # Extract model deviance
    print(model.deviance)

    4076.2378

  • Compute deviance using log likelihood
    print(-2 * model.llf)

    4076.2378

  • Adding distance100 reduces the deviance by 41.86 (4118.0992 - 4076.2378)
  • The reduction is far greater than 1, so including distance100 improved the fit
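
The comparison above can be written out as a short check. This is a minimal sketch that hard-codes the two values printed on the slide instead of refitting the model; the variable names `n_added`, `reduction`, and `improves_fit` are illustrative only.

```python
# Values reported by the fitted model on the slide
null_deviance = 4118.0992   # model.null_deviance (intercept-only model)
model_deviance = 4076.2378  # model.deviance (y ~ distance100)

# Number of predictors added relative to the null model
n_added = 1

# A pure-noise predictor is expected to lower deviance by about 1,
# so a useful set of predictors should lower it by more than n_added
reduction = null_deviance - model_deviance
improves_fit = reduction > n_added

print(f"Reduction in deviance: {reduction:.2f}")
print(f"Exceeds the benchmark of {n_added}: {improves_fit}")
```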

Model complexity

  • model_1 and model_2, where
    • $L_1 > L_2$
    • Number of parameters higher in model_2
  • model_2 is overfitting
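
The rule above can be sketched as a small check. This assumes $L_1$ and $L_2$ denote the two models' log-likelihoods evaluated on the same data, which the slide does not state explicitly. The function name `is_overfitting` and all numbers are hypothetical.

```python
def is_overfitting(llf_1, n_params_1, llf_2, n_params_2):
    """Return True if model 2 has more parameters than model 1
    but a lower log-likelihood (the slide's L1 > L2 condition)."""
    return n_params_2 > n_params_1 and llf_1 > llf_2

# Hypothetical log-likelihoods and parameter counts (illustration only)
llf_1, n_params_1 = -2038.1, 2
llf_2, n_params_2 = -2040.5, 4

print(is_overfitting(llf_1, n_params_1, llf_2, n_params_2))
```

With a fitted statsmodels result, the log-likelihood is available as `llf`, as used in the deviance computation earlier.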

Let's practice!
