One model with an interaction

Intermediate Regression with statsmodels in Python

Maarten Van den Broeck

Content Developer at DataCamp

What is an interaction?

In the fish dataset

  • Different fish species have different mass to length ratios.
  • The effect of length on the expected mass is different for different species.

 

More generally

The effect of one explanatory variable on the expected response changes depending on the value of another explanatory variable.

Specifying interactions

No interactions

response ~ explntry1 + explntry2

With interactions (implicit)

response ~ explntry1 * explntry2

With interactions (explicit)

response ~ explntry1 + explntry2 + explntry1:explntry2

No interactions

mass_g ~ length_cm + species

With interactions (implicit)

mass_g ~ length_cm * species

With interactions (explicit)

mass_g ~ length_cm + species + length_cm:species
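The implicit and explicit forms describe the same model, so they should give the same coefficients. The sketch below checks this on a small synthetic dataset. The `toy_fish` DataFrame, its values, and the species slopes are made up for illustration and are not the course's fish data.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Synthetic stand-in for the fish data (made-up values, NOT the course dataset).
rng = np.random.default_rng(0)
species = ["Bream"] * 20 + ["Perch"] * 20 + ["Pike"] * 20
assumed_slopes = {"Bream": 50.0, "Perch": 40.0, "Pike": 55.0}  # hypothetical
length_cm = rng.uniform(10, 40, size=len(species))
slope = np.array([assumed_slopes[s] for s in species])
mass_g = -500 + slope * length_cm + rng.normal(0, 20, size=len(species))
toy_fish = pd.DataFrame(
    {"mass_g": mass_g, "length_cm": length_cm, "species": species}
)

# Implicit interaction: "*" expands to both main effects plus the interaction.
implicit = ols("mass_g ~ length_cm * species", data=toy_fish).fit()

# Explicit interaction: main effects and the ":" term written out.
explicit = ols(
    "mass_g ~ length_cm + species + length_cm:species", data=toy_fish
).fit()

# Align by coefficient name, then compare the fitted values.
same = np.allclose(implicit.params.sort_index(), explicit.params.sort_index())
print(same)
```

Which form to use is a matter of taste: `*` is shorter, while the explicit form makes each term in the model visible.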

Running the model

from statsmodels.formula.api import ols

mdl_mass_vs_both = ols("mass_g ~ length_cm * species", data=fish).fit()

print(mdl_mass_vs_both.params)
Intercept                    -1035.3476
species[T.Perch]               416.1725
species[T.Pike]               -505.4767
species[T.Roach]               705.9714
length_cm                       54.5500
length_cm:species[T.Perch]     -15.6385
length_cm:species[T.Pike]       -1.3551
length_cm:species[T.Roach]     -31.2307

Easier to understand coefficients

mdl_mass_vs_both_inter = ols("mass_g ~ species + species:length_cm + 0", data=fish).fit()

print(mdl_mass_vs_both_inter.params)
species[Bream]             -1035.3476
species[Perch]              -619.1751
species[Pike]              -1540.8243
species[Roach]              -329.3762
species[Bream]:length_cm      54.5500
species[Perch]:length_cm      38.9115
species[Pike]:length_cm       53.1949
species[Roach]:length_cm      23.3193
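The two parameterizations carry the same information. In the `length_cm * species` output, Bream is the reference species, and each `[T.…]` coefficient is an offset from the Bream intercept or slope. The sketch below redoes that arithmetic with the printed coefficients. The variable names are illustrative only.

```python
# Coefficients copied from the "mass_g ~ length_cm * species" output.
intercept = -1035.3476   # Bream intercept (reference species)
slope = 54.5500          # Bream slope for length_cm
intercept_offsets = {"Perch": 416.1725, "Pike": -505.4767, "Roach": 705.9714}
slope_offsets = {"Perch": -15.6385, "Pike": -1.3551, "Roach": -31.2307}

# Each species' own intercept and slope is the reference value plus its offset.
per_species = {"Bream": (intercept, slope)}
for name in intercept_offsets:
    per_species[name] = (
        round(intercept + intercept_offsets[name], 4),
        round(slope + slope_offsets[name], 4),
    )

for name, (b0, b1) in per_species.items():
    print(f"{name}: intercept={b0:.4f}, slope={b1:.4f}")
```

The results agree with the coefficients printed for `mdl_mass_vs_both_inter`, which is why that formula is easier to read: each species gets its own intercept and slope directly.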

Familiar numbers

print(mdl_mass_vs_both_inter.params)
species[Bream]             -1035.3476
species[Perch]              -619.1751
species[Pike]              -1540.8243
species[Roach]              -329.3762
species[Bream]:length_cm      54.5500
species[Perch]:length_cm      38.9115
species[Pike]:length_cm       53.1949
species[Roach]:length_cm      23.3193
print(mdl_bream.params)
Intercept   -1035.3476
length_cm      54.5500
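The Bream intercept and slope from the interaction model match `mdl_bream`, which is presumably a model fit to the bream rows only (its definition is not shown here, so that is an assumption). The sketch below reproduces the comparison on synthetic data. `toy_fish`, `mdl_inter`, and `mdl_toy_bream` are hypothetical names, and the values are made up.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Synthetic two-species data (made-up values, NOT the course dataset).
rng = np.random.default_rng(1)
species = ["Bream"] * 25 + ["Perch"] * 25
length_cm = rng.uniform(10, 40, size=len(species))
true_slope = np.array([50.0 if s == "Bream" else 35.0 for s in species])
mass_g = -600 + true_slope * length_cm + rng.normal(0, 15, size=len(species))
toy_fish = pd.DataFrame(
    {"mass_g": mass_g, "length_cm": length_cm, "species": species}
)

# One model with an interaction: a separate intercept and slope per species.
mdl_inter = ols("mass_g ~ species + species:length_cm + 0", data=toy_fish).fit()

# A separate model fit to the Bream rows only (assumed analogue of mdl_bream).
toy_bream = toy_fish[toy_fish["species"] == "Bream"]
mdl_toy_bream = ols("mass_g ~ length_cm", data=toy_bream).fit()

print(mdl_inter.params[["species[Bream]", "species[Bream]:length_cm"]])
print(mdl_toy_bream.params)
```

The two printouts should show the same pair of numbers: the interaction model's coefficients are the same as those from fitting each species separately, collected in a single model.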

Let's practice!
