Intermediate Regression in R
Richie Cotton
Data Evangelist at DataCamp
This course assumes knowledge from Introduction to Regression in R.
Multiple regression is a regression model with more than one explanatory variable.
More explanatory variables can give more insight and better predictions.
mass_g | length_cm | species |
---|---|---|
242.0 | 23.2 | Bream |
5.9 | 7.5 | Perch |
200.0 | 30.0 | Pike |
40.0 | 12.9 | Roach |
mass_g is the response variable.
mdl_mass_vs_length <- lm(mass_g ~ length_cm, data = fish)
Call:
lm(formula = mass_g ~ length_cm, data = fish)
Coefficients:
(Intercept) length_cm
-536.2 34.9
mdl_mass_vs_species <- lm(mass_g ~ species + 0, data = fish)
Call:
lm(formula = mass_g ~ species + 0, data = fish)
Coefficients:
speciesBream speciesPerch speciesPike speciesRoach
617.8 382.2 718.7 152.0
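Because `+ 0` drops the intercept, each coefficient of a model with a single categorical explanatory variable is the mean response for that category. The following is a minimal sketch of that equivalence. It assumes a small made-up data frame named `toy` with invented values, not the course's `fish` dataset.

```r
# Toy data invented purely for illustration; NOT the course's fish dataset.
toy <- data.frame(
  mass_g  = c(100, 120, 140, 30, 50, 40),
  species = c("Bream", "Bream", "Bream", "Roach", "Roach", "Roach")
)

# With "+ 0" there is no intercept, so lm() estimates one coefficient per species.
mdl <- lm(mass_g ~ species + 0, data = toy)
coefs <- coefficients(mdl)

# Per-species mean mass, computed directly from the data.
group_means <- tapply(toy$mass_g, toy$species, mean)

print(coefs)        # named speciesBream, speciesRoach
print(group_means)  # named Bream, Roach; same values as the coefficients
```

Without `+ 0`, the first species would become the intercept and the other coefficients would be differences from it, which is harder to read.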
mdl_mass_vs_both <- lm(mass_g ~ length_cm + species + 0, data = fish)
Call:
lm(formula = mass_g ~ length_cm + species + 0, data = fish)
Coefficients:
length_cm speciesBream speciesPerch speciesPike speciesRoach
42.57 -672.24 -713.29 -1089.46 -726.78
coefficients(mdl_mass_vs_length)
(Intercept) length_cm
-536.2 34.9
coefficients(mdl_mass_vs_species)
speciesBream speciesPerch speciesPike speciesRoach
617.8 382.2 718.7 152.0
coefficients(mdl_mass_vs_both)
length_cm speciesBream speciesPerch speciesPike speciesRoach
42.57 -672.24 -713.29 -1089.46 -726.78
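One way to read the coefficients of the model with both variables is as a single shared slope for length_cm plus a separate intercept for each species. The sketch below does that arithmetic by hand, using the coefficient values printed above. The helper `predict_mass()` is a hypothetical name introduced only for this illustration.

```r
# Coefficients copied from the printed output of mdl_mass_vs_both.
coefs <- c(
  length_cm    = 42.57,
  speciesBream = -672.24,
  speciesPerch = -713.29,
  speciesPike  = -1089.46,
  speciesRoach = -726.78
)

# Hypothetical helper: predicted mass = shared slope * length + species intercept.
predict_mass <- function(length_cm, species) {
  intercept <- coefs[paste0("species", species)]
  unname(coefs["length_cm"] * length_cm + intercept)
}

predict_mass(30, "Bream")  # 42.57 * 30 - 672.24  = 604.86
predict_mass(30, "Pike")   # 42.57 * 30 - 1089.46 = 187.64
```

At any given length, the predicted masses of two species differ by the difference in their intercepts, here 417.22 g between bream and pike.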
library(ggplot2)
ggplot(fish, aes(length_cm, mass_g)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
ggplot(fish, aes(species, mass_g)) +
  geom_boxplot() +
  stat_summary(fun = mean, shape = 15)
library(moderndive)
ggplot(fish, aes(length_cm, mass_g, color = species)) +
  geom_point() +
  geom_parallel_slopes(se = FALSE)
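The parallel slopes lines can also be drawn with ggplot2 alone, by fitting the model and plotting its predictions. This is a rough sketch of that idea, not a description of how moderndive implements `geom_parallel_slopes()`. It assumes a made-up data frame `toy` with invented values and a hypothetical column name `predicted_mass_g`.

```r
library(ggplot2)

# Toy data invented for illustration; NOT the course's fish dataset.
toy <- data.frame(
  length_cm = c(20, 25, 30, 35, 10, 15, 20, 25),
  mass_g    = c(200, 330, 480, 600, 20, 90, 160, 250),
  species   = rep(c("Bream", "Roach"), each = 4)
)

# Parallel slopes model: one shared slope, one intercept per species.
mdl <- lm(mass_g ~ length_cm + species + 0, data = toy)

# Predictions on the original rows give the points each line passes through.
toy$predicted_mass_g <- predict(mdl)

plt <- ggplot(toy, aes(length_cm, mass_g, color = species)) +
  geom_point() +
  geom_line(aes(y = predicted_mass_g))
print(plt)
```

Since the model has a single length_cm coefficient, the line for each species has the same slope and only the intercepts differ.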