Introduction to Regression in R
Richie Cotton
Data Evangelist at DataCamp
has_churned | time_since_first_purchase | time_since_last_purchase |
---|---|---|
0 | 0.3993247 | -0.5158691 |
1 | -0.4297957 | 0.6780654 |
0 | 3.7383122 | 0.4082544 |
0 | 0.6032289 | -0.6990435 |
... | ... | ... |
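Column roles: has_churned is the response, time_since_first_purchase measures length of relationship, and time_since_last_purchase measures recency of activity.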
mdl_churn_vs_recency_lm <- lm(has_churned ~ time_since_last_purchase, data = churn)
Call:
lm(formula = has_churned ~ time_since_last_purchase, data = churn)
Coefficients:
(Intercept) time_since_last_purchase
0.49078 0.06378
coeffs <- coefficients(mdl_churn_vs_recency_lm)
intercept <- coeffs[1]
slope <- coeffs[2]
ggplot(
  churn,
  aes(time_since_last_purchase, has_churned)
) +
  geom_point() +
  geom_abline(intercept = intercept, slope = slope)
Predictions are probabilities of churn, not amounts of churn.
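ggplot(
  churn,
  # The column is time_since_last_purchase, matching the dataset and the model formula
  aes(time_since_last_purchase, has_churned)
) +
  geom_point() +
  geom_abline(intercept = intercept, slope = slope) +
  # Zoom out to show where the straight line leaves the 0 to 1 range
  xlim(-10, 10) +
  ylim(-0.2, 1.2)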
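The zoomed-out plot shows the problem with a straight line: far enough from the center, the predicted "probability" drops below 0 or rises above 1. A minimal sketch of that arithmetic, using the coefficients reported by lm() above; the three explanatory values are illustrative assumptions chosen to match the x-axis limits, not rows from churn.

```r
# Coefficients copied from the lm() output shown earlier
intercept <- 0.49078
slope <- 0.06378

# Hypothetical values of time_since_last_purchase (assumed for illustration)
explanatory <- c(-10, 0, 10)

# Linear model prediction: intercept + slope * x
linear_pred <- intercept + slope * explanatory
print(linear_pred)

# Flag predictions that cannot be valid probabilities
outside_unit_interval <- linear_pred < 0 | linear_pred > 1
print(outside_unit_interval)
```

The two extreme inputs give predictions outside the 0 to 1 range, which is why a linear model is a poor fit for a binary response.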
glm(has_churned ~ time_since_last_purchase, data = churn, family = gaussian)
Call: glm(formula = has_churned ~ time_since_last_purchase, family = gaussian,
data = churn)
Coefficients:
(Intercept) time_since_last_purchase
0.49078 0.06378
Degrees of Freedom: 399 Total (i.e. Null); 398 Residual
Null Deviance: 100
Residual Deviance: 98.02 AIC: 578.7
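The glm() call with family = gaussian reproduces the lm() coefficients, because a Gaussian family with its default identity link is ordinary linear regression. A small sketch to check this on simulated data; the toy data frame and its columns x and y are assumptions made up for this example, since churn is not bundled here.

```r
# Simulated stand-in for the churn data (hypothetical, for illustration only)
set.seed(1)
n <- 400
toy <- data.frame(x = rnorm(n))
toy$y <- rbinom(n, size = 1, prob = plogis(0.3 * toy$x))

# Fit the same formula with lm() and with a Gaussian glm()
mdl_lm <- lm(y ~ x, data = toy)
mdl_glm_gaussian <- glm(y ~ x, data = toy, family = gaussian)

# Compare coefficients up to numerical tolerance
coef_match <- isTRUE(all.equal(coef(mdl_lm), coef(mdl_glm_gaussian)))
print(coef_match)
```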
mdl_recency_glm <- glm(has_churned ~ time_since_last_purchase, data = churn, family = binomial)
Call: glm(formula = has_churned ~ time_since_last_purchase, family = binomial,
data = churn)
Coefficients:
(Intercept) time_since_last_purchase
-0.03502 0.26921
Degrees of Freedom: 399 Total (i.e. Null); 398 Residual
Null Deviance: 554.5
Residual Deviance: 546.4 AIC: 550.4
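With family = binomial, the coefficients are on the log-odds scale, so they must pass through the logistic function to become probabilities. A minimal sketch using the coefficients printed above; the explanatory values are illustrative assumptions, not rows from churn.

```r
# Coefficients copied from the binomial glm() output shown above
intercept_glm <- -0.03502
slope_glm <- 0.26921

# Hypothetical values of time_since_last_purchase (assumed for illustration)
explanatory <- c(-10, 0, 10)

# Linear predictor on the log-odds scale
log_odds <- intercept_glm + slope_glm * explanatory

# plogis() is the logistic function 1 / (1 + exp(-x)) from base R's stats package
prob_churn <- plogis(log_odds)
print(prob_churn)
```

Every value stays strictly between 0 and 1, even at the extremes. With the fitted model object available, predict(mdl_recency_glm, newdata, type = "response") should give the same probabilities, where newdata is a data frame with a time_since_last_purchase column.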
ggplot(
  churn,
  aes(time_since_last_purchase, has_churned)
) +
  geom_point() +
  geom_abline(
    intercept = intercept, slope = slope
  ) +
  geom_smooth(
    method = "glm",
    se = FALSE,
    method.args = list(family = binomial)
  )