Multiple logistic regression

Intermediate Regression in R

Richie Cotton

Data Evangelist at DataCamp

Bank churn dataset

has_churned  time_since_first_purchase  time_since_last_purchase
0             0.3993247                 -0.5158691
1            -0.4297957                  0.6780654
0             3.7383122                  0.4082544
0             0.6032289                 -0.6990435
...          ...                        ...
response     length of relationship     recency of activity
1 https://www.rdocumentation.org/packages/bayesQR/topics/Churn

glm()

glm(response ~ explanatory, data = dataset, family = binomial)
glm(response ~ explanatory1 + explanatory2, data = dataset, family = binomial)
glm(response ~ explanatory1 * explanatory2, data = dataset, family = binomial)
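
To make the three templates above concrete, here is a minimal sketch that fits each form. It assumes simulated stand-in data (not the real churn dataset) whose column names mirror the slide; the coefficients used to generate the response are arbitrary choices for illustration.

```r
# Simulated stand-in data (NOT the real churn dataset).
set.seed(1)
n <- 400
churn <- data.frame(
  time_since_first_purchase = rnorm(n),
  time_since_last_purchase = rnorm(n)
)
# Assumed relationship, chosen only so the models have something to fit.
log_odds <- -0.3 * churn$time_since_first_purchase +
  0.6 * churn$time_since_last_purchase
churn$has_churned <- rbinom(n, size = 1, prob = plogis(log_odds))

# One explanatory variable
mdl_single <- glm(
  has_churned ~ time_since_first_purchase,
  data = churn, family = binomial
)
# Two explanatory variables, no interaction
mdl_no_int <- glm(
  has_churned ~ time_since_first_purchase + time_since_last_purchase,
  data = churn, family = binomial
)
# Two explanatory variables, with interaction
mdl_int <- glm(
  has_churned ~ time_since_first_purchase * time_since_last_purchase,
  data = churn, family = binomial
)

# The `*` form adds one interaction coefficient on top of the `+` form.
names(coef(mdl_int))
```

The only difference between the last two calls is `+` versus `*`; the interaction model gains a single extra term, `time_since_first_purchase:time_since_last_purchase`.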

Prediction flow

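library(dplyr)  # mutate(), %>%
library(tidyr)  # expand_grid()

# Every combination of the chosen explanatory values
explanatory_data <- expand_grid(
  explanatory1 = some_values,
  explanatory2 = some_values
)
# type = "response" returns probabilities rather than log-odds
prediction_data <- explanatory_data %>% 
  mutate(
    has_churned = predict(mdl, explanatory_data, type = "response")
  )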
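
The same flow can be sketched without extra packages. This version swaps in base R's `expand.grid()` and `$<-` for `expand_grid()` and `mutate()`, and it assumes simulated stand-in data plus arbitrary grid values, so treat it as an illustration rather than the course's exact code.

```r
# Simulated stand-in data (NOT the real churn dataset).
set.seed(1)
n <- 400
churn <- data.frame(
  time_since_first_purchase = rnorm(n),
  time_since_last_purchase = rnorm(n)
)
log_odds <- -0.3 * churn$time_since_first_purchase +
  0.6 * churn$time_since_last_purchase
churn$has_churned <- rbinom(n, size = 1, prob = plogis(log_odds))

mdl <- glm(
  has_churned ~ time_since_first_purchase * time_since_last_purchase,
  data = churn, family = binomial
)

# Base R equivalent of expand_grid(): all combinations of the two sequences.
# The ranges and step size are assumptions made for this example.
explanatory_data <- expand.grid(
  time_since_first_purchase = seq(-2, 4, by = 0.5),
  time_since_last_purchase = seq(-1, 6, by = 0.5)
)

# Base R equivalent of mutate(): add the predicted probability as a column.
prediction_data <- explanatory_data
prediction_data$has_churned <- predict(
  mdl, explanatory_data, type = "response"
)

head(prediction_data)
```

Because `type = "response"` is set, the new `has_churned` column holds probabilities between 0 and 1, one per grid row.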

The four outcomes

                 actual false     actual true
predicted false  correct          false negative
predicted true   false positive   correct
1 https://campus.datacamp.com/courses/introduction-to-regression-in-r/simple-logistic-regression?ex=10

Confusion matrix

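library(ggplot2)    # autoplot() generic
library(yardstick)  # conf_mat() and its summary()/autoplot() methods

actual_response <- dataset$response
# Round fitted probabilities to 0 or 1 (threshold at 0.5)
predicted_response <- round(fitted(mdl))
outcomes <- table(predicted_response, actual_response)
confusion <- conf_mat(outcomes)
autoplot(confusion)
# event_level = "second" treats the second level (1) as the positive event
summary(confusion, event_level = "second")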
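
To show what the four outcomes contribute to the summary metrics, here is a sketch that computes accuracy, sensitivity, and specificity by hand from the 2x2 table. It uses base R only instead of `conf_mat()`, assumes simulated stand-in data, and treats 1 as the positive class, matching `event_level = "second"`.

```r
# Simulated stand-in data (NOT the real churn dataset).
set.seed(1)
n <- 400
churn <- data.frame(
  time_since_first_purchase = rnorm(n),
  time_since_last_purchase = rnorm(n)
)
log_odds <- -0.3 * churn$time_since_first_purchase +
  0.6 * churn$time_since_last_purchase
churn$has_churned <- rbinom(n, size = 1, prob = plogis(log_odds))

mdl <- glm(
  has_churned ~ time_since_first_purchase * time_since_last_purchase,
  data = churn, family = binomial
)

actual_response <- churn$has_churned
predicted_response <- round(fitted(mdl))

# Fixing the factor levels guarantees a 2x2 table even if one class
# is never predicted.
outcomes <- table(
  predicted_response = factor(predicted_response, levels = c(0, 1)),
  actual_response = factor(actual_response, levels = c(0, 1))
)

# Rows are predictions, columns are actual values.
TN <- outcomes["0", "0"]  # predicted false, actual false
FN <- outcomes["0", "1"]  # predicted false, actual true
FP <- outcomes["1", "0"]  # predicted true,  actual false
TP <- outcomes["1", "1"]  # predicted true,  actual true

accuracy <- (TN + TP) / sum(outcomes)  # share of correct predictions
sensitivity <- TP / (FN + TP)          # share of actual trues found
specificity <- TN / (TN + FP)          # share of actual falses found

c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity)
```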

Visualization

  • Use faceting for categorical variables.
  • For 2 numeric explanatory variables, use color for the response.
  • Give responses below 0.5 one color; responses above 0.5 another.
scale_color_gradient2(midpoint = 0.5)
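
As a sketch of the two-numeric-variable case, the plot below maps the predicted probability to color and centers the diverging scale at 0.5. It assumes ggplot2 is installed and uses simulated stand-in data with arbitrary grid values; the choice to plot a prediction grid with `geom_point()` is an assumption for this example.

```r
library(ggplot2)

# Simulated stand-in data (NOT the real churn dataset).
set.seed(1)
n <- 400
churn <- data.frame(
  time_since_first_purchase = rnorm(n),
  time_since_last_purchase = rnorm(n)
)
log_odds <- -0.3 * churn$time_since_first_purchase +
  0.6 * churn$time_since_last_purchase
churn$has_churned <- rbinom(n, size = 1, prob = plogis(log_odds))

mdl <- glm(
  has_churned ~ time_since_first_purchase * time_since_last_purchase,
  data = churn, family = binomial
)

# Grid of explanatory values with predicted probabilities.
prediction_data <- expand.grid(
  time_since_first_purchase = seq(-2, 4, by = 0.5),
  time_since_last_purchase = seq(-1, 6, by = 0.5)
)
prediction_data$has_churned <- predict(
  mdl, prediction_data, type = "response"
)

# Both numeric variables go on the axes; the response goes on color.
# midpoint = 0.5 puts the neutral color at the decision threshold.
p <- ggplot(
  prediction_data,
  aes(time_since_first_purchase, time_since_last_purchase,
      color = has_churned)
) +
  geom_point() +
  scale_color_gradient2(midpoint = 0.5)

print(p)
```

With the midpoint at 0.5, points predicted as "not churned" and "churned" fall on opposite sides of the color scale, so the decision boundary shows up as the pale band between them.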

Let's practice!