Logistic Regression Models

Machine Learning in the Tidyverse

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

Binary Classification

The attrition Dataset

Logistic Regression

glm(formula = ___, data = ___, family = "binomial")
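
As a minimal sketch of how the template above might be filled in, the following fits a logistic regression on R's built-in mtcars data. That dataset is not part of the original slides and is used here only for illustration; the binary outcome am (0 = automatic, 1 = manual) is modeled from the car weight wt.

```r
# Illustrative only: mtcars stands in for a real binary-classification dataset.
fit <- glm(formula = am ~ wt, data = mtcars, family = "binomial")

# Coefficients are reported on the log-odds scale.
coef(fit)

# type = "response" returns predicted probabilities that am == 1.
probs <- predict(fit, type = "response")
head(probs)
```

Without type = "response", predict() returns values on the log-odds (link) scale rather than probabilities.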

glm()

head(cv_data)
# A tibble: 5 x 4
  splits       id    train                   validate               
* <list>       <chr> <list>                  <list>                 
1 <S3: rsplit> Fold1 <data.frame [882 × 31]> <data.frame [221 × 31]>
2 <S3: rsplit> Fold2 <data.frame [882 × 31]> <data.frame [221 × 31]>
3 <S3: rsplit> Fold3 <data.frame [882 × 31]> <data.frame [221 × 31]>
4 <S3: rsplit> Fold4 <data.frame [883 × 31]> <data.frame [220 × 31]>
5 <S3: rsplit> Fold5 <data.frame [883 × 31]> <data.frame [220 × 31]>
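# Fit one logistic regression per fold, using that fold's training data.
# Attrition ~ . models Attrition from all other columns in the data frame.
cv_models_lr <- cv_data %>% 
  mutate(model = map(train, ~glm(formula = Attrition ~ ., 
                                 data = .x, family = "binomial")))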
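
The slide assumes cv_data already exists. The sketch below shows one way a table with the same columns could be built and then modeled end to end. It is an assumption rather than the course's exact code: it uses a small simulated data frame (toy, with a hypothetical outcome column left) in place of the attrition data, and rsample's analysis() and assessment() to extract each fold's train and validate sets.

```r
library(rsample)
library(dplyr)
library(purrr)

set.seed(42)

# Hypothetical toy data standing in for the attrition dataset.
toy <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
toy$left <- factor(ifelse(rbinom(200, 1, plogis(0.8 * toy$x1)) == 1, "Yes", "No"))

# Five-fold cross-validation: one row per fold, with list columns
# holding the training and validation data frames.
cv_data <- vfold_cv(toy, v = 5) %>%
  mutate(train = map(splits, ~analysis(.x)),
         validate = map(splits, ~assessment(.x)))

# Fit one logistic regression per fold on that fold's training data.
cv_models_lr <- cv_data %>%
  mutate(model = map(train, ~glm(formula = left ~ .,
                                 data = .x, family = "binomial")))

# Inspect the coefficients of the model fitted on the first fold.
coef(cv_models_lr$model[[1]])
```

Because the outcome is a factor, glm() treats its first level ("No" here) as the reference, so the fitted model describes the log-odds of "Yes".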

Time to Practice
