Dummy variables, missing data, and interactions

Supervised Learning in R: Classification

Brett Lantz

Instructor

Dummy coding categorical data

# create gender factor
my_data$gender <- factor(my_data$gender,
                         levels = c(0, 1, 2),
                         labels = c("Male", "Female", "Other"))
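
To see what dummy coding does with this factor, the sketch below applies the slide's factor() call to a small made-up data frame and asks model.matrix() for the 0/1 columns. The toy values in my_data are an assumption for illustration only.

```r
# Hypothetical toy data; the values are made up for illustration
my_data <- data.frame(gender = c(0, 1, 2, 1, 0))

# create gender factor, as on the slide
my_data$gender <- factor(my_data$gender,
                         levels = c(0, 1, 2),
                         labels = c("Male", "Female", "Other"))

# model.matrix() shows the dummy (0/1) columns R builds from a factor;
# the first level ("Male") is the reference and gets no column of its own
dummies <- model.matrix(~ gender, data = my_data)
print(dummies)
```

Modeling functions such as glm() perform this expansion automatically when a predictor is a factor, so the call to model.matrix() here is only for inspection.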

Imputing missing data
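
A minimal sketch of one common approach, assumed here rather than taken from the slide: mean imputation paired with a missing-value indicator. The age column, the new column names, and the data are hypothetical.

```r
# Hypothetical toy data with one missing age
my_data <- data.frame(age = c(25, 40, NA, 31))

# flag the rows where age is missing before filling them in
my_data$missing_age <- ifelse(is.na(my_data$age), 1, 0)

# replace missing ages with the mean of the observed ages
my_data$age_imputed <- ifelse(is.na(my_data$age),
                              mean(my_data$age, na.rm = TRUE),
                              my_data$age)
```

Keeping the indicator column lets a model pick up any pattern in which values are missing, which the imputed value alone would hide.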

Interaction effects

# interaction of obesity and smoking
glm(disease ~ obesity * smoking,
    data = health,
    family = "binomial")
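
In an R formula, obesity * smoking is shorthand for both main effects plus their interaction. The sketch below checks that expansion with terms() and then fits the model on simulated data; the health data frame here is randomly generated for illustration and is not the course's dataset.

```r
# the * operator expands to main effects plus the interaction term
term_labels <- attr(terms(disease ~ obesity * smoking), "term.labels")
print(term_labels)

# Hypothetical simulated data standing in for the health data frame
set.seed(42)
health <- data.frame(obesity = rbinom(200, 1, 0.4),
                     smoking = rbinom(200, 1, 0.3),
                     disease = rbinom(200, 1, 0.5))

# interaction of obesity and smoking
fit <- glm(disease ~ obesity * smoking,
           data = health,
           family = "binomial")
print(names(coef(fit)))
```

The obesity:smoking coefficient captures how the combined effect differs from the sum of the two separate effects on the log-odds scale.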

Let's practice!
