Survival analysis: introduction

Machine Learning for Marketing Analytics in R

Verena Pflieger

Data Scientist at INWT Statistics

Advantages of survival models

  • Less aggregation
  • Allows us to model when an event takes place
  • No arbitrarily set timeframe
  • Deeper insights into customer relations
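
To make the "no arbitrarily set timeframe" point concrete, here is a minimal sketch in base R. It uses the first ten tenure and churn values shown in the data overview of this deck. The helper `churned_within()` and the 12- and 36-month cutoffs are hypothetical, introduced only for illustration. It assumes tenure is measured in months.

```r
# Toy data: first ten tenure/churn values from the str() output of the course data
toy <- data.frame(
  tenure = c(2, 45, 2, 8, 22, 28, 62, 13, 16, 58),
  churn  = c(1, 0, 1, 1, 0, 1, 0, 0, 0, 0)
)

# Hypothetical helper: classification-style target for a fixed observation window
churned_within <- function(data, months) {
  as.integer(data$churn == 1 & data$tenure <= months)
}

# The binary target depends on the chosen cutoff ...
toy$churn12 <- churned_within(toy, 12)
toy$churn36 <- churned_within(toy, 36)

# ... whereas the pair (tenure, churn) keeps the timing of every event
toy
```

With a 12-month window the customer who churned after 28 periods counts as a non-churner, and with a 36-month window the same customer counts as a churner. A survival model works with the observed duration directly, so no such cutoff has to be chosen.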

Data for survival analysis

Classes 'tbl_df', 'tbl' and 'data.frame': 5311 obs. of  11 variables:
 $ customerID      : Factor w/ 7043 levels "0002-ORFBO","0003-MKNFE",..: 2565...
 $ gender          : Factor w/ 2 levels "Female","Male": 2 2 1 1 2 ...
 $ SeniorCitizen   : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 ...
 $ Partner         : Factor w/ 2 levels "No","Yes": 1 1 1 1 2 ...
 $ Dependents      : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 ...
 $ tenure          : num  2 45 2 8 22 28 62 13 16 58 ...
 $ StreamingMovies : Factor w/ 3 levels "No","No internet service",..: 1 1 ...
 $ PaperlessBilling: Factor w/ 2 levels "No","Yes": 2 1 2 2 1 ...
 $ PaymentMethod   : Factor w/ 4 levels "Bank transfer (automatic)", ...: 4...
 $ MonthlyCharges  : num  53.9 42.3 70.7 99.7 89.1 ...
 $ churn           : num  1 0 1 1 0 1 0 0 0 0 ...
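
The two columns that matter most for survival analysis are `tenure` (the observed duration) and `churn` (1 if the event occurred, 0 if the customer is still active, i.e. censored). As a hedged sketch, one common way to combine them is `Surv()` from the `survival` package. The five values below are copied from the output above, and the object name `survObj` is an assumption made for this example.

```r
library(survival)  # recommended package, shipped with standard R installations

# First five tenure/churn values from the str() output above
tenure <- c(2, 45, 2, 8, 22)
churn  <- c(1, 0, 1, 1, 0)

# Combine duration and event indicator into one survival object
survObj <- Surv(time = tenure, event = churn)

# Printing marks censored observations (churn == 0) with a trailing "+"
print(survObj)
```

Customers who have not churned still contribute information: their tenure is a lower bound on their true time to churn.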

Tenure time

library(dplyr)    # provides %>% and mutate()
library(ggplot2)

# Histogram of tenure, one panel per churn status
plotTenure <- dataSurv %>%
    mutate(churn = churn %>% factor(labels = c("No", "Yes"))) %>%
    ggplot() +
    geom_histogram(aes(x = tenure,
                       fill = churn)) +   # churn is already a factor here
    facet_grid( ~ churn) +
    theme(legend.position = "none")
plotTenure

Let's practice!
