Cox PH model with constant covariates

Machine Learning for Marketing Analytics in R

Verena Pflieger

Data Scientist at INWT Statistics

Model assumptions

Model definition: $\lambda(t \mid x) = \lambda(t) \cdot \exp(x'\beta)$

No particular shape is assumed for the underlying (baseline) hazard $\lambda(t)$

Relative hazard function $\exp(x'\beta)$ is constant over time (proportional hazards)
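
The proportionality assumption can be checked numerically with a small sketch. The baseline hazard and the coefficient below are made up for illustration (they are not taken from the churn model); the point is only that the ratio of two hazards does not depend on time.

```r
# Hypothetical baseline hazard; the Cox model itself leaves its shape unspecified
baseline_hazard <- function(t) 0.02 + 0.001 * t

# Hypothetical coefficient for a single binary covariate x
beta <- -0.7

# Hazard under the Cox PH model: lambda(t | x) = lambda(t) * exp(x * beta)
hazard <- function(t, x) baseline_hazard(t) * exp(x * beta)

months <- 1:72
ratio <- hazard(months, x = 1) / hazard(months, x = 0)

# The baseline cancels, so the ratio equals exp(beta) at every time point
all.equal(ratio, rep(exp(beta), length(months)))
```

Any other positive baseline hazard function would give the same constant ratio, which is why the model does not need to specify it.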

Fitting a survival model

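library(rms)
# Label the time unit of the tenure variable (used on plot axes)
units(dataSurv$tenure) <- "Month"
# Store the variable distributions that rms uses for plots and summaries
dd <- datadist(dataSurv)
options(datadist = "dd")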

fitCPH1 <- cph(Surv(tenure, churn) ~ gender + SeniorCitizen + Partner +
                 Dependents + StreamMov + PaperlessBilling + PayMeth +
                 MonthlyCharges,
               data = dataSurv, x = TRUE, y = TRUE, surv = TRUE, time.inc = 1)
Cox Proportional Hazards Model
  cph(formula = Surv(tenure, churn) ~ gender + ..., data = dataSurv,
  x = TRUE, y = TRUE, surv = TRUE, time.inc = 1)
                      Model Tests        Discrimination   
                                            Indexes        
 Obs       5311    LR chi2    1366.98    R2       0.228    
 Events    1869    d.f.            11    Dxy      0.496    
 Center -0.3964    Pr(> chi2)  0.0000    g        1.125    
                   Score chi2 1355.12    gr       3.082    
                   Pr(> chi2)  0.0000                      

                          Coef    S.E.   Wald Z Pr(>|Z|)
 gender=Male              -0.0326 0.0464  -0.70 0.4817  
 SeniorCitizen=Yes         0.2066 0.0556   3.71 0.0002  
 Partner=Yes              -0.7433 0.0545 -13.65 <0.0001 
 Dependents=Yes           -0.2072 0.0681  -3.04 0.0023  
 StreamMov=NoIntServ      -1.4504 0.1168 -12.41 <0.0001 
 StreamMov=Yes            -0.4139 0.0556  -7.44 <0.0001 
 PaperlessBilling=Yes      0.4056 0.0563   7.21 <0.0001 
 PayMeth=CreditCard(auto) -0.0889 0.0905  -0.98 0.3264  
 ...
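
As a reading aid for the printed output, the sketch below recomputes two of its quantities by hand from the rounded values shown: the Wald Z statistic and p-value for SeniorCitizen=Yes, and the concordance index implied by Dxy. The variable names are illustrative, and small differences from the printed figures are expected because the inputs are rounded.

```r
# Values copied from the printed output (rounded to four decimals)
coef_senior <- 0.2066
se_senior   <- 0.0556

# Wald statistic and two-sided p-value under a standard normal reference
z_senior <- coef_senior / se_senior
p_senior <- 2 * pnorm(-abs(z_senior))

# Concordance index implied by Somers' Dxy: C = 0.5 + Dxy / 2
dxy <- 0.496
c_index <- 0.5 + dxy / 2

round(c(z = z_senior, p = p_senior, C = c_index), 4)
```

The Z value should land close to the printed 3.71, and the concordance index gives the share of comparable customer pairs that the model ranks in the right order.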

Interpretation of coefficients

exp(fitCPH1$coefficients)
             gender=Male        SeniorCitizen=Yes 
               0.9679156                1.2294357 
             Partner=Yes           Dependents=Yes 
               0.4755412                0.8128759 
     StreamMov=NoIntServ            StreamMov=Yes 
               0.2344695                0.6610708 
    PaperlessBilling=Yes PayMeth=CreditCard(auto) 
               1.5001646                0.9149822 
      PayMeth=ElektCheck      PayMeth=MailedCheck 
               3.1168997                2.1814381 
          MonthlyCharges 
               0.9942395
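
One way to read these exponentiated coefficients is as hazard ratios: values below 1 mean a lower churn hazard than the reference level, values above 1 a higher one, with the other covariates held fixed. The sketch below converts three of the printed ratios into percentage changes; the names `hr` and `pct_change` are illustrative only.

```r
# Hazard ratios copied from the output above
hr <- c(Partner = 0.4755412,
        PaperlessBilling = 1.5001646,
        MonthlyCharges = 0.9942395)

# Percentage change in the churn hazard relative to the reference level
# (for the numeric MonthlyCharges: per one-unit increase)
pct_change <- (hr - 1) * 100
round(pct_change, 1)

# Effect of a 10-unit increase in MonthlyCharges: raise the ratio to the 10th power
hr_charges_10 <- unname(hr["MonthlyCharges"]^10)
round(hr_charges_10, 3)
```

Customers with a partner therefore have roughly half the churn hazard of customers without one, while the per-unit effect of MonthlyCharges only becomes visible over larger differences in charges.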

Survival probabilities by MonthlyCharges

survplot(fitCPH1, MonthlyCharges, label.curves = list(keys = 1:5))

Survival probabilities by Partner

survplot(fitCPH1, Partner)
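
Under proportional hazards the survival curves of two groups are linked by $S(t \mid x) = S_0(t)^{\exp(x'\beta)}$, so they cannot cross. The sketch below illustrates this with a made-up baseline survival curve for customers without a partner (an assumption, not the fitted baseline) and the Partner hazard ratio from the coefficient output.

```r
# Hypothetical baseline survival for customers without a partner (months 0 to 72)
months <- 0:72
surv_no_partner <- exp(-0.01 * months)

# Hazard ratio for Partner = Yes, copied from the coefficient output
hr_partner <- 0.4755412

# Under proportional hazards: S(t | x) = S0(t)^exp(x'beta)
surv_partner <- surv_no_partner^hr_partner

# Compare both groups at months 0, 12, 36 and 72
comparison <- data.frame(month = months,
                         no_partner = surv_no_partner,
                         partner = surv_partner)
comparison[c(1, 13, 37, 73), ]
```

Because the hazard ratio is below 1, the curve for customers with a partner lies on or above the other curve at every month.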

Visualization of hazard ratios

plot(summary(fitCPH1), log = TRUE)
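
To make the plotted quantities concrete, the sketch below computes a hazard ratio with a 95% Wald confidence interval by hand for Partner=Yes, using the rounded coefficient and standard error from the model output. This is an assumption about what a single bar in such a plot represents; the rms plot may draw several confidence levels at once.

```r
# Coefficient and standard error for Partner = Yes, copied from the model output
coef_partner <- -0.7433
se_partner   <- 0.0545

# Wald 95% interval on the log scale, then exponentiated to the hazard-ratio scale
z_crit <- qnorm(0.975)
log_ci <- coef_partner + c(-1, 1) * z_crit * se_partner
hr_ci  <- exp(log_ci)

round(c(HR = exp(coef_partner), lower = hr_ci[1], upper = hr_ci[2]), 3)
```

The interval is symmetric around the estimate on the log scale, which is one reason to draw hazard ratios on a logarithmic axis. An interval that stays entirely below 1 indicates a lower churn hazard for that group.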

Let's practice!
