Machine Learning for Marketing Analytics in R
Verena Pflieger
Data Scientist at INWT Statistics
Model definition: $\lambda(t \mid x) = \lambda(t)\,\exp(x'\beta)$
No particular shape is assumed for the underlying (baseline) hazard $\lambda(t)$
The relative hazard $\exp(x'\beta)$ is constant over time (proportional hazards)
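To see why the relative hazard does not depend on time, one can take the ratio of the hazards of two customers. The covariate vectors $x_1$ and $x_2$ are symbols introduced here for illustration only; the underlying hazard $\lambda(t)$ cancels:

```latex
\frac{\lambda(t \mid x_1)}{\lambda(t \mid x_2)}
  = \frac{\lambda(t)\,\exp(x_1'\beta)}{\lambda(t)\,\exp(x_2'\beta)}
  = \exp\bigl((x_1 - x_2)'\beta\bigr)
```

The right-hand side contains no $t$, so under the model the two hazards stay proportional at every point in time.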
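library(rms)
# Label the time unit so that plots and summaries show "Month"
units(dataSurv$tenure) <- "Month"
# rms needs the distribution of the predictors for summary() and survplot()
dd <- datadist(dataSurv)
options(datadist = "dd")
# Fit the Cox PH model; x, y and surv are stored for later plotting
fitCPH1 <- cph(Surv(tenure, churn) ~ gender + SeniorCitizen + Partner + Dependents +
                 StreamMov + PaperlessBilling + PayMeth + MonthlyCharges,
               data = dataSurv, x = TRUE, y = TRUE, surv = TRUE, time.inc = 1)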
Cox Proportional Hazards Model
cph(formula = Surv(tenure, churn) ~ gender + ..., data = dataSurv,
    x = TRUE, y = TRUE, surv = TRUE, time.inc = 1)

                    Model Tests         Discrimination
                                           Indexes
Obs       5311    LR chi2     1366.98    R2       0.228
Events    1869    d.f.             11    Dxy      0.496
Center -0.3964    Pr(> chi2)   0.0000    g        1.125
                  Score chi2  1355.12    gr       3.082
                  Pr(> chi2)   0.0000

                         Coef    S.E.   Wald Z Pr(>|Z|)
gender=Male              -0.0326 0.0464  -0.70 0.4817
SeniorCitizen=Yes         0.2066 0.0556   3.71 0.0002
Partner=Yes              -0.7433 0.0545 -13.65 <0.0001
Dependents=Yes           -0.2072 0.0681  -3.04 0.0023
StreamMov=NoIntServ      -1.4504 0.1168 -12.41 <0.0001
StreamMov=Yes            -0.4139 0.0556  -7.44 <0.0001
PaperlessBilling=Yes      0.4056 0.0563   7.21 <0.0001
PayMeth=CreditCard(auto) -0.0889 0.0905  -0.98 0.3264
...
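The Wald Z and Pr(>|Z|) columns can be reproduced approximately by hand. The sketch below is only an illustration: it copies three rounded coefficients and standard errors from the printed output, so the results can differ from the table in the last digit. The names coefs, ses, waldZ and pValue are hypothetical helpers, not part of the course code.

```r
# Coefficients and standard errors copied (rounded) from the printed cph output
coefs <- c("SeniorCitizen=Yes" = 0.2066,
           "Partner=Yes"       = -0.7433,
           "Dependents=Yes"    = -0.2072)
ses   <- c(0.0556, 0.0545, 0.0681)

# Wald statistic: estimate divided by its standard error
waldZ <- coefs / ses

# Two-sided p-value from the standard normal distribution
pValue <- 2 * pnorm(-abs(waldZ))

# Side-by-side view, comparable to the coefficient table
print(round(cbind(Coef = coefs, S.E. = ses, WaldZ = waldZ, p = pValue), 4))
```

A large absolute Wald Z (and hence a small p-value) indicates that the coefficient differs clearly from zero, as for Partner=Yes; gender=Male, with a Z of -0.70 in the table, does not.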
exp(fitCPH1$coefficients)
             gender=Male        SeniorCitizen=Yes
               0.9679156                1.2294357
             Partner=Yes           Dependents=Yes
               0.4755412                0.8128759
     StreamMov=NoIntServ            StreamMov=Yes
               0.2344695                0.6610708
    PaperlessBilling=Yes PayMeth=CreditCard(auto)
               1.5001646                0.9149822
      PayMeth=ElektCheck      PayMeth=MailedCheck
               3.1168997                2.1814381
          MonthlyCharges
               0.9942395
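The exponentiated coefficients are hazard ratios, so they can be read as a percentage change in the churn hazard relative to the reference category. The following sketch, using values copied from the output above, shows one way to do the conversion. The variable names (coefs, hazardRatio, pctChange, hrPerUnit, hrPer10) are made up for this illustration.

```r
# Log hazard ratios copied (rounded) from the printed model output
coefs <- c("Partner=Yes" = -0.7433, "PaperlessBilling=Yes" = 0.4056)

# Hazard ratio and percentage change in hazard versus the reference level
hazardRatio <- exp(coefs)
pctChange   <- (hazardRatio - 1) * 100
print(round(pctChange, 1))

# For a continuous predictor the ratio applies per one-unit increase;
# a 10-unit increase in MonthlyCharges multiplies the hazard by HR^10
hrPerUnit <- 0.9942395
hrPer10   <- hrPerUnit^10
print(round(hrPer10, 3))
```

With these numbers, customers with a partner have a churn hazard roughly 52% lower than customers without one, while paperless billing is associated with a hazard about 50% higher. Ten additional units of MonthlyCharges correspond to a hazard ratio of roughly 0.94.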
survplot(fitCPH1, MonthlyCharges, label.curves = list(keys = 1:5))
survplot(fitCPH1, Partner)
plot(summary(fitCPH1), log = TRUE)