Machine Learning for Marketing Analytics in R
Verena Pflieger
Data Scientist at INWT Statistics
1) Probability to churn
$$ P(Y=1)$$
2) Log odds
$$ \log \frac{P(Y=1)}{P(Y=0)} = \beta_0 + \sum\limits_{p=1}^P \beta_p x_p $$
3) Odds
$$ \frac{P(Y=1)}{P(Y=0)} = e^Z, \quad \text{with } Z = \beta_0 + \sum\limits_{p=1}^P \beta_p x_p $$
4) Probability to churn
$$ P(Y=1) = \frac{e^Z}{1 + e^Z} $$
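The four steps above can be traced numerically. The following is a minimal sketch in base R; the coefficients `beta0` and `beta` and the observation `x` are made-up values for illustration only, not estimates from the churn data.

```r
# Hypothetical coefficients and a single observation (illustration only)
beta0 <- -1.5
beta  <- c(0.8, -0.4)
x     <- c(1, 2)

# Log odds: Z = beta_0 + sum over p of beta_p * x_p
z <- beta0 + sum(beta * x)

# Odds: P(Y=1) / P(Y=0) = e^Z
odds <- exp(z)

# Probability to churn: P(Y=1) = e^Z / (1 + e^Z)
p <- odds / (1 + odds)

# Base R's plogis() is the same logistic function, so the two should agree
all.equal(p, plogis(z))

# Going back from the probability recovers the log odds
log(p / (1 - p))
```

Working on the log-odds scale keeps the model linear in the predictors, while the last transformation guarantees a probability between 0 and 1.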
## 'data.frame': 45236 obs. of 21 variables:
## $ ID : Factor w/ 45236 levels "1","3","5","7",..
## $ orderDate : Date, format: "2014-12-23" "2014-09-10" ....
## $ title : Factor w/ 4 levels "Mr","Company",..: 1 1 1 ...
## $ newsletter : Factor w/ 2 levels "No","Yes": 0 0 0 1 ...
## $ websiteDesign : Factor w/ 3 levels "1","2","3": 2 1 1 3 ...
## $ paymentMethod : Factor w/ 4 levels "Cash","Credit Card",..: 3 4 ...
## $ couponDiscount: Factor w/ 2 levels "No","Yes": 1 0 0 0 0 1 ...
...
## $ returnCustomer: Factor w/ 2 levels "No","Yes": 0 0 0 0 ...
library(ggplot2)
# Bar chart of how many customers fall into each class of the target variable
ggplot(churnData, aes(x = returnCustomer)) +
  geom_bar()
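The class balance shown in the plot can also be read off as numbers. This is a small sketch using a toy data frame `toy` with made-up values as a stand-in for `churnData`; only the column name `returnCustomer` and its levels are taken from the structure listed above.

```r
# Toy stand-in for churnData (hypothetical values, not the course data)
toy <- data.frame(
  returnCustomer = factor(c("No", "No", "No", "Yes", "No"),
                          levels = c("No", "Yes"))
)

# Absolute counts per class
counts <- table(toy$returnCustomer)

# Relative shares per class
shares <- prop.table(counts)

counts
shares
```

With the real data, replacing `toy` by `churnData` should give the counts behind the bars.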