Marketing Analytics: Predicting Customer Churn in Python
Mark Peterson
Director of Data Science, Infoblox
telco.dtypes
Account_Length int64
Vmail_Message int64
Day_Mins float64
Eve_Mins float64
Night_Mins float64
Intl_Mins float64
CustServ_Calls int64
Churn object
Intl_Plan object
Vmail_Plan object
Day_Calls int64
Day_Charge float64
Eve_Calls int64
Eve_Charge float64
Night_Calls int64
Night_Charge float64
Intl_Calls int64
Intl_Charge float64
State object
Area_Code int64
Phone object
dtype: object
telco['Intl_Plan'].head()
0 no
1 no
2 no
3 yes
4 yes
Name: Intl_Plan, dtype: object
Option 1: .replace()
telco['Intl_Plan'] = telco['Intl_Plan'].replace({'no': 0, 'yes': 1})
telco['Intl_Plan'].head()
0 0
1 0
2 0
3 1
4 1
Name: Intl_Plan, dtype: int64
Option 2: LabelEncoder()
from sklearn.preprocessing import LabelEncoder
telco['Intl_Plan'] = LabelEncoder().fit_transform(telco['Intl_Plan'])
telco['Intl_Plan'].head()
0 0
1 0
2 0
3 1
4 1
Name: Intl_Plan, dtype: int64
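The two options can be compared side by side on a small stand-in Series. This is a minimal sketch, not the course's own code: the five values are toy data mirroring the head() output above, the names plan, plan_mapped, and plan_encoded are hypothetical, and .map() is used in place of .replace() for the explicit mapping.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for telco['Intl_Plan'] (hypothetical values, not the real data)
plan = pd.Series(['no', 'no', 'no', 'yes', 'yes'], name='Intl_Plan')

# Option 1 (variant): explicit mapping; the result must be assigned to be kept
plan_mapped = plan.map({'no': 0, 'yes': 1})

# Option 2: LabelEncoder sorts the distinct labels and numbers them from 0
encoder = LabelEncoder()
plan_encoded = pd.Series(encoder.fit_transform(plan), name='Intl_Plan')

# classes_ records which label received which integer (position = code)
print(list(encoder.classes_))
print(plan_mapped.tolist())
print(plan_encoded.tolist())
```

Both approaches return a new object rather than modifying the column in place, so the result has to be assigned back to the DataFrame for a later head() call to show the integer codes.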
telco['State'].head(4)
0 KS
1 OH
2 NJ
3 OH
Name: State, dtype: object
State encoded as integers:
0 0
1 1
2 2
3 1
Name: State, dtype: int64
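Integer codes for State suggest an ordering (KS < OH < NJ) that does not exist among states. One common alternative, sketched here as an assumption rather than as the method used in the slides, is one-hot encoding with pandas.get_dummies. The four-row frame is toy data matching the head(4) output above, and the names states and one_hot are hypothetical.

```python
import pandas as pd

# Toy stand-in for telco['State'] (first four rows shown above)
states = pd.DataFrame({'State': ['KS', 'OH', 'NJ', 'OH']})

# One indicator column per distinct state; dtype=int gives 0/1 instead of booleans
one_hot = pd.get_dummies(states['State'], prefix='State', dtype=int)

print(one_hot)
```

Each row has exactly one 1, in the column for its state, so no state is treated as numerically larger than another.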
telco['Intl_Calls'].describe()
count 3333.000000
mean 4.479448
std 2.461214
min 0.000000
25% 3.000000
50% 4.000000
75% 6.000000
max 20.000000
Name: Intl_Calls, dtype: float64
telco['Night_Mins'].describe()
count 3333.000000
mean 200.872037
std 50.573847
min 23.200000
25% 167.000000
50% 201.200000
75% 235.300000
max 395.000000
Name: Night_Mins, dtype: float64
from sklearn.preprocessing import StandardScaler
df = StandardScaler().fit_transform(df)
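The describe() outputs above show features on very different scales (Intl_Calls tops out at 20, Night_Mins at 395). The following is a minimal sketch of standardizing such columns. The four rows are toy values, not the telco data, and the names scaled and scaled_df are hypothetical. Note that fit_transform returns a NumPy array, so the column names are restored by hand here.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy frame with two features on different scales (hypothetical values)
df = pd.DataFrame({
    'Intl_Calls': [3, 4, 6, 20],
    'Night_Mins': [167.0, 201.2, 235.3, 395.0],
})

# fit_transform centers each column to mean 0 and scales it to unit variance;
# the result is a plain NumPy array without column labels
scaled = StandardScaler().fit_transform(df)

# Wrap the array back into a DataFrame to keep the original column names
scaled_df = pd.DataFrame(scaled, columns=df.columns, index=df.index)

print(scaled_df.describe())
```

StandardScaler divides by the population standard deviation (ddof=0), so the std row printed by describe(), which uses ddof=1, will not be exactly 1 on a small sample.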