Feature selection and engineering

Marketing Analytics: Predicting Customer Churn in Python

Mark Peterson

Director of Data Science, Infoblox

Dropping unnecessary features

  • Unique identifiers
    • Phone numbers
    • Social security numbers
    • Account numbers
  • .drop() method (returns a new DataFrame, so assign the result)
telco = telco.drop(['Soc_Sec', 'Tax_ID'], axis=1)
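
As a minimal sketch of the .drop() call above: the toy DataFrame below is hypothetical and only borrows the column names Soc_Sec, Tax_ID and Day_Mins from these slides. It assigns the result back to telco, since .drop() returns a new DataFrame rather than modifying the original.

```python
import pandas as pd

# Hypothetical toy data; all values are made up for illustration.
telco = pd.DataFrame({
    "Soc_Sec": ["000-00-0001", "000-00-0002"],
    "Tax_ID": ["T-1", "T-2"],
    "Day_Mins": [265.1, 161.6],
})

# axis=1 tells pandas to drop columns rather than rows.
telco = telco.drop(["Soc_Sec", "Tax_ID"], axis=1)

print(telco.columns.tolist())
```

If some identifier columns might be absent from the data, passing errors="ignore" to .drop() skips the missing labels instead of raising.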

Dropping correlated features

  • When two features are highly correlated, one of them can be dropped
  • The dropped feature adds little information the model does not already have
telco.corr()

[Figures: output of telco.corr() shown across successive slides (correlation-1, corr_day_mins.png, corr_eve_mins.png, corr_night_mins.png, corr_intl_mins.png, corr_day_charge.png, corr_eve_charge.png, corr_night_charge.png, corr_intl_charge.png)]
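
One possible way to act on a correlation matrix is sketched below. It is not necessarily the approach used in the course: the 0.95 threshold, the toy values and the CustServ_Calls column are assumptions made for illustration. The idea is to look only at the upper triangle of the absolute correlation matrix, so each pair is checked once, and to drop the second feature of any pair above the threshold.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data. Day_Charge is an exact multiple of Day_Mins,
# so the two are perfectly correlated. CustServ_Calls is an assumed column.
telco = pd.DataFrame({
    "Day_Mins": [100.0, 200.0, 150.0, 300.0, 250.0],
    "Day_Charge": [17.0, 34.0, 25.5, 51.0, 42.5],
    "CustServ_Calls": [1, 0, 3, 2, 5],
})

# Absolute pairwise correlations between the numeric columns.
corr = telco.corr().abs()

# Keep only entries above the diagonal so each pair appears once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
upper = corr.where(mask)

# Assumed threshold: flag a column if it correlates above 0.95
# with any column that comes before it.
threshold = 0.95
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]

reduced = telco.drop(columns=to_drop)
print(to_drop)
print(reduced.columns.tolist())
```

Which feature of a pair to keep is a judgment call; this sketch simply keeps whichever column appears first.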


Feature engineering

  • Creating new features to help improve model performance
  • Consult business and subject matter experts when designing new features

Examples of feature engineering

  • Total Minutes: Sum of Day_Mins, Eve_Mins, Night_Mins, Intl_Mins
  • Ratio between Minutes and Charge

 

telco['Day_Cost'] = telco['Day_Mins'] / telco['Day_Charge']
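
The Total Minutes feature listed above could be built in a similar way. The sketch below uses hypothetical toy values, and the column name Total_Mins is an assumption, not a name taken from the course data.

```python
import pandas as pd

# Hypothetical toy data using the minute columns named on the slide.
telco = pd.DataFrame({
    "Day_Mins": [100.0, 200.0],
    "Eve_Mins": [50.0, 80.0],
    "Night_Mins": [30.0, 40.0],
    "Intl_Mins": [10.0, 5.0],
})

minute_cols = ["Day_Mins", "Eve_Mins", "Night_Mins", "Intl_Mins"]

# Row-wise sum across the four minute columns.
# "Total_Mins" is an assumed name for the new feature.
telco["Total_Mins"] = telco[minute_cols].sum(axis=1)

print(telco["Total_Mins"].tolist())
```

Note that sum(axis=1) skips missing values by default, so a row with a missing minute column still gets a total from the remaining columns.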

Let's practice!
