Customer Segmentation in Python
Karolis Urbonas
Head of Data Science, Amazon
Behavioral customer segmentation based on three metrics: Recency (R), Frequency (F), and Monetary Value (M).
The RFM values can be grouped in several ways:
We are going to implement percentile-based grouping.
Process of calculating percentiles:
Data with eight CustomerIDs and randomly generated Spend values.
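import pandas as pd
# Assign each customer to a spend quartile (1 = lowest spend, 4 = highest)
spend_quartiles = pd.qcut(data['Spend'], q=4, labels=range(1,5))
# Create new column
data['Spend_Quartile'] = spend_quartiles
# Sort spend values from lowest to highest
data.sort_values('Spend')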
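The snippet above assumes a DataFrame named data already exists. As a minimal self-contained sketch, the version below builds a hypothetical eight-customer table (the Spend values are made up for illustration, not taken from the course data) and applies the same pd.qcut call.

```python
import pandas as pd

# Hypothetical data: eight customers with made-up spend values
data = pd.DataFrame({
    'CustomerID': range(1, 9),
    'Spend': [137, 335, 172, 355, 303, 233, 244, 229],
})

# Split Spend into four equal-sized groups; 1 = lowest spend, 4 = highest
data['Spend_Quartile'] = pd.qcut(data['Spend'], q=4, labels=list(range(1, 5)))

# Sort by spend so the quartile labels appear in ascending order
sorted_data = data.sort_values('Spend')
print(sorted_data)
```

With eight distinct values and q=4, each quartile should contain exactly two customers.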
# Create numbered labels
r_labels = list(range(4, 0, -1))
# Divide into groups based on quartiles
recency_quartiles = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)
# Create new column
data['Recency_Quartile'] = recency_quartiles
# Sort recency values from lowest to highest
data.sort_values('Recency_Days')
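As you can see, the quartile labels are reversed because more recent customers are more valuable: the lowest Recency_Days values receive the highest label (4), and the highest values receive the lowest label (1).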
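To check the effect of the reversed labels, here is a small sketch on hypothetical data (the Recency_Days values are invented for illustration). It looks up the quartile label of the most recent and the least recent customer.

```python
import pandas as pd

# Hypothetical data: days since each customer's last purchase (made-up values)
data = pd.DataFrame({
    'CustomerID': range(1, 9),
    'Recency_Days': [37, 235, 396, 72, 255, 393, 203, 133],
})

# Reversed labels: the first (lowest-recency) bin gets 4, the last bin gets 1
r_labels = list(range(4, 0, -1))
data['Recency_Quartile'] = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)

# Label of the customer who purchased most recently (fewest days)
most_recent = data.loc[data['Recency_Days'].idxmin(), 'Recency_Quartile']
# Label of the customer who purchased longest ago (most days)
least_recent = data.loc[data['Recency_Days'].idxmax(), 'Recency_Quartile']
print(most_recent, least_recent)
```

pd.qcut assigns labels in bin order, from the lowest values to the highest, so passing the labels in descending order is what gives the most recent customers the top score.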
The labels can also be a list of strings or any other values, depending on the use case.
# Create string labels
r_labels = ['Active', 'Lapsed', 'Inactive', 'Churned']
# Divide into groups based on quartiles
recency_quartiles = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)
# Create new column
data['Recency_Quartile'] = recency_quartiles
# Sort values from lowest to highest
data.sort_values('Recency_Days')
Custom labels assigned to each quartile
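As a rough sketch of how the named segments might be inspected afterwards, the example below applies the string labels to hypothetical recency data (invented values) and summarizes the range of Recency_Days in each segment. The summary step is an illustrative addition, not part of the original slides.

```python
import pandas as pd

# Hypothetical data: days since each customer's last purchase (made-up values)
data = pd.DataFrame({
    'CustomerID': range(1, 9),
    'Recency_Days': [37, 235, 396, 72, 255, 393, 203, 133],
})

# String labels, ordered from the lowest recency bin to the highest
r_labels = ['Active', 'Lapsed', 'Inactive', 'Churned']
data['Recency_Quartile'] = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)

# Minimum, maximum, and number of customers per named segment
summary = (
    data.groupby('Recency_Quartile', observed=True)['Recency_Days']
        .agg(['min', 'max', 'count'])
)
print(summary)
```

Because the labels follow bin order, 'Active' covers the smallest Recency_Days values and 'Churned' the largest.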