Build customer and product segmentation

Machine Learning for Marketing in Python

Karolis Urbonas

Head of Analytics & Science, Amazon

Segmentation steps with K-means

Segmentation with K-means (for k number of clusters):

from sklearn.cluster import KMeans

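# Fit K-means with k clusters on the scaled data
kmeans = KMeans(n_clusters=k)
kmeans.fit(wholesale_scaled_df)
# Attach each customer's cluster label to the original (unscaled) data
wholesale_kmeans4 = wholesale.assign(segment=kmeans.labels_)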
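
A self-contained sketch of the same steps, for readers who want to run them end to end. The `wholesale` table here is randomly generated stand-in data with hypothetical column names, and the log transform plus `StandardScaler` is only one assumed way to produce `wholesale_scaled_df`; the slide does not show that step.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wholesale spend data (hypothetical columns and values)
rng = np.random.default_rng(0)
wholesale = pd.DataFrame(
    rng.lognormal(mean=8, sigma=1, size=(200, 3)),
    columns=['Fresh', 'Grocery', 'Frozen'],
)

# Assumed preprocessing: log-transform to reduce skew, then standardize each column
wholesale_log = np.log(wholesale)
wholesale_scaled_df = pd.DataFrame(
    StandardScaler().fit_transform(wholesale_log),
    index=wholesale.index,
    columns=wholesale.columns,
)

k = 4
kmeans = KMeans(n_clusters=k, n_init=10, random_state=333)
kmeans.fit(wholesale_scaled_df)

# Labels come from the scaled data but are attached to the original values,
# so segment averages stay in interpretable units
wholesale_kmeans4 = wholesale.assign(segment=kmeans.labels_)
print(wholesale_kmeans4.groupby('segment').mean().round(0))
```

Grouping by `segment` and averaging the unscaled columns gives a quick profile of each segment.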

Segmentation steps with NMF

Segmentation with NMF (k number of clusters):

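import pandas as pd
from sklearn.decomposition import NMF

# Fit NMF with k components; the input must be non-negative
nmf = NMF(k)
nmf.fit(wholesale)
# One row per component (segment), one column per original feature
components = pd.DataFrame(nmf.components_, columns=wholesale.columns)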

Extracting segment assignment:

segment_weights = pd.DataFrame(nmf.transform(wholesale), columns=components.index)
segment_weights.index = wholesale.index
# Assign each customer to the segment with the largest weight
wholesale_nmf = wholesale.assign(segment=segment_weights.idxmax(axis=1))
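
The NMF steps can be tried end to end with the sketch below. It runs on randomly generated stand-in data with hypothetical column names; the `init`, `max_iter`, and `random_state` arguments are illustrative choices to make the run repeatable, not settings taken from the slides.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import NMF

# Synthetic, strictly positive stand-in for the wholesale data (hypothetical columns)
rng = np.random.default_rng(1)
wholesale = pd.DataFrame(
    rng.lognormal(mean=8, sigma=1, size=(200, 4)),
    columns=['Fresh', 'Grocery', 'Frozen', 'Detergents'],
)

k = 3
nmf = NMF(n_components=k, init='nndsvda', max_iter=1000, random_state=333)
nmf.fit(wholesale)

# Rows are segments, columns are the original features
components = pd.DataFrame(nmf.components_, columns=wholesale.columns)

# Per-customer weight on each segment; the largest weight decides the assignment
segment_weights = pd.DataFrame(
    nmf.transform(wholesale),
    columns=components.index,
    index=wholesale.index,
)
wholesale_nmf = wholesale.assign(segment=segment_weights.idxmax(axis=1))
print(wholesale_nmf['segment'].value_counts())
```

Unlike K-means, NMF gives every customer a weight on every segment, so taking `idxmax` is a simplification that keeps only the dominant one.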

How to initialize the number of segments?

  • Both K-means and NMF require you to set the number of clusters (k) up front
  • Two ways to define k: 1) Mathematically, 2) Test & learn
  • We'll explore the mathematical elbow criterion method to get a ballpark estimate

Elbow criterion method

  • Iterate through a number of k values
  • Run clustering for each on the same data
  • Calculate sum of squared errors (SSE) for each
  • Plot SSE against k and identify the "elbow": the point beyond which adding clusters yields diminishing improvements in error reduction
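
To make the SSE in these steps concrete, the sketch below computes it by hand as the sum of squared distances from each point to its assigned cluster center, and checks that it matches the `inertia_` attribute that scikit-learn's `KMeans` reports. The data is random and purely illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data, used only to illustrate the calculation
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 3))

sse = {}
for k in range(1, 7):
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=333).fit(X)
    # Center assigned to each point, looked up via its cluster label
    assigned_centers = kmeans.cluster_centers_[kmeans.labels_]
    manual_sse = float(((X - assigned_centers) ** 2).sum())
    # The manual value should agree with scikit-learn's inertia_
    assert np.isclose(manual_sse, kmeans.inertia_)
    sse[k] = manual_sse

for k, value in sse.items():
    print(k, round(value, 1))
```

With k = 1 the single center is the overall mean, so the SSE equals the total squared deviation from the mean; every larger k can only be judged relative to that baseline.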

Calculate sum of squared errors and plot the results

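import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans

sse = {}
for k in range(1, 11):
    kmeans = KMeans(n_clusters=k, random_state=333)
    kmeans.fit(wholesale_scaled_df)
    # inertia_ is the sum of squared distances to the nearest cluster center
    sse[k] = kmeans.inertia_
plt.title('Elbow criterion method chart')
sns.pointplot(x=list(sse.keys()), y=list(sse.values()))
plt.show()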
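
As a rough complement to the chart, the elbow can also be read from the numbers by looking at how much SSE drops with each additional cluster. The sketch below does this on synthetic data with three well-separated groups (an assumption chosen so the elbow is obvious); the percentage-drop table is an illustrative heuristic, not a method from the slides.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups (not the wholesale data)
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

sse = {}
for k in range(1, 8):
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=333).fit(X)
    sse[k] = kmeans.inertia_

sse_series = pd.Series(sse, name='sse')
# Percentage reduction in SSE relative to the previous k (NaN for the first k)
pct_drop = (-sse_series.pct_change() * 100).round(1)
print(pd.DataFrame({'sse': sse_series.round(1), 'pct_drop_vs_previous_k': pct_drop}))
```

On data like this the drop is large up to the true number of groups and much smaller afterwards; on real data the change is usually more gradual, which is why the result is only a starting estimate.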

Identifying the optimal number of segments

[Figure: elbow criterion method chart, SSE plotted against the number of clusters k]


Test & learn method

  • First, calculate the mathematically optimal number of segments
  • Build segmentations with several values of k around that optimum
  • Explore the results and choose the one with the most business relevance (Can you name the segments? Are they ambiguous or overlapping?)
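
A minimal sketch of the test & learn loop, assuming the elbow suggested k = 4 (the value is made up for illustration). It builds a K-means segmentation for each neighboring k on synthetic stand-in data with hypothetical column names and stores per-segment averages, which can then be compared for business relevance.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wholesale data (hypothetical columns and values)
rng = np.random.default_rng(2)
wholesale = pd.DataFrame(
    rng.lognormal(mean=8, sigma=1, size=(300, 3)),
    columns=['Fresh', 'Grocery', 'Frozen'],
)
# Assumed preprocessing: log-transform, then standardize
wholesale_scaled = StandardScaler().fit_transform(np.log(wholesale))

optimal_k = 4  # assumed elbow estimate, for illustration only
segment_profiles = {}
for k in range(optimal_k - 1, optimal_k + 2):
    labels = KMeans(n_clusters=k, n_init=10, random_state=333).fit_predict(wholesale_scaled)
    # Average spend per segment, in original units, for side-by-side review
    segment_profiles[k] = wholesale.assign(segment=labels).groupby('segment').mean().round(0)

for k, profile in segment_profiles.items():
    print(f'k = {k}')
    print(profile)
```

Each printed table is one candidate segmentation; the choice between them rests on whether the segments can be named and told apart, not on the SSE alone.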

Let's build customer segments!

