Making sense of the K-means clusters

Cluster Analysis in R

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

Wholesale dataset

  • 45 observations
  • 3 features:
    • Milk Spending
    • Grocery Spending
    • Frozen Food Spending
print(customers_spend)
    Milk Grocery Frozen
1  11103   12469    902
2   2013    6550    909
3   1897    5234    417
4   1304    3643   3045
5   3199    6986   1455
...  ...     ...    ...

Segmenting with hierarchical clustering

cluster   Milk  Grocery  Frozen  cluster size
      1  16950    12891     991             5
      2   2512     5228    1795            29
      3  10452    22550    1354             5
      4   1249     3916   10888             6
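A summary like the one above (mean spending per cluster plus cluster size) could be produced roughly as follows. This is only a sketch: the data frame here is simulated as a stand-in for customers_spend, and the Euclidean distance and complete linkage are assumptions, since the slide does not state which were used. Only k = 4 is taken from the table.

```r
# Hypothetical stand-in for customers_spend (45 simulated customers);
# the real wholesale data is not reproduced here.
set.seed(1)
customers_spend <- data.frame(
  Milk    = round(runif(45, 500, 18000)),
  Grocery = round(runif(45, 2000, 25000)),
  Frozen  = round(runif(45, 300, 12000))
)

# Pairwise Euclidean distances between customers (assumed metric)
dist_customers <- dist(customers_spend)

# Hierarchical clustering; complete linkage is an assumption
hc_customers <- hclust(dist_customers, method = "complete")

# Cut the tree into 4 clusters, matching the table above
clust_customers <- cutree(hc_customers, k = 4)

# Mean spending per cluster, plus the number of customers in each
segment_means <- aggregate(customers_spend,
                           by = list(cluster = clust_customers),
                           FUN = mean)
segment_means$size <- as.vector(table(clust_customers))
print(segment_means)
```

Because the data is simulated, the printed means will not match the figures on the slide; only the shape of the summary is comparable.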

Segmenting with K-means

  • Estimate the "best" k using average silhouette width
  • Run k-means with the suggested k
  • Characterize the spending habits of these clusters of customers
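The three steps above can be sketched as below. This is a minimal illustration, not the course solution: the data frame is simulated as a stand-in for customers_spend, the range of k values (2 to 6) and nstart = 25 are arbitrary choices, and silhouette() from the cluster package is one of several ways to obtain the average silhouette width.

```r
library(cluster)  # provides silhouette()

# Hypothetical stand-in for customers_spend (45 simulated customers)
set.seed(42)
customers_spend <- data.frame(
  Milk    = round(c(rnorm(15, 2000, 400), rnorm(15, 11000, 1500), rnorm(15, 1500, 300))),
  Grocery = round(c(rnorm(15, 5000, 800), rnorm(15, 15000, 2000), rnorm(15, 4000, 600))),
  Frozen  = round(c(rnorm(15, 1200, 300), rnorm(15, 1000, 250), rnorm(15, 9000, 1200)))
)

# Step 1: average silhouette width for each candidate k
dist_customers <- dist(customers_spend)
ks <- 2:6  # assumed search range
sil_width <- sapply(ks, function(k) {
  km <- kmeans(customers_spend, centers = k, nstart = 25)
  mean(silhouette(km$cluster, dist_customers)[, "sil_width"])
})
best_k <- ks[which.max(sil_width)]

# Step 2: run k-means with the suggested k
model <- kmeans(customers_spend, centers = best_k, nstart = 25)

# Step 3: characterize each cluster by mean spending and size
segment <- factor(model$cluster)
summary_tbl <- aggregate(customers_spend,
                         by = list(cluster = segment),
                         FUN = mean)
summary_tbl$size <- as.vector(table(segment))
print(summary_tbl)
```

The k with the highest average silhouette width is only a suggestion; it is worth checking whether the resulting clusters describe spending habits that make sense before settling on it.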

Let's cluster!
