Evaluating different values of K by eye

Cluster Analysis in R

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

Total within-cluster sum of squares: k = 1

Total within-cluster sum of squares: k = 2

Total within-cluster sum of squares: k = 3

Total within-cluster sum of squares: k = 4
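
The quantity tracked on these slides, the total within-cluster sum of squares, adds up the squared Euclidean distance from every observation to the centroid of its assigned cluster. A minimal sketch that recomputes it by hand and checks it against the `tot.withinss` value returned by `kmeans()`. The data frame `toy` is a made-up stand-in, not the course's `lineup` data.

```r
# Hypothetical toy data (an assumption for illustration): two groups of 2-D points
set.seed(42)
toy <- data.frame(
  x = c(rnorm(10, mean = 0), rnorm(10, mean = 5)),
  y = c(rnorm(10, mean = 0), rnorm(10, mean = 5))
)

model <- kmeans(x = toy, centers = 2)

# For each observation, look up the centroid of the cluster it was assigned to
assigned_centers <- model$centers[model$cluster, , drop = FALSE]

# Sum of squared distances from every point to its own centroid
manual_tot_withinss <- sum((as.matrix(toy) - assigned_centers)^2)

# Compare the hand-computed value with the one kmeans() reports
all.equal(manual_tot_withinss, model$tot.withinss)
```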

Elbow plot

Generating the elbow plot

model <- kmeans(x = lineup, centers = 2)
model$tot.withinss
[1] 1434.5
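
`kmeans()` chooses its starting centers at random, so `tot.withinss` for a given k can differ from run to run. A hedged sketch of one way to make the value repeatable, assuming reproducibility is wanted: fix the seed and let `kmeans()` try several random starts through its `nstart` argument. The data frame `toy`, the helper `fit_once()`, the seed, and `nstart = 20` are illustrative assumptions, not part of the original slides.

```r
# Hypothetical stand-in data; the course's lineup data is not reproduced here
set.seed(1)
toy <- data.frame(x = rnorm(30), y = rnorm(30))

# Hypothetical helper: fit k-means once and return the total within-cluster SS
fit_once <- function(data, k) {
  set.seed(123)  # same seed, so the same random starting centers are drawn
  # nstart = 20 runs 20 random starts and keeps the fit with the lowest tot.withinss
  kmeans(x = data, centers = k, nstart = 20)$tot.withinss
}

# Two calls with the same seed return the same value
identical(fit_once(toy, 2), fit_once(toy, 2))
```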

Generating the elbow plot

library(purrr)

tot_withinss <- map_dbl(1:10,  function(k){
  model <- kmeans(x = lineup, centers = k)
  model$tot.withinss
})

elbow_df <- data.frame(
  k = 1:10,
  tot_withinss = tot_withinss
)
print(elbow_df)
    k tot_withinss
1   1    3489.9167
2   2    1434.5000
3   3     881.2500
4   4     637.2500
... ...   ...
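
When reading the elbow by eye, it can help to look at how much each added cluster lowers the total within-cluster sum of squares. A small sketch using only the four values printed above (the remaining rows are elided in the output and are not reconstructed here). The names `shown`, `drop`, `from_k`, and `to_k` are illustrative.

```r
# First four tot_withinss values, copied from the printed elbow_df above
shown <- c(3489.9167, 1434.5000, 881.2500, 637.2500)

# Reduction in total within-cluster sum of squares for each added cluster
drop <- -diff(shown)

data.frame(from_k = 1:3, to_k = 2:4, drop = drop)
```

For these values the reductions are about 2055.4, 553.3, and 244.0, so the largest drop comes from moving from k = 1 to k = 2.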

Generating the elbow plot

library(ggplot2)

# Plot total within-cluster sum of squares against k, with one axis tick per k
ggplot(elbow_df, aes(x = k, y = tot_withinss)) +
  geom_line() +
  scale_x_continuous(breaks = 1:10)

Let's practice!
