Cluster Analysis in R
Dmitriy (Dima) Gorenshteyn
Lead Data Scientist, Memorial Sloan Kettering Cancer Center
model <- kmeans(x = lineup, centers = 2)
model$tot.withinss
[1] 1434.5
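As a rough illustration of what tot.withinss measures: it is the sum of the per-cluster within-cluster sums of squares (withinss). The sketch below assumes a small synthetic data set, `toy`, as a hypothetical stand-in for `lineup`, which is not defined on this slide; `nstart = 10` is likewise an added assumption to make the fit less dependent on the random start.

```r
set.seed(42)

# Hypothetical stand-in for `lineup`: two well-separated groups of 2-D points
toy <- data.frame(
  x = c(rnorm(6, mean = 0), rnorm(6, mean = 10)),
  y = c(rnorm(6, mean = 0), rnorm(6, mean = 10))
)

# Fit k-means with two centers, trying 10 random starts
model_toy <- kmeans(x = toy, centers = 2, nstart = 10)

# One within-cluster sum of squares per cluster
model_toy$withinss

# tot.withinss should equal the sum of the per-cluster values
all.equal(model_toy$tot.withinss, sum(model_toy$withinss))
```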
library(purrr)

# Fit k-means for k = 1..10 and keep the total within-cluster sum of squares
tot_withinss <- map_dbl(1:10, function(k) {
  model <- kmeans(x = lineup, centers = k)
  model$tot.withinss
})
elbow_df <- data.frame(
  k = 1:10,
  tot_withinss = tot_withinss
)

print(elbow_df)
    k tot_withinss
1   1    3489.9167
2   2    1434.5000
3   3     881.2500
4   4     637.2500
... ...        ...
library(ggplot2)

ggplot(elbow_df, aes(x = k, y = tot_withinss)) +
  geom_line() +
  scale_x_continuous(breaks = 1:10)
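The elbow can also be read numerically by looking at how much tot_withinss drops with each added cluster. This is only a sketch: it uses the four values printed in elbow_df above (the remaining rows are elided in the source and are not filled in here), and the names `tot_withinss_shown` and `drops` are hypothetical.

```r
# First four values copied from the printed elbow_df; rows 5..10 are elided in the source
tot_withinss_shown <- c(3489.9167, 1434.5000, 881.2500, 637.2500)

# Reduction in tot_withinss gained by each additional cluster
drops <- -diff(tot_withinss_shown)

data.frame(k_from = 1:3, k_to = 2:4, drop = drops)
```

Among the values shown, the step from k = 1 to k = 2 gives by far the largest drop, and later steps gain much less, which is the flattening the line plot is meant to reveal.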