Cluster Analysis in Python
Shaumik Daityari
Business Analyst
# Imports needed by this example
from scipy.cluster.vq import kmeans
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Declaring variables for use
# (df is assumed to already hold the scaled_x and scaled_y columns)
distortions = []
num_clusters = range(2, 7)

# Populating distortions for each number of clusters
for i in num_clusters:
    centroids, distortion = kmeans(df[['scaled_x', 'scaled_y']], i)
    distortions.append(distortion)

# Plotting elbow plot data
elbow_plot_data = pd.DataFrame({'num_clusters': num_clusters,
                                'distortions': distortions})
sns.lineplot(x='num_clusters', y='distortions',
             data=elbow_plot_data)
plt.show()
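
To try the same loop without the course's df, the following is a minimal sketch on synthetic data. The three Gaussian blobs, their centers and sizes, and the names points and scaled are illustrative assumptions, not part of the original slides. The sketch scales the data with whiten, then records the distortion that kmeans reports for each candidate number of clusters.

```python
import numpy as np
from scipy.cluster.vq import kmeans, whiten

# Hypothetical data: three well-separated 2-D blobs (not from the slides)
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Rescale each feature to unit variance before clustering
scaled = whiten(points)

num_clusters = range(1, 7)
distortions = []
for k in num_clusters:
    # kmeans returns the centroids and the mean distance between
    # each observation and its nearest centroid (the distortion)
    centroids, distortion = kmeans(scaled, k)
    distortions.append(distortion)

# Print the values that an elbow plot would display
for k, d in zip(num_clusters, distortions):
    print(k, round(float(d), 3))
```

With data like this, the distortion should fall steeply up to the true number of blobs and flatten afterwards, which is the "elbow" the plot is meant to reveal. Exact values vary between runs because kmeans starts from random centroids.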