Cluster Analysis in Python
Shaumik Daityari
Business Analyst
scipy.cluster.hierarchy.linkage(observations,
    method='single',
    metric='euclidean',
    optimal_ordering=False
)
method: how to calculate the proximity of clusters
metric: distance metric
optimal_ordering: whether to reorder the data points

'single': based on two closest objects
'complete': based on two farthest objects
'average': based on the arithmetic mean of all objects
'centroid': based on the distance between cluster centroids
'median': based on the median of all objects
'ward': based on the sum of squares

scipy.cluster.hierarchy.fcluster(distance_matrix,
    num_clusters,
    criterion
)

distance_matrix: output of the linkage() function
num_clusters: number of clusters
criterion: how to decide thresholds to form clusters
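
The two calls above can be chained as in the following minimal sketch. The toy data and the variable name cluster_labels are assumptions made for illustration; 'ward' and 'maxclust' are just one possible choice of method and criterion, not the only reasonable one.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical toy data: two well-separated groups of 2-D points.
observations = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.8, 8.1],
])

# Build the linkage matrix. For n observations it has n - 1 rows
# (one per merge) and 4 columns.
distance_matrix = linkage(observations, method='ward', metric='euclidean')

# Cut the hierarchy into at most 2 flat clusters.
# With criterion='maxclust', the second argument is the maximum
# number of clusters to form.
cluster_labels = fcluster(distance_matrix, 2, criterion='maxclust')

print(distance_matrix.shape)
print(cluster_labels)
```

Each entry of cluster_labels is the cluster id of the observation at the same position. Swapping method='ward' for 'single', 'complete', or another option changes how cluster proximity is computed, and it can change the resulting labels on less cleanly separated data.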

