Cluster Analysis in Python
Shaumik Daityari
Business Analyst
scipy.cluster.hierarchy.linkage(observations,
method='single',
metric='euclidean',
optimal_ordering=False
)
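method: how to calculate the proximity of clusters
metric: distance metric
optimal_ordering: whether to reorder the data points

Values for method:
'single': based on the two closest objects
'complete': based on the two farthest objects
'average': based on the arithmetic mean of the distances between all pairs of objects
'centroid': based on the distance between the cluster centroids (the mean positions of their objects)
'median': like 'centroid', but the center of a merged cluster is the midpoint of the two centers it was formed from
'ward': based on the increase in the within-cluster sum of squares when two clusters merge

scipy.cluster.hierarchy.fcluster(distance_matrix,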
num_clusters,
criterion
)
distance_matrix: output of the linkage() method (SciPy names this argument Z)
num_clusters: number of clusters (SciPy names this argument t; it is read as a cluster count only when criterion='maxclust')
criterion: how to decide thresholds to form clusters
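
The two calls above can be chained as in the minimal sketch below. The toy data, the choice of method='ward', and the choice of criterion='maxclust' with 2 clusters are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical toy data: two well-separated groups of 2-D points.
observations = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.1],
])

# Build the linkage matrix; each of its n - 1 rows records one merge step.
distance_matrix = linkage(observations, method='ward', metric='euclidean')

# Cut the hierarchy into at most 2 flat clusters.
# With criterion='maxclust', the second argument is a cluster count.
labels = fcluster(distance_matrix, 2, criterion='maxclust')

print(labels)  # one integer label per observation, starting at 1
```

Swapping method='ward' for another value from the list above changes only how cluster proximity is measured; the fcluster() call stays the same.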