Comparing more than two observations

Cluster Analysis in R

Dmitriy Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

The closest observation to a pair

      1    2    3
2  11.7
3  16.8 18.0
4  10.0 20.6 15.8
  • Is 2 closest to group 1,4?
  • Is 3 closest to group 1,4?

Linkage criteria: complete

      1    2    3
2  11.7
3  16.8 18.0
4  10.0 20.6 15.8
  • Is 2 closest to group 1,4?
    • max(D(2,1), D(2,4)) = 20.6
  • Is 3 closest to group 1,4?
    • max(D(3,1), D(3,4)) = 16.8
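
The two comparisons above can be reproduced in a few lines of R. This is a minimal sketch: the matrix `d` and the variable names `complete_2` and `complete_3` are illustrative and do not come from the slides; only the six distance values do.

```r
# Distances from the slide, stored as a symmetric 4 x 4 matrix.
# lower.tri() fills column by column: D(2,1), D(3,1), D(4,1), D(3,2), D(4,2), D(4,3).
d <- matrix(0, nrow = 4, ncol = 4)
d[lower.tri(d)] <- c(11.7, 16.8, 10.0, 18.0, 20.6, 15.8)
d <- d + t(d)

# Complete linkage: the distance from an observation to the group {1, 4}
# is the largest of its distances to the group's members.
complete_2 <- max(d[2, c(1, 4)])  # max(D(2,1), D(2,4))
complete_3 <- max(d[3, c(1, 4)])  # max(D(3,1), D(3,4))

print(c(obs_2 = complete_2, obs_3 = complete_3))
```

Because 16.8 is smaller than 20.6, observation 3 is the one closer to the group under the complete criterion.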

Hierarchical clustering

Complete Linkage: maximum pairwise distance between the observations of two sets
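
As a hedged illustration, base R's `hclust()` can apply complete linkage to the four-observation distance matrix used earlier. The object names `d` and `hc` are assumptions made for this sketch, not names from the course.

```r
# Rebuild the example distances as a symmetric matrix (values from the slides).
d <- matrix(0, nrow = 4, ncol = 4)
d[lower.tri(d)] <- c(11.7, 16.8, 10.0, 18.0, 20.6, 15.8)
d <- d + t(d)

# hclust() expects a "dist" object; as.dist() converts the full matrix.
hc <- hclust(as.dist(d), method = "complete")

# Heights at which clusters are merged, in merge order.
print(hc$height)

# Cut the tree into two clusters and show each observation's assignment.
print(cutree(hc, k = 2))
```

Observations 1 and 4 merge first at a distance of 10.0, observation 3 joins them at 16.8, and observation 2 joins last at 20.6.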

Grouping with linkage & distance

Linkage criteria

Complete Linkage: maximum pairwise distance between the observations of two sets

Single Linkage: minimum pairwise distance between the observations of two sets

Average Linkage: average of all pairwise distances between the observations of two sets
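
To see how the three criteria can disagree, the sketch below applies each one to the earlier four-observation example. The helper `linkage_distance()` is hypothetical and written only for this illustration; it is not part of base R or of the course material.

```r
# Example distances from the slides as a symmetric matrix.
d <- matrix(0, nrow = 4, ncol = 4)
d[lower.tri(d)] <- c(11.7, 16.8, 10.0, 18.0, 20.6, 15.8)
d <- d + t(d)

# Hypothetical helper: distance between two sets of observations, a and b,
# under a chosen linkage criterion.
linkage_distance <- function(d, a, b,
                             criterion = c("complete", "single", "average")) {
  criterion <- match.arg(criterion)
  pairwise <- d[a, b, drop = FALSE]  # all distances between members of a and b
  switch(criterion,
         complete = max(pairwise),
         single   = min(pairwise),
         average  = mean(pairwise))
}

criteria <- c("complete", "single", "average")
obs_2 <- sapply(criteria, function(k) linkage_distance(d, 2, c(1, 4), k))
obs_3 <- sapply(criteria, function(k) linkage_distance(d, 3, c(1, 4), k))

print(rbind(obs_2, obs_3))
```

For this data, complete linkage places observation 3 closer to the group 1,4 (16.8 versus 20.6), while single linkage (11.7 versus 15.8) and average linkage (16.15 versus 16.3) both place observation 2 closer. The choice of criterion can therefore change which observations are grouped together.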

Let's practice!