Occupational wage data

Cluster Analysis in R

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

Occupational wage data

  • 22 Occupation Observations
  • 15 Measurements of Average Income from 2001 to 2016

Occupational wage data

print(oes)
                           2001  2002  2003  2004  2005 ...
Management                70800 78870 83400 87090 88450 ...
Business Operations       50580 53350 56000 57120 57930 ...
Computer Science          60350 61630 64150 66370 67100 ...
Architecture/Engineering  56330 58020 60390 63060 63910 ...
Life/Physical/Social Sci. 49710 52380 54930 57550 58030 ...
Community Services        34190 34630 35800 37050 37530 ...
...                       ...   ...   ...   ...   ...   ...

Next steps: hierarchical clustering

  • Evaluate whether pre-processing is necessary
  • Create a distance matrix
  • Build a dendrogram
  • Extract clusters from dendrogram
  • Explore resulting clusters
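
The steps above can be sketched in base R. This is only a minimal illustration, not the course's own solution: it rebuilds a small matrix from the six occupations and five years visible in the print(oes) output (the hypothetical name oes_subset stands in for the full oes data), and the Euclidean distance, average linkage, and k = 3 are assumptions chosen for demonstration.

```r
# Toy subset copied from the print(oes) output: 6 occupations x 5 years.
# `oes_subset` is a hypothetical name; the full `oes` matrix is not reproduced here.
oes_subset <- matrix(
  c(70800, 78870, 83400, 87090, 88450,
    50580, 53350, 56000, 57120, 57930,
    60350, 61630, 64150, 66370, 67100,
    56330, 58020, 60390, 63060, 63910,
    49710, 52380, 54930, 57550, 58030,
    34190, 34630, 35800, 37050, 37530),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("Management", "Business Operations", "Computer Science",
      "Architecture/Engineering", "Life/Physical/Social Sci.",
      "Community Services"),
    c("2001", "2002", "2003", "2004", "2005")
  )
)

# 1. Evaluate whether pre-processing is necessary:
#    every column is a wage in the same unit, so this sketch assumes
#    no scaling is needed. summary() shows the per-year ranges.
summary(oes_subset)

# 2. Create a distance matrix between occupations (rows).
#    Euclidean distance is an assumption made for this sketch.
dist_oes <- dist(oes_subset, method = "euclidean")

# 3. Build the hierarchical clustering tree.
#    Average linkage is an assumption made for this sketch.
hc_oes <- hclust(dist_oes, method = "average")

# Draw the dendrogram only when running interactively.
if (interactive()) plot(hc_oes, main = "Occupations (toy subset)")

# 4. Extract clusters from the dendrogram.
#    k = 3 is an illustrative choice, not a recommendation.
clusters <- cutree(hc_oes, k = 3)

# 5. Explore the resulting clusters: list the occupations in each one.
split(names(clusters), clusters)
```

On the full data, the same calls would take oes in place of oes_subset; the number of clusters would normally be chosen after inspecting the dendrogram rather than fixed in advance.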

Let's practice!