The importance of scale

Cluster Analysis in R

Dmitriy (Dima) Gorenshteyn

Lead Data Scientist, Memorial Sloan Kettering Cancer Center

Distance between individuals

Observation   Height (feet)   Weight (lbs)
1             6.0             200
2             6.0             202
3             8.0             200
...           ...             ...
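
To make the scale problem concrete, here is a minimal sketch that computes Euclidean distances on the raw values of the three rows shown in the table. The data frame name `height_weight` matches the one used later in these slides; the variable `d_raw` is introduced here only for illustration.

```r
# The three observations from the table (height in feet, weight in lbs).
height_weight <- data.frame(
  Height = c(6.0, 6.0, 8.0),
  Weight = c(200, 202, 200)
)

# Euclidean distance between every pair of rows, using the unscaled values.
d_raw <- dist(height_weight, method = "euclidean")
print(d_raw)
```

On the raw values, observation 1 is exactly as far from observation 2 (a 2 lb difference) as from observation 3 (a 2 ft difference): both distances equal 2. The feature with the larger numeric range dominates unless the features are put on a comparable scale.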

Scaling our features

$$\text{height}_{\text{scaled}} = \frac{\text{height} - \operatorname{mean}(\text{height})}{\operatorname{sd}(\text{height})}$$
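
The formula can be applied by hand in R, as in the sketch below. It uses only the three heights visible in the table, so the resulting numbers will not match values computed from the full (elided) dataset; the vector names `height` and `height_scaled` are chosen here for illustration.

```r
# Heights (feet) of the three observations shown in the table.
height <- c(6.0, 6.0, 8.0)

# Standardize: subtract the mean, then divide by the sample standard deviation.
height_scaled <- (height - mean(height)) / sd(height)
print(height_scaled)

# After standardization the values have mean 0 and standard deviation 1.
print(mean(height_scaled))
print(sd(height_scaled))
```

Applying the same transformation to every feature puts all of them in units of standard deviations, so no single feature dominates the distance because of its measurement unit.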

scale() function

print(height_weight)
  Height Weight
1      6    200
2      6    202
3      8    200
...   ...    ...

scale(height_weight)
   Height   Weight
1    0.60    0.67
2    0.60    0.73
3    11.3    0.67
...   ...    ...
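
A short sketch of how scaling changes the distances follows. Only three rows of the data are shown in these slides, and means and standard deviations estimated from three rows would not be representative, so the sketch passes explicit values to the `center` and `scale` arguments of `scale()`. The numbers in `assumed_center` and `assumed_sd` are hypothetical, chosen for illustration only, and are not taken from the course data.

```r
# The three observations shown above (height in feet, weight in lbs).
height_weight <- data.frame(
  Height = c(6.0, 6.0, 8.0),
  Weight = c(200, 202, 200)
)

# Hypothetical population mean and standard deviation for each column.
# These are illustrative assumptions, not values from the original dataset.
assumed_center <- c(Height = 5.9, Weight = 178)
assumed_sd     <- c(Height = 0.19, Weight = 33)

# scale() subtracts `center` from each column and divides by `scale`.
height_weight_scaled <- scale(height_weight,
                              center = assumed_center,
                              scale  = assumed_sd)

# Euclidean distances on the scaled features.
d_scaled <- dist(height_weight_scaled)
print(d_scaled)
```

Under these assumptions, observation 1 ends up far closer to observation 2 than to observation 3, which matches intuition: a 2 lb difference is minor, while a 2 ft difference in height is very large. Calling `scale(height_weight)` with no extra arguments instead uses each column's own mean and sample standard deviation.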

Let's practice!
