t-Distributed Stochastic Neighbor Embedding (t-SNE)

Dimensionality Reduction in R

Matt Pickard

Owner, Pickard Predictives, LLC

t-SNE vs PCA table

Plotting PCA and t-SNE

PCA

PCA plot

Preserves global structure

t-SNE

t-SNE plot

Preserves local structure (keeps neighbors next to each other)
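The contrast between the two plots can be reproduced with a minimal sketch. It assumes the built-in iris data as a stand-in (not the dataset used on these slides); the names iris_u, pca_xy, and tsne_xy are illustrative only.

```r
library(Rtsne)

# Stand-in data (an assumption): iris, with rows that duplicate the numeric
# columns removed, because Rtsne() stops on duplicates by default.
iris_u <- iris[!duplicated(iris[, 1:4]), ]
X <- as.matrix(iris_u[, 1:4])

# PCA: linear projection onto the two directions of greatest variance
pca <- prcomp(X, center = TRUE, scale. = TRUE)
pca_xy <- pca$x[, 1:2]

# t-SNE: nonlinear embedding that tries to keep neighbors together
set.seed(1234)
tsne_xy <- Rtsne(X)$Y

# Plot both embeddings side by side, colored by species
par(mfrow = c(1, 2))
plot(pca_xy, col = as.integer(iris_u$Species), pch = 19, main = "PCA")
plot(tsne_xy, col = as.integer(iris_u$Species), pch = 19,
     xlab = "tsne_x", ylab = "tsne_y", main = "t-SNE")
```

Distances between well-separated groups are easier to interpret in the PCA panel, while the t-SNE panel is mainly informative about which points are neighbors.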

t-SNE hyperparameters

  • Perplexity - roughly the effective number of nearest neighbors considered for each point
  • Learning rate - step size of the gradient descent updates to the embedded point positions
  • Iterations - number of gradient descent (optimization) iterations
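As a rough sketch, the three hyperparameters above map onto the perplexity, eta, and max_iter arguments of Rtsne(). The iris data and the specific values below are assumptions chosen for illustration, not recommendations from the slides.

```r
library(Rtsne)

# Illustrative data only: numeric iris columns with duplicate rows dropped,
# because Rtsne() stops on duplicates by default (check_duplicates = TRUE).
X <- as.matrix(unique(iris[, 1:4]))

set.seed(1234)
tsne_tuned <- Rtsne(
  X,
  dims = 2,
  perplexity = 20,  # Rtsne requires 3 * perplexity < nrow(X) - 1
  eta = 200,        # learning rate
  max_iter = 500    # number of optimization iterations
)

# One row per observation, two embedding columns
dim(tsne_tuned$Y)
```

Because the result depends on the random initialization, it is usually worth rerunning with a few perplexity values and comparing the plots before drawing conclusions.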

t-SNE

t-SNE plot

t-SNE in R

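library(Rtsne)
library(dplyr)    # select(), bind_cols(), %>%
library(ggplot2)  # ggplot(), geom_point()

set.seed(1234)  # t-SNE is stochastic; fix the seed for a reproducible embedding
tsne <- Rtsne(attrition_df %>% select(-Attrition))  # embed all columns except the label
tsne_df <- attrition_df %>% bind_cols(tsne_x = tsne$Y[,1], tsne_y = tsne$Y[,2])
tsne_df %>% ggplot(aes(x = tsne_x, y = tsne_y, color = Attrition)) +
  geom_point(alpha = 0.5)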

t-SNE plot

Let's practice!
