Dimensionality reduction: visualization techniques

Practicing Machine Learning Interview Questions in Python

Lisa Stuart

Data Scientist

Why dimensionality reduction?

  1. Speed up ML training
  2. Visualization
  3. Improve accuracy (can reduce noise and overfitting)

Visualization techniques

  • PCA
  • t-SNE

Visualizing with PCA

PCA plot

1 https://districtdatalabs.silvrback.com/principal-component-analysis-with-python
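A PCA plot of this kind can be sketched as follows. This is a minimal illustration, not the code behind the figure above: the iris dataset, the standardization step, and the variable names are all assumptions chosen for the example.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: iris stands in for whatever dataset you want to visualize
iris = load_iris()

# PCA is sensitive to feature scale, so standardize first
X_scaled = StandardScaler().fit_transform(iris.data)

# Project the 4 original features onto the first 2 principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Scatter plot of the projection, colored by class label
fig, ax = plt.subplots(figsize=(8, 6))
scatter = ax.scatter(X_pca[:, 0], X_pca[:, 1], c=iris.target,
                     cmap="viridis", alpha=0.7)
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.set_title("Iris data projected onto the first two principal components")
ax.legend(*scatter.legend_elements(), title="Class")
plt.show()
```

Each axis is a linear combination of the original features, so distances in the plot reflect the directions of greatest variance in the data.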

Scree plot


1 https://towardsdatascience.com/a-step-by-step-explanation-of-principal-component-analysis-b836fb9c97e2
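A scree plot shows how much variance each principal component explains. The sketch below is one way to draw it with scikit-learn; the wine dataset and all variable names are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the wine dataset (13 numeric features) is used as example data
X_scaled = StandardScaler().fit_transform(load_wine().data)

# Fit PCA with all components kept so every variance ratio is available
pca = PCA().fit(X_scaled)
ratios = pca.explained_variance_ratio_
cumulative = np.cumsum(ratios)
components = np.arange(1, len(ratios) + 1)

# Scree plot: per-component variance plus the running total
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(components, ratios, "o-", label="Explained variance ratio")
ax.step(components, cumulative, where="mid", label="Cumulative")
ax.set_xlabel("Principal component")
ax.set_ylabel("Proportion of variance explained")
ax.set_xticks(components)
ax.legend()
plt.show()
```

A common heuristic is to keep the components before the "elbow", where the curve flattens out.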

t-SNE

  • Probabilistic
  • Pairs of data points
  • Low-dimensional embedding
  • Plot embeddings
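The "probabilistic" and "pairs of data points" ideas above can be made concrete with a small NumPy sketch. It is a simplification, not the scikit-learn implementation: it uses one fixed Gaussian bandwidth `sigma` for every point (real t-SNE tunes a bandwidth per point from the perplexity), random toy data, and hypothetical function names.

```python
import numpy as np

def conditional_probabilities(X, sigma=1.0):
    """Gaussian similarities p(j|i) in the high-dimensional space.

    Simplification: a single fixed sigma instead of a per-point
    bandwidth found by matching the perplexity.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    affinities = np.exp(-sq_dists / (2 * sigma ** 2))
    np.fill_diagonal(affinities, 0.0)  # a point is not its own neighbor
    return affinities / affinities.sum(axis=1, keepdims=True)

def low_dim_similarities(Y):
    """Student-t (1 degree of freedom) similarities q_ij in the embedding."""
    sq_dists = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    kernel = 1.0 / (1.0 + sq_dists)
    np.fill_diagonal(kernel, 0.0)
    return kernel / kernel.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))  # toy data: 5 points in 10 dimensions
Y = rng.normal(size=(5, 2))   # a random (unoptimized) 2-D embedding

P_cond = conditional_probabilities(X)
P = (P_cond + P_cond.T) / (2 * len(X))  # symmetrized joint probabilities
Q = low_dim_similarities(Y)

# t-SNE moves the points in Y to minimize this KL divergence
mask = P > 0
kl = np.sum(P[mask] * np.log(P[mask] / Q[mask]))
print(f"KL(P || Q) for the random embedding: {kl:.4f}")
```

The optimizer then adjusts `Y` by gradient descent until `Q` resembles `P`, and the resulting 2-D coordinates are what gets plotted.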

Visualizing with t-SNE

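# t-SNE with loan data
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.manifold import TSNE

loans = pd.read_csv('loans_dataset.csv')

# Feature matrix: every column except the target
# (t-SNE needs numeric input, so the remaining columns must be numeric)
X = loans.drop('Loan Status', axis=1)

# Fit t-SNE and get the 2-D embedding
tsne = TSNE(n_components=2, verbose=1, perplexity=40)
tsne_results = tsne.fit_transform(X)

# Store the two embedding dimensions alongside the original data
loans['t-SNE-PC-one'] = tsne_results[:, 0]
loans['t-SNE-PC-two'] = tsne_results[:, 1]

# t-SNE viz: one point per loan, colored by loan status
# (the two-color palette assumes 'Loan Status' has two classes)
plt.figure(figsize=(16, 10))
sns.scatterplot(
    x="t-SNE-PC-one", y="t-SNE-PC-two",
    hue="Loan Status",
    palette=sns.color_palette(["grey", "blue"]),
    data=loans,
    legend="full",
    alpha=0.3
)
plt.show()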
1 https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

Visualizing with t-SNE

t-SNE plot


PCA vs. t-SNE on digits data

PCA and t-SNE on digits dataset

1 https://towardsdatascience.com/visualising-high-dimensional-datasets-using-pca-and-t-sne-in-python-8ef87e7915b
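A side-by-side comparison along these lines can be sketched with scikit-learn's built-in digits data. This is not the code behind the figure above: the 500-sample subset (to keep t-SNE fast), the perplexity of 30, and the fixed `random_state` are all assumptions for the example.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Assumption: the first 500 of scikit-learn's 8x8 digit images
digits = load_digits()
X, y = digits.data[:500], digits.target[:500]

# Linear projection onto the top two principal components
X_pca = PCA(n_components=2).fit_transform(X)

# Nonlinear 2-D embedding; random_state makes the layout reproducible
X_tsne = TSNE(n_components=2, perplexity=30,
              random_state=0).fit_transform(X)

# Plot both embeddings next to each other, colored by digit label
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
for ax, embedding, title in zip(axes, (X_pca, X_tsne), ("PCA", "t-SNE")):
    ax.scatter(embedding[:, 0], embedding[:, 1], c=y, cmap="tab10", s=10)
    ax.set_title(title)
    ax.set_xticks([])
    ax.set_yticks([])
plt.show()
```

On data like this, t-SNE typically separates the digit classes into tighter clusters than PCA, though its axes and inter-cluster distances are not directly interpretable.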

Let's practice!
