Dimensionality reduction: visualization techniques

Practicing Machine Learning Interview Questions in Python

Lisa Stuart

Data Scientist

Why dimensionality reduction?

  1. Faster ML training
  2. Visualization
  3. Better accuracy

Visualization techniques

  • PCA
  • t-SNE

Visualizing with PCA

PCA plot

1 https://districtdatalabs.silvrback.com/principal-component-analysis-with-python
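
A PCA plot like the one referenced above can be produced in a few lines. The following is a minimal sketch, not the code behind the original figure: it assumes scikit-learn's built-in digits dataset as stand-in data, standardizes the features first (a common but optional choice), and projects onto the first two principal components.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the built-in digits data (1797 samples, 64 pixel features)
digits = load_digits()
X, y = digits.data, digits.target

# Standardize so that no single feature dominates the variance
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Scatter plot of the projection, colored by class label
fig, ax = plt.subplots(figsize=(8, 6))
points = ax.scatter(X_pca[:, 0], X_pca[:, 1], c=y, cmap="tab10", s=10, alpha=0.7)
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
fig.colorbar(points, ax=ax, label="digit")
plt.show()
```

Because PCA is a linear projection, the axes are interpretable: each one is a weighted combination of the original features, available in `pca.components_`.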

Scree plot

Scree plot

1 https://towardsdatascience.com/a-step-by-step-explanation-of-principal-component-analysis-b836fb9c97e2
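
A scree plot shows how much variance each principal component explains, which helps decide how many components to keep. The sketch below assumes the scikit-learn digits dataset again and plots the explained variance ratio rather than raw eigenvalues (a common variant); the 95% threshold is an illustrative choice, not a rule from the course.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data

# Fit PCA with all components to obtain the full variance spectrum
pca = PCA().fit(X)
explained = pca.explained_variance_ratio_
cumulative = np.cumsum(explained)

# Smallest number of components retaining at least 95% of the variance
# (the 95% cutoff is an assumption made for illustration)
n_95 = int(np.argmax(cumulative >= 0.95)) + 1

components = np.arange(1, len(explained) + 1)
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(components, explained, marker="o", label="per component")
ax.plot(components, cumulative, marker=".", label="cumulative")
ax.axvline(n_95, linestyle="--", color="grey", label=f"{n_95} components reach 95%")
ax.set_xlabel("Principal component")
ax.set_ylabel("Explained variance ratio")
ax.legend()
plt.show()
```

The "elbow" where the per-component curve flattens is the usual visual cue; the cumulative curve gives a more explicit cutoff.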

t-SNE

  • Probabilistic
  • Pairs of data points
  • Low-dimensional embedding
  • Plotting embeddings
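
To make the "probabilistic" and "pairs of data points" bullets concrete, here is a simplified sketch of the first step of t-SNE: turning pairwise distances into neighbor probabilities. It is a teaching aid under stated assumptions, not scikit-learn's implementation. The helper `gaussian_affinities` is hypothetical and uses one fixed `sigma` for all points, whereas t-SNE tunes a separate bandwidth per point to match the chosen perplexity.

```python
import numpy as np


def gaussian_affinities(X, sigma=1.0):
    """Conditional probabilities p(j|i) from a Gaussian kernel.

    Simplification: a single fixed sigma for every point.
    """
    # Squared Euclidean distance between every pair of points
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative rounding errors

    logits = -sq_dists / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)  # a point is not its own neighbor
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)  # each row sums to 1


# Toy data (assumption): two well-separated clusters of 5 points in 10 dimensions
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(5, 10)),
    rng.normal(5.0, 0.5, size=(5, 10)),
])

P_cond = gaussian_affinities(X)
# Symmetrize into one joint distribution over pairs of points
P_joint = (P_cond + P_cond.T) / (2 * len(X))
```

t-SNE then places points in the low-dimensional space so that a similar distribution, built there with a heavy-tailed Student-t kernel, matches `P_joint` as closely as possible in terms of KL divergence. The resulting coordinates are the embedding that gets plotted.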

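Visualizing with t-SNE

# t-SNE with loan data
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.manifold import TSNE

loans = pd.read_csv('loans_dataset.csv')

# Feature matrix: every column except the target (t-SNE expects numeric features)
X = loans.drop('Loan Status', axis=1)

# Embed into 2 dimensions; perplexity roughly sets the effective neighborhood size
tsne = TSNE(n_components=2, verbose=1, perplexity=40)
tsne_results = tsne.fit_transform(X)

# Store the two embedding coordinates as columns for plotting
loans['t-SNE-PC-one'] = tsne_results[:,0]
loans['t-SNE-PC-two'] = tsne_results[:,1]

# t-SNE visualization, colored by loan status
plt.figure(figsize=(16,10))
sns.scatterplot(
    x="t-SNE-PC-one", y="t-SNE-PC-two",
    hue="Loan Status",
    palette=sns.color_palette(["grey","blue"]),
    data=loans,
    legend="full",
    alpha=0.3
)
plt.show()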
1 https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

Visualizing with t-SNE

t-SNE plot


PCA vs t-SNE on digits data

PCA and t-SNE on the digits dataset

1 https://towardsdatascience.com/visualising-high-dimensional-datasets-using-pca-and-t-sne-in-python-8ef87e7915b
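
The comparison can be reproduced approximately with the sketch below. It assumes scikit-learn's small built-in digits dataset (`load_digits`), which may differ from the data used for the original figure, and subsamples 500 points only to keep the t-SNE run short. The `perplexity` and `random_state` values are illustrative choices.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

digits = load_digits()
# Subsample to keep the t-SNE run short (assumption made for this sketch)
X, y = digits.data[:500], digits.target[:500]

# Linear projection vs. nonlinear neighbor-preserving embedding
X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Side-by-side scatter plots, colored by digit label
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
for ax, embedding, title in zip(axes, (X_pca, X_tsne), ("PCA", "t-SNE")):
    ax.scatter(embedding[:, 0], embedding[:, 1], c=y, cmap="tab10", s=10)
    ax.set_title(title)
plt.show()
```

On data like this, t-SNE typically separates the digit classes into tighter clusters than PCA, but its axes and inter-cluster distances carry no direct meaning, and results vary with perplexity and the random seed.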

Let's practice!
