t-SNE visualization of high-dimensional data

Dimensionality Reduction in Python

Jeroen Boeye

Head of Machine Learning, Faktion

t-SNE on the IRIS dataset

iris clusters

t-SNE on the IRIS dataset

iris clusters, 1 annotated

t-SNE on the IRIS dataset

iris clusters, 3 annotated

t-SNE on the female ANSUR dataset

iris clusters, 1 annotated


df.shape

(1986, 99)
non_numeric = ['BMI_class', 'Height_class', 'Gender', 'Component', 'Branch']
df_numeric = df.drop(non_numeric, axis=1)
df_numeric.shape
(1986, 94)
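
The columns dropped above are listed by hand. As a possible alternative, pandas can select columns by dtype. The sketch below uses a small made-up DataFrame (the name `toy` and its values are illustrative, not taken from ANSUR) and assumes the non-numeric columns are stored as strings.

```python
import pandas as pd

# Hypothetical miniature stand-in for the ANSUR DataFrame (values are made up)
toy = pd.DataFrame({
    "stature_m": [1.60, 1.72, 1.81],
    "weight_kg": [55.0, 68.5, 90.2],
    "BMI_class": ["Normal", "Normal", "Overweight"],
    "Gender": ["Female", "Female", "Female"],
})

# Keep only columns with a numeric dtype instead of naming the ones to drop
toy_numeric = toy.select_dtypes(include="number")

print(toy_numeric.columns.tolist())
print(toy_numeric.shape)
```

Listing the columns explicitly, as the slide does, is the safer choice when a category happens to be encoded as integers, because a dtype filter would keep such a column.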

Fitting t-SNE

from sklearn.manifold import TSNE

m = TSNE(learning_rate=50)
tsne_features = m.fit_transform(df_numeric)

tsne_features[1:4,:]
array([[-37.962185,  15.066088],
       [-21.873512,  26.334448],
       [ 13.97476 ,  22.590828]], dtype=float32)
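
The ANSUR DataFrame is not included with these slides, so here is a minimal runnable sketch of the same fitting step on the iris measurements that ship with scikit-learn. The `random_state` argument and the variable names `X` and `iris_tsne` are additions for illustration. t-SNE is stochastic, so coordinates differ between runs unless a seed is fixed.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

# Load the four numeric iris measurements (150 samples)
X = load_iris().data

# Same learning rate as on the slide; random_state is an assumption added
# here so that repeated runs start from the same seed
m = TSNE(learning_rate=50, random_state=0)
iris_tsne = m.fit_transform(X)

# One row per sample, two t-SNE coordinates per row
print(iris_tsne.shape)
```

By default `TSNE` reduces to two components, which is why the result has exactly two columns regardless of how many features go in.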

Assigning t-SNE features to our dataset

tsne_features[1:4,:]
array([[-37.962185,  15.066088],
       [-21.873512,  26.334448],
       [ 13.97476 ,  22.590828]], dtype=float32)
df['x'] = tsne_features[:,0]

df['y'] = tsne_features[:,1]
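
The assignment above relies on the embedding having one row per DataFrame row, in the same order. A small sketch of that check, using a hypothetical three-row DataFrame `toy` and a hand-written array `toy_tsne` as stand-ins for `df` and `tsne_features`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: three rows of data and a matching 2-D embedding
toy = pd.DataFrame({"BMI_class": ["Normal", "Overweight", "Normal"]})
toy_tsne = np.array([[-37.96, 15.07],
                     [-21.87, 26.33],
                     [13.97, 22.59]], dtype=np.float32)

# The embedding must have one row per DataFrame row
assert toy_tsne.shape == (len(toy), 2)

# Add both coordinates at once; assign returns a new DataFrame
toy = toy.assign(x=toy_tsne[:, 0], y=toy_tsne[:, 1])

print(toy.columns.tolist())
```

Because the categorical columns were only dropped from `df_numeric` and not from `df`, they remain available next to `x` and `y` for coloring the plot.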

Plotting t-SNE

import seaborn as sns
import matplotlib.pyplot as plt

sns.scatterplot(x="x", y="y", data=df)

plt.show()

Plotting t-SNE

ANSUR point cloud

Coloring points according to BMI category

import seaborn as sns
import matplotlib.pyplot as plt

sns.scatterplot(x="x", y="y", hue='BMI_class', data=df)

plt.show()

Coloring points according to BMI category

ANSUR point cloud by BMI

Coloring points according to BMI category

ANSUR point cloud by BMI, annotated

Coloring points according to height category

import seaborn as sns

import matplotlib.pyplot as plt

sns.scatterplot(x="x", y="y", hue='Height_class', data=df)

plt.show()
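
To make explicit what coloring by category amounts to, here is a rough matplotlib-only sketch that draws one scatter layer per category. It is an illustration under assumptions, not how seaborn implements `hue`; the DataFrame `toy`, its values, and the output file name are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical coordinates and categories (made up for illustration)
toy = pd.DataFrame({
    "x": [-38.0, -21.9, 14.0, 20.5],
    "y": [15.1, 26.3, 22.6, -3.2],
    "Height_class": ["Short", "Normal", "Tall", "Tall"],
})

fig, ax = plt.subplots()
# One scatter call per category gives each group its own color and legend entry
for label, group in toy.groupby("Height_class"):
    ax.scatter(group["x"], group["y"], label=label)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(title="Height_class")

# Save to a file instead of calling plt.show()
fig.savefig("tsne_by_height.png")
```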

Coloring points according to height category

ANSUR point cloud by height

Coloring points according to height category

ANSUR point cloud, doubly annotated

Let's practice!
