Feature extraction

Dimensionality Reduction in Python

Jeroen Boeye

Head of Machine Learning, Faktion

Feature selection

Feature selection schema

Feature extraction

Feature extraction schema

Feature generation - BMI

# BMI = weight (kg) divided by squared height (m)
df_body['BMI'] = df_body['Weight kg'] / df_body['Height m'] ** 2
Weight kg  Height m  BMI
81.5       1.776     25.84
72.6       1.702     25.06
92.9       1.735     30.86

df_body.drop(['Weight kg', 'Height m'], axis=1)
BMI
25.84
25.06
30.86
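
The BMI steps above can be put together as a minimal sketch. It rebuilds `df_body` from the three rows shown on the slide (the real dataset is larger and not reproduced here) and assigns the result of `drop` to a new name, `df_reduced`, which is a hypothetical variable introduced only for this illustration.

```python
import pandas as pd

# Three sample rows copied from the slide; the full dataset is not shown here
df_body = pd.DataFrame({
    'Weight kg': [81.5, 72.6, 92.9],
    'Height m': [1.776, 1.702, 1.735],
})

# Generate the new feature from the two original ones
df_body['BMI'] = df_body['Weight kg'] / df_body['Height m'] ** 2

# drop() returns a new DataFrame, so keep the result to make the removal stick
df_reduced = df_body.drop(['Weight kg', 'Height m'], axis=1)

bmi_rounded = df_reduced['BMI'].round(2).tolist()
print(bmi_rounded)
```

Two features have been replaced by one, so the dimensionality drops from 2 to 1 while the combined signal is kept.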

Feature generation - averages

left leg mm  right leg mm
882          885
870          869
901          900

# Row-wise mean of the two leg measurements
leg_df['leg mm'] = leg_df[['right leg mm', 'left leg mm']].mean(axis=1)

leg_df.drop(['right leg mm', 'left leg mm'], axis=1)
leg mm
883.5
869.5
900.5
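
As a runnable sketch of the averaging step, the snippet below recreates `leg_df` from the three rows on the slide and replaces both leg columns with their mean. The name `leg_reduced` is hypothetical and only used here to hold the result of `drop`.

```python
import pandas as pd

# Three sample rows copied from the slide
leg_df = pd.DataFrame({
    'left leg mm': [882, 870, 901],
    'right leg mm': [885, 869, 900],
})

# axis=1 averages across the columns of each row
leg_df['leg mm'] = leg_df[['right leg mm', 'left leg mm']].mean(axis=1)

# Keep only the combined feature
leg_reduced = leg_df.drop(['right leg mm', 'left leg mm'], axis=1)

leg_values = leg_reduced['leg mm'].tolist()
print(leg_values)
```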

Cost of taking the average

Right vs. left leg

Right vs. left leg zoomed

Right vs. left leg zoomed line

Right vs. left leg zoomed line annotated
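
The figures themselves are not recoverable here, but one way to make the cost concrete is to look at what the average throws away: the difference between the two legs. The sketch below uses the three rows from the earlier slide and is an illustration under that assumption, not the course's own analysis. The names `leg_diff`, `right_rebuilt` and `left_rebuilt` are hypothetical.

```python
import pandas as pd

# Three sample rows copied from the earlier slide
leg_df = pd.DataFrame({
    'left leg mm': [882, 870, 901],
    'right leg mm': [885, 869, 900],
})

# What is kept: the row-wise average of both legs
leg_mean = leg_df[['right leg mm', 'left leg mm']].mean(axis=1)

# What is lost: the signed right-minus-left difference
leg_diff = leg_df['right leg mm'] - leg_df['left leg mm']

# Only with both the mean and the difference can the original values be rebuilt
right_rebuilt = leg_mean + leg_diff / 2
left_rebuilt = leg_mean - leg_diff / 2

print(leg_diff.tolist())
```

If the difference between legs is small compared with the spread of leg lengths across people, dropping it costs little. If asymmetry matters for the task, averaging removes exactly that information.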

Intro to PCA

import seaborn as sns

sns.scatterplot(data=df, x='handlength', y='footlength')

hand vs. foot length

import pandas as pd
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
df_std = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

hand vs. foot length
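
To see what the scaling step does, the sketch below runs it on synthetic data. The hand and foot lengths are randomly generated stand-ins, not the course dataset, and the numbers chosen for the means and spreads are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the course dataset): two correlated lengths
rng = np.random.default_rng(42)
handlength = rng.normal(190, 10, size=100)
footlength = 1.3 * handlength + rng.normal(0, 5, size=100)
df = pd.DataFrame({'handlength': handlength, 'footlength': footlength})

scaler = StandardScaler()
df_std = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

# StandardScaler uses the population standard deviation, hence ddof=0
col_means = df_std.mean()
col_stds = df_std.std(ddof=0)
print(col_means.round(6).tolist(), col_stds.round(6).tolist())
```

After scaling, both features are centered on zero and have unit spread, so neither dominates simply because of its units.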

hand vs. foot length with point

hand vs. foot length with first vector

hand vs. foot length with both vectors
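
The two vectors drawn on the standardized scatter plot can be computed with scikit-learn's `PCA`. The sketch below is a minimal illustration on synthetic data: the hand and foot lengths are randomly generated stand-ins, not the course dataset, and the slide itself does not show this call.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the course dataset): two correlated lengths
rng = np.random.default_rng(0)
handlength = rng.normal(190, 10, size=200)
footlength = 1.3 * handlength + rng.normal(0, 5, size=200)
df = pd.DataFrame({'handlength': handlength, 'footlength': footlength})

# Standardize first so both features contribute on the same scale
scaler = StandardScaler()
df_std = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

# Fit PCA and project every point onto the principal components
pca = PCA()
pcs = pca.fit_transform(df_std)

# Each row of components_ is one unit-length direction vector
components = pca.components_
# Share of the total variance captured along each direction
explained = pca.explained_variance_ratio_
print(components)
print(explained)
```

The first component points along the direction in which the points vary most, and the second is perpendicular to it. When two features are strongly correlated, most of the variance falls on the first component, which is what makes dropping the second one a reasonable reduction.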

Let's practice!
