Removing highly correlated features

Dimensionality Reduction in Python

Jeroen Boeye

Head of Machine Learning, Faktion

Highly correlated data

Pairplot of highly correlated features

Highly correlated features

Correlation matrix of highly correlated features

Removing highly correlated features

import numpy as np

# Create correlation matrix of absolute values
corr_df = chest_df.corr().abs()

# Create and apply mask
mask = np.triu(np.ones_like(corr_df, dtype=bool))
tri_df = corr_df.mask(mask)
tri_df
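To see what the mask does, here is a minimal sketch on a small made-up DataFrame. The name `toy_df` and its values are assumptions for illustration and are not part of the course data. The upper triangle and the diagonal become NaN, so every feature pair shows up exactly once in the result.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for chest_df (not from the course)
toy_df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.5],  # almost an exact multiple of "a"
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to "a" and "b"
})

# Absolute correlations: the sign does not matter for redundancy
corr_df = toy_df.corr().abs()

# True on and above the diagonal; those cells are replaced by NaN
mask = np.triu(np.ones_like(corr_df, dtype=bool))
tri_df = corr_df.mask(mask)

# Only the strictly lower triangle keeps its values
n_remaining = int(tri_df.notna().sum().sum())
print(tri_df)
```

Without the mask, each pair would be counted twice and every feature would match itself with a correlation of 1.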

Removing highly correlated features

# Find columns that meet threshold
to_drop = [c for c in tri_df.columns if any(tri_df[c] > 0.95)]

print(to_drop)

['Suprasternale height', 'Cervicale height']

# Drop those columns
reduced_df = chest_df.drop(to_drop, axis=1)
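The steps above can be bundled into one helper. This is a sketch under assumptions: the function name `drop_highly_correlated`, the synthetic `demo_df`, and its column names are hypothetical and do not come from the course.

```python
import numpy as np
import pandas as pd


def drop_highly_correlated(df, threshold=0.95):
    """Return (reduced_df, dropped_columns) for a numeric DataFrame."""
    corr_df = df.corr().abs()
    # Hide the diagonal and the upper triangle so each pair is checked once
    mask = np.triu(np.ones_like(corr_df, dtype=bool))
    tri_df = corr_df.mask(mask)
    # A column is dropped if any remaining correlation exceeds the threshold
    to_drop = [c for c in tri_df.columns if (tri_df[c] > threshold).any()]
    return df.drop(columns=to_drop), to_drop


# Synthetic data: two columns carry the same information in different units
rng = np.random.default_rng(0)
height = rng.normal(170, 10, size=200)
demo_df = pd.DataFrame({
    "height_cm": height,
    "height_in": height / 2.54,                 # redundant with height_cm
    "weight_kg": rng.normal(70, 12, size=200),  # generated independently
})

reduced_df, dropped = drop_highly_correlated(demo_df)
print(dropped)
print(list(reduced_df.columns))
```

Because only the lower triangle is inspected, the column that comes first in each correlated pair is the one removed. Which member of a pair to keep is a design choice that this sketch does not try to optimize.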

Feature selection

Feature selection diagram

Feature extraction

Feature extraction diagram

Correlation caveats: Anscombe's quartet

Anscombe's quartet
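As a quick check of this caveat, the sketch below computes the Pearson correlation for the four Anscombe datasets. The values are the published quartet. The dictionary layout and variable names are choices made for this example. All four coefficients come out close to 0.82, although the scatter plots look very different.

```python
import numpy as np

# Anscombe's quartet: datasets I to III share the same x values
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Pearson correlation coefficient per dataset
correlations = {
    name: float(np.corrcoef(x, y)[0, 1]) for name, (x, y) in quartet.items()
}

for name, r in correlations.items():
    print(f"Dataset {name}: r = {r:.3f}")
```

A single coefficient cannot distinguish a linear trend from a curve or from one influential outlier, so it helps to plot the data before dropping features based on correlation alone.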

Correlation caveats: causation

import seaborn as sns

sns.scatterplot(x="N firetrucks sent to fire",
                y="N wounded by fire", data=fire_df)

Firetrucks sent vs. people wounded
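One way to illustrate why this correlation is not causal is to simulate a hidden common cause. The sketch below is entirely synthetic: the `fire_size` variable, the coefficients, and the noise levels are assumptions made for this example, and the course's `fire_df` is not reproduced here. Both columns depend on fire size, so they correlate strongly, but the relationship largely disappears once fire size is accounted for.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Assumed hidden driver: bigger fires get more trucks and cause more injuries
fire_size = rng.uniform(1, 10, size=500)

# Continuous values are used instead of integer counts to keep the sketch short
fire_df = pd.DataFrame({
    "N firetrucks sent to fire": 2 * fire_size + rng.normal(0, 1.0, size=500),
    "N wounded by fire": 3 * fire_size + rng.normal(0, 1.5, size=500),
})

# Raw correlation between the two observed columns
raw_corr = float(fire_df.corr().iloc[0, 1])


def residuals(y, x):
    """Part of y that a straight-line fit on x does not explain."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


trucks = fire_df["N firetrucks sent to fire"].to_numpy()
wounded = fire_df["N wounded by fire"].to_numpy()

# Partial correlation: correlate what is left after removing fire size
partial_corr = float(
    np.corrcoef(residuals(trucks, fire_size),
                residuals(wounded, fire_size))[0, 1]
)

print(f"raw correlation:     {raw_corr:.2f}")
print(f"partial correlation: {partial_corr:.2f}")
```

In this simulation, sending fewer trucks would not reduce the number of wounded, because the trucks never influence the injuries in the data-generating code.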

Let's practice!
