Factor relationships and distributions

Exploratory Data Analysis in Python

Izzy Weber

Curriculum Manager, DataCamp

Level of education: male partner

divorce["education_man"].value_counts()
Professional    1313
Preparatory      501
Secondary        288
Primary          100
None               4
Other              3
Name: education_man, dtype: int64
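
The counts above are easier to compare as shares of the total. The sketch below recomputes them in plain Python from the numbers printed on the slide; the variable names `counts`, `total`, and `shares` are illustrative and not part of the course code. In pandas, `value_counts(normalize=True)` returns the same proportions directly.

```python
# Counts copied from the value_counts() output shown above.
counts = {
    "Professional": 1313,
    "Preparatory": 501,
    "Secondary": 288,
    "Primary": 100,
    "None": 4,
    "Other": 3,
}

total = sum(counts.values())

# Share of each education level among all rows.
shares = {level: n / total for level, n in counts.items()}

for level, share in shares.items():
    print(f"{level:<13}{share:7.2%}")
```

The "None" and "Other" groups hold only 7 rows between them, so any per-group curve drawn for those two levels rests on very little data.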

Exploring categorical relationships

import matplotlib.pyplot as plt
import seaborn as sns
sns.histplot(data=divorce, x="marriage_duration", binwidth=1)
plt.show()

Histogram of marriage duration

Exploring categorical relationships

sns.histplot(data=divorce, x="marriage_duration", hue="education_man", binwidth=1)
plt.show()

Histogram of marriage duration, color-coded by education_man

Kernel Density Estimate (KDE) plots

sns.kdeplot(data=divorce, x="marriage_duration", hue="education_man")
plt.show()

KDE plot of marriage duration with hue set to education_man

Kernel Density Estimate (KDE) plots

KDE plot of marriage duration with hue set to education_man, zoomed in around a marriage_duration of zero

Kernel Density Estimate (KDE) plots

sns.kdeplot(data=divorce, x="marriage_duration", hue="education_man", cut=0)
plt.show()

KDE plot of marriage duration with hue set to education_man and cut equal to zero
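
Why the zoomed-in curve shows density below zero, and what `cut=0` changes, can be sketched without seaborn. A Gaussian KDE averages one bell-shaped bump per observation, so it stays positive slightly beyond the smallest and largest values. As far as the seaborn documentation describes it, `cut` is multiplied by the bandwidth to decide how far the evaluation grid extends past the extreme data points, with a default of 3. The sample durations, the bandwidth of 1.0, and the helper names `gaussian_kde` and `evaluation_grid` below are hypothetical, chosen only for illustration.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a density function built from one Gaussian bump per observation."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        ) / norm

    return density


def evaluation_grid(sample, bandwidth, cut, points=5):
    """Grid from min - cut * bandwidth to max + cut * bandwidth."""
    lo = min(sample) - cut * bandwidth
    hi = max(sample) + cut * bandwidth
    step = (hi - lo) / (points - 1)
    return [lo + i * step for i in range(points)]


durations = [1, 2, 2, 3, 5, 8, 13]  # hypothetical marriage durations in years
bandwidth = 1.0                     # hypothetical smoothing bandwidth

density = gaussian_kde(durations, bandwidth)

# With cut=3 the grid starts below zero, where the estimate is still positive.
default_grid = evaluation_grid(durations, bandwidth, cut=3)

# With cut=0 the grid stops at the observed minimum and maximum.
trimmed_grid = evaluation_grid(durations, bandwidth, cut=0)

print(default_grid[0], density(default_grid[0]))
print(trimmed_grid[0], trimmed_grid[-1])
```

Setting `cut=0` does not change the estimate itself; it only limits the range over which the curve is drawn to the observed values.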

Cumulative KDE plots

sns.kdeplot(data=divorce, x="marriage_duration", hue="education_man", cut=0, cumulative=True)
plt.show()

Cumulative KDE plot of marriage duration with hue set to education_man and cut equal to zero
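
A cumulative curve answers the question "what proportion of marriages lasted at most x years?". The sketch below uses the empirical cumulative distribution function, the unsmoothed counterpart of a cumulative KDE, rather than seaborn's own computation. The sample durations and the helper name `ecdf` are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function giving the proportion of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def proportion_at_or_below(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return proportion_at_or_below


durations = [1, 2, 2, 3, 5, 8, 13, 21]  # hypothetical marriage durations in years
cdf = ecdf(durations)

for years in (0, 2, 5, 21):
    print(years, cdf(years))
```

Read the cumulative plot the same way: pick a duration on the x-axis, and the height of each education level's curve is the estimated share of marriages in that group that ended within that duration.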

Relationship between marriage age and education

  • Is there a relationship between age at marriage and education level?
divorce["man_age_marriage"] = divorce["marriage_year"] - divorce["dob_man"].dt.year
divorce["woman_age_marriage"] = divorce["marriage_year"] - divorce["dob_woman"].dt.year
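
Subtracting the birth year from the marriage year gives an approximate age: it can overstate the age by one year when the wedding falls before that year's birthday. The plain-Python sketch below compares the two calculations; the dates and the helper names `age_by_year_difference` and `exact_age` are hypothetical, and an exact wedding date is assumed to be known here even though the code above uses only `marriage_year`.

```python
from datetime import date


def age_by_year_difference(marriage_year, dob):
    """Approximate age, mirroring marriage_year - dob.year."""
    return marriage_year - dob.year


def exact_age(on, dob):
    """Age in completed years on the given date."""
    had_birthday = (on.month, on.day) >= (dob.month, dob.day)
    return on.year - dob.year - (0 if had_birthday else 1)


dob = date(1990, 12, 15)     # hypothetical date of birth
wedding = date(2015, 3, 1)   # hypothetical wedding date

approx = age_by_year_difference(wedding.year, dob)
exact = exact_age(wedding, dob)

print(approx, exact)
```

For exploratory plots, a discrepancy of at most one year is usually small relative to the spread of ages, which is presumably why the simpler subtraction is used.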

Scatter plot with categorical variables

sns.scatterplot(data=divorce, x="woman_age_marriage", y="man_age_marriage")
plt.show()

A scatterplot of woman_age_marriage and man_age_marriage

Scatter plot with categorical variables

sns.scatterplot(data=divorce, 
                x="woman_age_marriage",
                y="man_age_marriage", 
                hue="education_man")
plt.show()

A scatterplot of woman_age_marriage and man_age_marriage with hue set to education_man

Let's practice!
