Comparing groups

Improving Your Data Visualizations in Python

Nick Strayer

Instructor

What does this mean?

  • Values generally higher?

  • Distribution of values wider? Narrower?

  • Crucial for representing your data
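The questions above can be given rough numeric answers before any plotting. The following is a minimal sketch using made-up ozone readings and a hypothetical helper named `summarize`; neither comes from the course's pollution data.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (center, spread) for one group of observations."""
    return mean(values), stdev(values)


# Made-up ozone readings, for illustration only
denver = [0.030, 0.034, 0.041, 0.045, 0.050]
other = [0.020, 0.022, 0.025, 0.027, 0.031]

denver_center, denver_spread = summarize(denver)
other_center, other_spread = summarize(other)

# A higher center suggests values are generally higher;
# a larger spread suggests a wider distribution.
print(f"Denver: mean={denver_center:.3f}, sd={denver_spread:.4f}")
print(f"Other:  mean={other_center:.3f}, sd={other_spread:.4f}")
```

Two numbers per group hide the shape of the distribution, which is what the plots in the rest of this section are meant to show.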

clusters of people standing together

comparing classes with kernel density plots

kernels stacking up to make KDE
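The idea of kernels stacking up can be sketched by hand: place one small bell curve on each observation and average them. This is an illustrative sketch only, with made-up data, an assumed bandwidth of 0.5, and hypothetical function names; `sns.kdeplot` chooses its own bandwidth and is not implemented this way line for line.

```python
import math


def gaussian_kernel(x, center, bandwidth):
    """Density at x of a normal curve centered on one observation."""
    z = (x - center) / bandwidth
    return math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2 * math.pi))


def kde(x, observations, bandwidth):
    """Average the kernels of all observations, evaluated at x."""
    total = sum(gaussian_kernel(x, obs, bandwidth) for obs in observations)
    return total / len(observations)


observations = [1.0, 1.5, 4.0]  # made-up data
bandwidth = 0.5                 # assumed smoothing width

# Evaluate the estimate on a grid from -3.00 to 8.00 in steps of 0.01
step = 0.01
grid = [i * step for i in range(-300, 801)]
density = [kde(x, observations, bandwidth) for x in grid]

# The curve is a density, so the area under it should be close to 1
area = sum(density) * step
print(f"area under the curve: {area:.3f}")
```

Where observations sit close together (1.0 and 1.5 here), their kernels overlap and the curve rises higher than it does around the isolated point at 4.0.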

import seaborn as sns

pollution_nov = pollution[pollution.month == 11]  # November

sns.kdeplot(pollution_nov[pollution_nov.city == 'Denver'].O3, color='red')
sns.kdeplot(pollution_nov[pollution_nov.city != 'Denver'].O3)

Two overlapping KDEs

# KDE for Denver
sns.kdeplot(pollution_nov[pollution_nov.city == 'Denver'].O3, color='red')
# Rug plot: one tick on the axis per Denver observation
sns.rugplot(pollution_nov[pollution_nov.city == 'Denver'].O3, color='red')
# KDE for all other cities
sns.kdeplot(pollution_nov[pollution_nov.city != 'Denver'].O3)

Two overlapping KDEs with a rug plot

A lot of overlapping KDEs

abstract basic beeswarm plots around multiple axes

import matplotlib.pyplot as plt

pollution_nov = pollution[pollution.month == 11]  # November

sns.swarmplot(y="city", x="O3", data=pollution_nov, size=4)
plt.xlabel("Ozone (O3)")

Beeswarm plot of O3 pollution

Let's compare!
