Exploring the patterns

Improving Your Data Visualizations in Python

Nick Strayer

Instructor

Digging in deeper

  • Investigating correlations
  • Are correlations driven by confounding?
  • Anything surprising?
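
One way to probe the confounding question above is to compare the correlation across all the data with the correlation inside each group of a suspected confounder. The sketch below uses only the standard library. The `season` grouping, the `pearson` helper, and every number in it are made up for illustration and do not come from the course's pollution data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up (season, NO2, CO) readings, for illustration only
readings = [
    ("summer", 1, 2), ("summer", 2, 1), ("summer", 3, 2),
    ("winter", 7, 8), ("winter", 8, 7), ("winter", 9, 8),
]

# Correlation across all readings, ignoring season
overall = pearson([no2 for _, no2, _ in readings],
                  [co for _, _, co in readings])

# Correlation within each season separately
within = {}
for season in ("summer", "winter"):
    subset = [(no2, co) for s, no2, co in readings if s == season]
    within[season] = pearson([no2 for no2, _ in subset],
                             [co for _, co in subset])

print(f"overall: {overall:.2f}")
for season, r in within.items():
    print(f"{season}: {r:.2f}")
```

In this toy data the overall correlation is strong while the within-season correlations vanish, which is the signature of a relationship driven by the grouping variable rather than by the two pollutants themselves.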

A shovel digging into the ground


Target audiences

  • Shared with peers
  • Be smart about design decisions
  • Remember they aren't as familiar with data

A cartoon figure giving a generic presentation to peers

sns.regplot(x='NO2', y='CO', ci=None, data=pollution,
            # Lower opacity of points
            scatter_kws={'alpha':0.2, 'color':'grey'})

Scatter plot of NO2 and CO pollution values with a basic line of best fit drawn over


Profiling patterns

  • Found interesting pattern in data
  • How to quickly explore and explain the pattern?
  • Use text!

Two distinct clusters of points in an unlabeled scatterplot


Using text scatters to id outliers

Unlabeled scatter plot of Denver's average SO2 and CO values with clear outlier in upper right

Labeled scatter plot of Denver's average SO2 and CO values with clear outlier in upper right seen as January

g = sns.scatterplot(x="SO2", y="CO", data=long_beach_avgs)

# Iterate over the rows of our data
for _, row in long_beach_avgs.iterrows():
    # Unpack columns from row
    month, SO2, CO = row
    # Draw annotation in correct place
    g.annotate(month, (SO2, CO))

plt.title('Long Beach avg SO2 by CO')
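
The loop above unpacks each row by position, so it relies on the DataFrame having exactly three columns in the order month, SO2, CO. A possible variant, sketched here under stated assumptions, looks the columns up by name instead. It uses plain matplotlib (`ax.scatter`) in place of `sns.scatterplot` so the block is self-contained, and `toy_avgs` is a made-up stand-in for `long_beach_avgs`, not the course's data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up monthly averages, for illustration only
toy_avgs = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "SO2": [0.90, 0.40, 0.35, 0.30],
    "CO": [0.80, 0.45, 0.40, 0.30],
})

fig, ax = plt.subplots()
ax.scatter(toy_avgs["SO2"], toy_avgs["CO"], color="grey")

# Access columns by name so their order in the DataFrame does not matter
for row in toy_avgs.itertuples(index=False):
    # Nudge each label 4 points up and right so it does not sit on its marker
    ax.annotate(row.month, (row.SO2, row.CO),
                xytext=(4, 4), textcoords="offset points")

ax.set(xlabel="SO2", ylabel="CO", title="Toy avg SO2 by CO")
```

The small `xytext` offset is a design choice rather than a requirement. Without it each label is drawn starting exactly at its point, which can make dense regions harder to read.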

Labeled scatter plot of Long Beach's average SO2 and CO for months of the year


Let's dig in
