Automating figures from data

Introduction to Data Visualization with Matplotlib

Ariel Rokem

Data Scientist

Why automate?

  • Ease and speed
  • Flexibility
  • Robustness
  • Reproducibility

How many different kinds of data?

summer_2016_medals["Sport"]
ID
62            Rowing
65         Taekwondo
73          Handball
             ...
134759      Handball
135132    Volleyball
135205        Boxing
Name: Sport, Length: 976, dtype: object
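Rather than scanning the printed column by eye, the number of distinct kinds can be counted directly. A minimal sketch, assuming a small stand-in DataFrame named `medals` (its rows are illustrative, not the real `summer_2016_medals` data):

```python
import pandas as pd

# Hypothetical stand-in for summer_2016_medals; the values are illustrative only.
medals = pd.DataFrame(
    {"Sport": ["Rowing", "Taekwondo", "Handball", "Handball", "Boxing"]},
    index=pd.Index([62, 65, 73, 134759, 135205], name="ID"),
)

n_sports = medals["Sport"].nunique()         # number of distinct sports
per_sport = medals["Sport"].value_counts()   # number of rows for each sport

print(n_sports)
print(per_sport)
```

`nunique()` answers "how many kinds?", while `value_counts()` also shows how many rows each kind contributes.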

Getting unique values of a column

sports = summer_2016_medals["Sport"].unique()

print(sports)
['Rowing' 'Taekwondo' 'Handball' 'Wrestling' 
'Gymnastics' 'Swimming' 'Basketball' 'Boxing' 
'Volleyball' 'Athletics']
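Note that `unique()` keeps values in order of first appearance and does not sort them. A short sketch of that behavior, using an illustrative Series (`sport_column` is a hypothetical name, not part of the original data):

```python
import pandas as pd

# Illustrative column with a repeated value; not the real medals data.
sport_column = pd.Series(["Rowing", "Taekwondo", "Handball", "Rowing", "Boxing"])

sports = sport_column.unique()    # distinct values, in order of first appearance
sorted_sports = sorted(sports)    # alphabetical order, if a stable ordering is wanted

print(list(sports))
print(sorted_sports)
```

Sorting is optional; it only matters if the bars should appear in the same order regardless of how the rows happen to be arranged.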

Bar chart of heights for all sports

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Draw one bar per sport: mean height, with the standard deviation as the error bar
for sport in sports:
    sport_df = summer_2016_medals[summer_2016_medals["Sport"] == sport]
    ax.bar(sport, sport_df["Height"].mean(), yerr=sport_df["Height"].std())

ax.set_ylabel("Height (cm)")
ax.set_xticklabels(sports, rotation=90)
plt.show()
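The same loop can be wrapped in a function so that it works for any grouping column. The sketch below is one possible way to do that, not the course's own code: `plot_group_means` and the stand-in `medals` DataFrame are hypothetical, and the heights are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data; the real summer_2016_medals table is not reproduced here.
medals = pd.DataFrame({
    "Sport": ["Rowing", "Rowing", "Boxing", "Boxing", "Handball", "Handball"],
    "Height": [190.0, 186.0, 175.0, 181.0, 192.0, 188.0],
})


def plot_group_means(ax, df, group_col, value_col):
    """Draw one bar per unique value of group_col.

    Bar height is the mean of value_col; the error bar is its standard deviation.
    """
    groups = df[group_col].unique()
    for group in groups:
        group_df = df[df[group_col] == group]
        ax.bar(group, group_df[value_col].mean(), yerr=group_df[value_col].std())
    # Rotate the category labels so long names do not overlap
    ax.tick_params(axis="x", labelrotation=90)
    return groups


fig, ax = plt.subplots()
groups = plot_group_means(ax, medals, "Sport", "Height")
ax.set_ylabel("Height (cm)")
# Call plt.show() or fig.savefig(...) here to view the result.
```

Because the bars are generated from whatever values the column contains, adding or removing a sport in the data changes the figure without any change to the plotting code.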

Figure derived automatically from the data

Practice automating visualizations!
