Plotting a histogram

Statistical Thinking in Python (Part 1)

Justin Bois

Teaching Professor at the California Institute of Technology

2008 US swing state election results

Data retrieved from Data.gov (https://www.data.gov/)

Generating a histogram

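import matplotlib.pyplot as plt

# df_swing is assumed to be a pandas DataFrame loaded earlier in the course,
# with a 'dem_share' column holding each county's vote share in percent
_ = plt.hist(df_swing['dem_share'])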
_ = plt.xlabel('percent of vote for Obama')
_ = plt.ylabel('number of counties')
plt.show()
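
The `_ =` assignments above simply discard what each call returns. As a minimal sketch of what `plt.hist` hands back, the example below captures the return value instead. The `dem_share` array is a hypothetical, synthetic stand-in for `df_swing['dem_share']` (not real election data), and the `Agg` backend is an assumption made only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for df_swing['dem_share']: synthetic percentages, NOT real data
rng = np.random.default_rng(42)
dem_share = np.clip(rng.normal(loc=45, scale=10, size=200), 0, 100)

# plt.hist returns the bin counts, the bin edges, and the drawn bar patches
counts, edges, patches = plt.hist(dem_share)
plt.xlabel('percent of vote for Obama')
plt.ylabel('number of counties')

n_bins = len(counts)        # Matplotlib uses 10 bins unless told otherwise
n_edges = len(edges)        # always one more edge than bins
total = int(counts.sum())   # every data point falls in exactly one bin
print(n_bins, n_edges, total)
```

Capturing `counts` and `edges` is useful when the numbers behind the plot are needed; when only the picture matters, assigning to `_` keeps the output clean.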

Always label your axes

Histograms with different binning

[Figure: histograms of the same data drawn with different bin choices]

Setting the bins of a histogram

# Bin edges every 10 percentage points from 0 to 100, giving 10 bins
bin_edges = [0, 10, 20, 30, 40, 50,
             60, 70, 80, 90, 100]
_ = plt.hist(df_swing['dem_share'], bins=bin_edges)
plt.show()
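
To see how values are assigned to explicit bin edges, here is a small sketch using `np.histogram`, which `plt.hist` relies on for its counting. Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. The `values` list is hypothetical and chosen only to land on bin boundaries.

```python
import numpy as np

bin_edges = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Hypothetical values picked to sit on bin boundaries; not real election data
values = [5, 10, 10, 95, 100]

# counts[i] is the number of values in [bin_edges[i], bin_edges[i + 1]),
# except the final bin, which is closed on the right: [90, 100]
counts, edges = np.histogram(values, bins=bin_edges)
print(counts.tolist())  # [1, 2, 0, 0, 0, 0, 0, 0, 0, 2]
```

Both 10s fall in the second bin rather than the first, and 100 is counted in the last bin rather than dropped.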

[Figure: histogram of dem_share drawn with the ten-point bin edges]

Setting the bins of a histogram

_ = plt.hist(df_swing['dem_share'], bins=20)
plt.show()
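
When `bins` is an integer, the bins are equal in width and span the range from the smallest to the largest value in the data. A minimal sketch of that rule, using NumPy's `histogram_bin_edges` and a hypothetical handful of values standing in for `df_swing['dem_share']`:

```python
import numpy as np

# Hypothetical values standing in for df_swing['dem_share']; not real election data
data = np.array([12.5, 30.0, 41.2, 47.9, 55.3, 62.8, 88.0])

# Edges NumPy would use for bins=20
edges = np.histogram_bin_edges(data, bins=20)

# The same edges built by hand: 21 evenly spaced points from min to max
manual_edges = np.linspace(data.min(), data.max(), 21)

same = bool(np.allclose(edges, manual_edges))
print(len(edges), same)
```

Because the edges depend on the data's minimum and maximum, they rarely fall on round numbers; pass an explicit list of edges when round boundaries matter.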

[Figure: histogram of dem_share drawn with 20 bins]

Seaborn

  • An excellent Matplotlib-based statistical data visualization package written by Michael Waskom

Setting Seaborn styling

import seaborn as sns
sns.set()  # apply Seaborn's default style to every Matplotlib plot that follows
_ = plt.hist(df_swing['dem_share'])
_ = plt.xlabel('percent of vote for Obama')
_ = plt.ylabel('number of counties')
plt.show()

A Seaborn-styled histogram

[Figure: the dem_share histogram with Seaborn styling applied]

Let's practice!
