Plot all of your data: ECDFs

Statistical Thinking in Python (Part 1)

Justin Bois

Teaching Professor at the California Institute of Technology

2008 US swing state election results

ch1-4_v2.003.png

1 Data retrieved from Data.gov (https://www.data.gov/)

2008 US election results: East and West

ch1-4_v2.005.png


Empirical cumulative distribution function (ECDF)

ch1-4_v2.007.png


Empirical cumulative distribution function (ECDF)

ch1-4_v2.008.png


Empirical cumulative distribution function (ECDF)

ch1-4_v2.009.png


Making an ECDF

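import numpy as np
import matplotlib.pyplot as plt

# x-values: the vote shares, sorted (df_swing is assumed to be already loaded)
x = np.sort(df_swing['dem_share'])
# y-values: evenly spaced fractions from 1/n up to 1
y = np.arange(1, len(x)+1) / len(x)
_ = plt.plot(x, y, marker='.', linestyle='none')
_ = plt.xlabel('percent of vote for Obama')
_ = plt.ylabel('ECDF')
plt.margins(0.02) # Keeps data off plot edges
plt.show()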
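
The sort-and-count steps above can be wrapped in a small reusable function. This is a minimal sketch, not part of the slides: the name `ecdf` and the sample values are hypothetical, and the numbers are made up purely for illustration.

```python
import numpy as np


def ecdf(data):
    """Return (x, y) for the ECDF of a one-dimensional data set.

    Hypothetical helper, not from the slides.
    """
    # x-values: the observations in ascending order
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    # y[i] is the fraction of observations less than or equal to x[i]
    y = np.arange(1, n + 1) / n
    return x, y


# Made-up vote shares (percent), for illustration only
sample = [61.0, 42.5, 48.3, 55.1, 39.9]
x, y = ecdf(sample)
print(x)
print(y)
```

Each point (x, y) reads as "a fraction y of the observations is at most x", so the last y-value is always 1.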

2008 US swing state election ECDF

ch1-4_v2.022.png


2008 US swing state election ECDFs

ch1-4_v2.024.png
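
Several ECDFs can share one set of axes, which makes it easy to compare groups. The following is a sketch under stated assumptions: the group labels and the randomly generated vote shares are synthetic stand-ins, not the election data, and `ecdf` is a hypothetical helper.

```python
import numpy as np
import matplotlib.pyplot as plt


def ecdf(data):
    """Return sorted values and cumulative fractions (hypothetical helper)."""
    x = np.sort(np.asarray(data, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Synthetic stand-in data: random vote shares for three made-up groups
rng = np.random.default_rng(seed=0)
shares_by_group = {
    "Group A": rng.normal(45, 10, size=60).clip(0, 100),
    "Group B": rng.normal(42, 8, size=80).clip(0, 100),
    "Group C": rng.normal(48, 12, size=70).clip(0, 100),
}

fig, ax = plt.subplots()
for label, shares in shares_by_group.items():
    x, y = ecdf(shares)
    # One set of points per group, drawn on the same axes
    ax.plot(x, y, marker=".", linestyle="none", label=label)

ax.set_xlabel("percent of vote")
ax.set_ylabel("ECDF")
ax.margins(0.02)  # Keeps data off plot edges
ax.legend(loc="lower right")
plt.show()
```

A curve that sits further to the right belongs to a group whose values tend to be larger.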


Let's practice!
