Introduction to summary statistics: The sample mean and median

Statistical Thinking in Python (Part 1)

Justin Bois

Teaching Professor at the California Institute of Technology

2008 US swing state election results

[Figure: 2008 US swing state election results]

1 Data retrieved from Data.gov (https://www.data.gov/)

Mean vote percentage

import numpy as np
np.mean(dem_share_PA)
45.476417910447765

$$\text{mean} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i$$
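
The formula can be checked against np.mean with a short sketch. The values below are made up for illustration (they are not the Pennsylvania data), and manual_mean is a hypothetical helper name, not part of NumPy.

```python
import numpy as np

# Hypothetical vote percentages (illustrative values, not the election data)
x = np.array([40.0, 45.0, 50.0, 55.0])

def manual_mean(values):
    """Add up all the values, then divide by how many there are."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)

print(manual_mean(x))  # 47.5
print(np.mean(x))      # 47.5
```

Both lines print the same number because np.mean computes the same sum divided by n.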

Outliers

  • Data points whose values are far greater or far less than most of the rest of the data
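
A minimal sketch of how one outlier can pull the mean, assuming a small made-up data set (the numbers are hypothetical, not the Utah results):

```python
import numpy as np

# Hypothetical data: four typical values, then the same values plus one extreme value
typical = np.array([20.0, 22.0, 24.0, 26.0])
with_outlier = np.append(typical, 58.0)

mean_typical = np.mean(typical)            # 23.0
mean_with_outlier = np.mean(with_outlier)  # 30.0

print(mean_typical, mean_with_outlier)
```

With the outlier included, the mean is larger than every one of the four typical values.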

2008 Utah election results

[Figure: 2008 Utah election results]

1 Data retrieved from Data.gov (https://www.data.gov/)

The median

  • The middle value of a data set when its values are sorted
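
The definition can be sketched in plain Python. Here manual_median is a hypothetical helper and the inputs are made-up numbers. For an even number of values there is no single middle value, so this sketch averages the two middle ones, which is also the convention np.median follows.

```python
def manual_median(values):
    """Sort the values and return the middle one."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value
        return ordered[mid]
    # Even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(manual_median([24.0, 20.0, 58.0, 22.0, 26.0]))  # 24.0
print(manual_median([24.0, 20.0, 22.0, 26.0]))        # 23.0
```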

Computing the median

np.median(dem_share_UT)
22.469999999999999
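
A short sketch of why the median is useful when outliers are present, assuming made-up numbers rather than the Utah data: making the largest value ten times more extreme leaves np.median unchanged, while np.mean moves a lot.

```python
import numpy as np

# Hypothetical data with one large value
data = np.array([18.0, 21.0, 23.0, 25.0, 63.0])

# Same data, but with the largest value made far more extreme
more_extreme = data.copy()
more_extreme[-1] = 630.0

median_before = np.median(data)          # 23.0
median_after = np.median(more_extreme)   # 23.0, the middle value is unchanged

mean_before = np.mean(data)              # 30.0
mean_after = np.mean(more_extreme)       # much larger than before

print(median_before, median_after)
print(mean_before, mean_after)
```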

Let's practice!
