Test statistics and p-values

Statistical Thinking in Python (Part 2)

Justin Bois

Lecturer at the California Institute of Technology

Are OH and PA different?

ch3-2.002.png

¹ Data retrieved from Data.gov (https://www.data.gov/)

Hypothesis testing

Assessment of how reasonable the observed data are, assuming a hypothesis is true

Test statistic

A single number that can be computed from the observed data and from data you simulate under the null hypothesis
It serves as a basis of comparison between the observed data and the simulated data

np.mean(perm_sample_PA) - np.mean(perm_sample_OH)

1.122220149253728

np.mean(dem_share_PA) - np.mean(dem_share_OH)  # original data

1.1582360922659518
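
The two numbers above are the same test statistic (difference of means) evaluated on one permutation sample and on the original data. As a minimal sketch of how such a permutation sample might be produced, assuming the null hypothesis is that both states share one distribution: the arrays `toy_PA` and `toy_OH` hold made-up placeholder values, not the election data, and `permutation_sample` and `diff_of_means` are hypothetical helper names chosen for illustration.

```python
import numpy as np


def permutation_sample(data_1, data_2, rng):
    """Pool both data sets, shuffle, and split back into the original sizes."""
    pooled = np.concatenate((data_1, data_2))
    shuffled = rng.permutation(pooled)
    return shuffled[:len(data_1)], shuffled[len(data_1):]


def diff_of_means(data_1, data_2):
    """Test statistic: difference between the two sample means."""
    return np.mean(data_1) - np.mean(data_2)


# Placeholder values for illustration only (NOT the real vote shares).
toy_PA = np.array([45.2, 51.3, 38.9, 47.0, 55.1, 42.6])
toy_OH = np.array([44.1, 49.8, 40.2, 43.5, 52.7])

rng = np.random.default_rng(42)

# One permutation sample, simulated under the null hypothesis.
perm_PA, perm_OH = permutation_sample(toy_PA, toy_OH, rng)

# The same statistic computed on the simulated and on the observed data.
perm_diff = diff_of_means(perm_PA, perm_OH)
observed_diff = diff_of_means(toy_PA, toy_OH)

print("permutation sample:", perm_diff)
print("observed data:     ", observed_diff)
```

Shuffling the pooled values discards the state labels, which is what "identically distributed" would imply if the null hypothesis were true.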

ch3-2.014.png


ch3-2.016.png


p-value

The probability of obtaining a value of your test statistic that is at least as extreme as what was observed, under the assumption that the null hypothesis is true
NOT the probability that the null hypothesis is true
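
The definition above can be estimated by simulation: generate many permutation replicates of the test statistic and take the fraction that is at least as extreme as the observed value. The following is a sketch under stated assumptions, not necessarily the course's own code. It reads "at least as extreme" as one-sided (replicate greater than or equal to the observed difference), uses made-up placeholder arrays `toy_a` and `toy_b`, and introduces the hypothetical helpers `draw_perm_reps` and `diff_of_means`.

```python
import numpy as np


def diff_of_means(data_1, data_2):
    """Test statistic: difference between the two sample means."""
    return np.mean(data_1) - np.mean(data_2)


def draw_perm_reps(data_1, data_2, func, size, rng):
    """Compute `size` permutation replicates of the statistic `func`."""
    pooled = np.concatenate((data_1, data_2))
    n_1 = len(data_1)
    reps = np.empty(size)
    for i in range(size):
        # Shuffle the pooled data and split it into two relabeled groups.
        shuffled = rng.permutation(pooled)
        reps[i] = func(shuffled[:n_1], shuffled[n_1:])
    return reps


# Placeholder values for illustration only (NOT the real vote shares).
toy_a = np.array([45.2, 51.3, 38.9, 47.0, 55.1, 42.6])
toy_b = np.array([44.1, 49.8, 40.2, 43.5, 52.7])

rng = np.random.default_rng(0)

observed = diff_of_means(toy_a, toy_b)
perm_reps = draw_perm_reps(toy_a, toy_b, diff_of_means, 10000, rng)

# One-sided p-value: fraction of replicates at least as large as observed.
p_value = np.mean(perm_reps >= observed)
print("p-value:", p_value)
```

A small p-value says the observed statistic would be unusual if the null hypothesis were true; it does not give the probability that the null hypothesis itself is true.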
