Test statistics and p-values

Statistical Thinking in Python (Part 2)

Justin Bois

Lecturer at the California Institute of Technology

Are OH and PA different?

ch3-2.002.png

1 Data retrieved from Data.gov (https://www.data.gov/)

Hypothesis testing

  • Assessment of how reasonable the observed data are, assuming a hypothesis is true

Test statistic

  • A single number that can be computed from observed data and from data you simulate under the null hypothesis
  • It serves as a basis of comparison between the two
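As a minimal sketch of what such a test statistic could look like in code, the function below computes a difference of means, the same quantity the later slides use. The function name `diff_of_means` and the two arrays are assumptions for illustration; the numbers are made up and are not the election data.

```python
import numpy as np

def diff_of_means(data_1, data_2):
    """Test statistic: mean of data_1 minus mean of data_2."""
    return np.mean(data_1) - np.mean(data_2)

# Made-up illustrative values, NOT the PA/OH vote shares from the slides
sample_a = np.array([52.0, 48.5, 60.1, 45.3])
sample_b = np.array([50.2, 47.9, 55.0, 44.1])

# The same function can be applied to observed data and to simulated data,
# which is what makes it a basis of comparison between the two
observed = diff_of_means(sample_a, sample_b)
print(observed)
```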

Permutation replicate

np.mean(perm_sample_PA) - np.mean(perm_sample_OH)
1.122220149253728
np.mean(dem_share_PA) - np.mean(dem_share_OH) # orig. data
1.1582360922659518
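The slide does not show how `perm_sample_PA` and `perm_sample_OH` were produced. One common way, sketched here under that assumption, is to pool both data sets, shuffle the pooled values, and split them back into two groups of the original sizes. The helper name `permutation_sample`, the seed, and the arrays are hypothetical; the values are synthetic, not the election data.

```python
import numpy as np

def permutation_sample(data_1, data_2, rng):
    """Pool two data sets, shuffle, and split back into the original sizes."""
    combined = np.concatenate((data_1, data_2))
    permuted = rng.permutation(combined)
    return permuted[:len(data_1)], permuted[len(data_1):]

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

# Synthetic stand-ins for the two groups (NOT the real vote shares)
group_a = np.array([45.2, 51.7, 38.9, 60.3, 47.5])
group_b = np.array([44.1, 49.8, 55.6, 41.0])

perm_a, perm_b = permutation_sample(group_a, group_b, rng)

# One permutation replicate: the test statistic computed on a permutation sample
perm_replicate = np.mean(perm_a) - np.mean(perm_b)
print(perm_replicate)
```

Shuffling the pooled data simulates the null hypothesis that both groups come from the same distribution, so the group labels carry no information.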

Mean vote difference under null hypothesis

ch3-2.014.png

1 Data retrieved from Data.gov (https://www.data.gov/)

Mean vote difference under null hypothesis

ch3-2.016.png

1 Data retrieved from Data.gov (https://www.data.gov/)

p-value

  • The probability of obtaining a value of your test statistic that is at least as extreme as what was observed, under the assumption the null hypothesis is true
  • NOT the probability that the null hypothesis is true
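The definition above can be turned into a short computation: draw many permutation replicates, then take the fraction that are at least as large as the observed value. This is a sketch under stated assumptions: the helper `draw_perm_reps`, the replicate count, and the seed are hypothetical, the data are synthetic, and "at least as extreme" is read here as one-sided (greater than or equal to the observed difference).

```python
import numpy as np

def draw_perm_reps(data_1, data_2, n_reps, rng):
    """Draw n_reps permutation replicates of the difference of means."""
    combined = np.concatenate((data_1, data_2))
    n_1 = len(data_1)
    reps = np.empty(n_reps)
    for i in range(n_reps):
        permuted = rng.permutation(combined)
        reps[i] = np.mean(permuted[:n_1]) - np.mean(permuted[n_1:])
    return reps

rng = np.random.default_rng(0)  # arbitrary seed

# Synthetic data (NOT the election data): two groups from the same distribution
group_a = rng.normal(50.0, 10.0, size=60)
group_b = rng.normal(50.0, 10.0, size=80)

observed = np.mean(group_a) - np.mean(group_b)
perm_reps = draw_perm_reps(group_a, group_b, 2000, rng)

# p-value: fraction of replicates at least as extreme as the observed value
p_value = np.mean(perm_reps >= observed)
print(p_value)
```

The result is an estimate of the probability of the data's test statistic under the null hypothesis, not the probability that the null hypothesis itself is true.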

Statistical significance

  • Determined by how small the p-value is

Null hypothesis significance testing (NHST)

  • Another name for what we are doing in this chapter

statistical significance ≠ practical significance


Let's practice!
