p-values

Hypothesis Testing in Python

James Chapman

Curriculum Manager, DataCamp

Criminal trials

  • Two possible true states:
    1. Defendant committed the crime
    2. Defendant did not commit the crime
  • Two possible verdicts:
    1. Guilty
    2. Not guilty
  • Initially the defendant is assumed to be not guilty
  • Prosecution must present evidence "beyond reasonable doubt" for a guilty verdict

Age of first programming experience

  • age_first_code_cut classifies when Stack Overflow user first started programming
    • "adult" means they started at 14 or older
    • "child" means they started before 14
  • Previous research: 35% of software developers started programming as children
  • Is there evidence that a greater proportion of data scientists started programming as children?

Definitions

A hypothesis is a statement about an unknown population parameter

A hypothesis test is a test of two competing hypotheses

  • The null hypothesis ($H_{0}$) is the existing idea

  • The alternative hypothesis ($H_{A}$) is the new "challenger" idea of the researcher

For our problem:

  • $H_{0}$: The proportion of data scientists starting programming as children is 35%
  • $H_{A}$: The proportion of data scientists starting programming as children is greater than 35%
Note: "Naught" is British English for "zero". For historical reasons, "H-naught" is the international convention for pronouncing the null hypothesis.

Criminal trials vs. hypothesis testing

  • Either $H_{A}$ or $H_{0}$ is true (not both)
  • Initially, $H_{0}$ is assumed to be true
  • The test ends in either "reject $H_{0}$" or "fail to reject $H_{0}$"
  • If the evidence from the sample is "significant" that $H_{A}$ is true, reject $H_{0}$; otherwise, fail to reject $H_{0}$

Significance level is "beyond a reasonable doubt" for hypothesis testing
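The decision rule above can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the significance level `alpha = 0.05` is an assumed, commonly used threshold, and the p-value is the one this deck computes for the Stack Overflow example.

```python
# Assumption: a significance level of 0.05 (the slides do not fix a value here)
alpha = 0.05

# p-value computed in this deck for the Stack Overflow example
p_value = 3.1471479512323874e-05

# The test ends in one of two outcomes; H0 is never "accepted"
if p_value <= alpha:
    decision = "reject H0"
else:
    decision = "fail to reject H0"

print(decision)
```

A smaller `alpha` corresponds to a stricter standard of evidence, in the same way that "beyond a reasonable doubt" is a stricter standard than "more likely than not".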


One-tailed and two-tailed tests

[Figure: density plot of the PDF of the standard normal distribution, with the left and right tails highlighted in red]

Hypothesis tests check if the sample statistics lie in the tails of the null distribution

Test                              Tails
alternative different from null   two-tailed
alternative greater than null     right-tailed
alternative less than null        left-tailed

 

$H_{A}$: The proportion of data scientists starting programming as children is greater than 35%

This is a right-tailed test


p-values

p-value: the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true

  • Large p-value: the data are consistent with $H_{0}$ (weak evidence against it)
    • Statistic likely not in the tail of the null distribution
  • Small p-value, strong evidence against $H_{0}$
    • Statistic likely in the tail of the null distribution
  • "p" in p-value → probability
  • "small" means "close to zero"
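To make the "tail of the null distribution" idea concrete, the sketch below estimates a right-tailed p-value by simulation and compares it with the exact normal tail area. The observed z-score of 2.0, the seed, and the number of draws are hypothetical values chosen only for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=0)

# Hypothetical observed test statistic (not from the Stack Overflow data)
z_observed = 2.0

# Draws from the null distribution, assumed here to be standard normal
null_draws = rng.standard_normal(200_000)

# Right-tailed p-value: share of null draws at least as large as the observation
p_simulated = np.mean(null_draws >= z_observed)

# Exact tail area from the normal CDF
p_exact = 1 - norm.cdf(z_observed)

print(p_simulated, p_exact)
```

The two numbers should agree closely; the further the observed statistic sits in the tail, the smaller both become.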

Calculating the z-score

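import numpy as np

# Sample proportion of users who started programming as children
prop_child_samp = (stack_overflow['age_first_code_cut'] == "child").mean()
0.39141972578505085
# Hypothesized proportion under the null hypothesis
prop_child_hyp = 0.35
# Standard error: standard deviation of the bootstrap distribution
std_error = np.std(first_code_boot_distn, ddof=1)
0.010351057228878566
# Standardized difference between the sample and hypothesized proportions
z_score = (prop_child_samp - prop_child_hyp) / std_error
4.001497129152506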
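The slide uses first_code_boot_distn without showing how it was built. One plausible way to generate such a bootstrap distribution is sketched below. Everything here is an assumption for illustration: the data are synthetic (2,000 rows with roughly 39% "child"), the names carry a `_demo` suffix to mark them as hypothetical, and 1,000 resamples is an arbitrary choice, so the numbers will not match the slide exactly.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Synthetic stand-in for (stack_overflow['age_first_code_cut'] == "child")
is_child_demo = rng.random(2000) < 0.39

# Bootstrap: resample with replacement, recompute the proportion each time
boot_distn_demo = [
    rng.choice(is_child_demo, size=is_child_demo.size, replace=True).mean()
    for _ in range(1000)
]

# Standard error is the standard deviation of the bootstrap distribution
std_error_demo = np.std(boot_distn_demo, ddof=1)

# z-score against the hypothesized proportion of 0.35
z_score_demo = (is_child_demo.mean() - 0.35) / std_error_demo

print(std_error_demo, z_score_demo)
```

With a sample of this size, the bootstrap standard error should land near the analytical value sqrt(p(1 - p) / n), which is about 0.011 here.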

Calculating the p-value

  • norm.cdf() is the normal CDF from scipy.stats.
  • Left-tailed test → use norm.cdf().
  • Right-tailed test → use 1 - norm.cdf().

 

from scipy.stats import norm
1 - norm.cdf(z_score, loc=0, scale=1)
3.1471479512323874e-05
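The left-tailed and right-tailed rules above, plus the two-tailed case from the earlier table, can be wrapped in one helper. This is a sketch rather than anything defined in the course: the function name `p_value_from_z` and its `alternative` labels are hypothetical. It uses norm.sf(), the survival function in scipy.stats, which equals 1 - norm.cdf() but tends to be more accurate for very small tail areas.

```python
from scipy.stats import norm

def p_value_from_z(z_score, alternative="greater"):
    """Return the p-value for a z-score under a standard normal null.

    Hypothetical helper; `alternative` is one of "greater" (right-tailed),
    "less" (left-tailed), or "two-sided" (two-tailed).
    """
    if alternative == "greater":
        return norm.sf(z_score)           # same as 1 - norm.cdf(z_score)
    if alternative == "less":
        return norm.cdf(z_score)
    if alternative == "two-sided":
        return 2 * norm.sf(abs(z_score))  # both tails, by symmetry
    raise ValueError(f"unknown alternative: {alternative!r}")

# z-score from the Stack Overflow example in this deck
z_score = 4.001497129152506
p_right = p_value_from_z(z_score, "greater")
p_two_sided = p_value_from_z(z_score, "two-sided")

print(p_right, p_two_sided)
```

For the right-tailed test in this deck, the helper reproduces the p-value shown above; the two-sided value is exactly twice as large.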

Let's practice!
