Statistical significance

Hypothesis Testing in Python

James Chapman

Curriculum Manager, DataCamp

p-value recap

  • p-values measure how consistent the data are with the null hypothesis: the smaller the p-value, the stronger the evidence against it
  • Large p-value → fail to reject null hypothesis
  • Small p-value → reject null hypothesis
  • Where is the cutoff point?

Significance level

The significance level of a hypothesis test ($\alpha$) is the threshold point for "beyond a reasonable doubt"

  • Common values of $\alpha$ are 0.2, 0.1, 0.05, and 0.01
  • If $p \le \alpha$, reject $H_{0}$, else fail to reject $H_{0}$
  • $\alpha$ should be set prior to conducting the hypothesis test
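The decision rule above can be sketched as a small helper. The function name `decide` and its return strings are illustrative choices, not part of the course code.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p <= alpha, otherwise fail to reject."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# The same p-value can lead to different decisions at different thresholds,
# which is why alpha should be fixed before looking at the data.
print(decide(0.03, alpha=0.05))
print(decide(0.03, alpha=0.01))
```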

Calculating the p-value

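import numpy as np
from scipy.stats import norm

alpha = 0.05

prop_child_samp = (stack_overflow['age_first_code_cut'] == "child").mean()
prop_child_hyp = 0.35
# Standard error estimated from the bootstrap distribution
std_error = np.std(first_code_boot_distn, ddof=1)
z_score = (prop_child_samp - prop_child_hyp) / std_error
# Right-tailed test: area under the standard normal curve above z_score
p_value = 1 - norm.cdf(z_score, loc=0, scale=1)
3.1471479512323874e-05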
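The calculation above relies on `stack_overflow` and `first_code_boot_distn`, which are defined elsewhere in the course. As a rough, self-contained sketch of the same steps, the version below swaps in synthetic data: `is_child` is a hypothetical stand-in for the boolean column, and the 39% rate, sample size, and number of resamples are arbitrary assumptions, so the numbers it prints will not match the slide.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=42)

# Hypothetical stand-in for stack_overflow['age_first_code_cut'] == "child":
# 2,000 True/False values with True drawn at an assumed 39% rate.
is_child = rng.random(2000) < 0.39

# Bootstrap distribution of the sample proportion (resample with replacement)
boot_distn = np.array([
    rng.choice(is_child, size=is_child.size, replace=True).mean()
    for _ in range(1000)
])

prop_child_samp = is_child.mean()
prop_child_hyp = 0.35
std_error = np.std(boot_distn, ddof=1)
z_score = (prop_child_samp - prop_child_hyp) / std_error

# Right-tailed p-value; norm.sf(z) is the survival function, 1 - norm.cdf(z)
p_value = norm.sf(z_score)
print(z_score, p_value)
```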

Making a decision

alpha = 0.05

print(p_value)
3.1471479512323874e-05
p_value <= alpha
True

Reject $H_{0}$ in favor of $H_{A}$


Confidence intervals

For a significance level of $\alpha$, it's common to choose a confidence level of $1 - \alpha$

  • $\alpha=0.05$ → $95\%$ confidence interval
import numpy as np
lower = np.quantile(first_code_boot_distn, 0.025)
upper = np.quantile(first_code_boot_distn, 0.975)
print((lower, upper))
(0.37063246351172047, 0.41132242370632466)
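The quantile cutoffs 0.025 and 0.975 come from splitting $\alpha = 0.05$ evenly between the two tails. A minimal sketch of that relationship for any $\alpha$ follows; the helper name `percentile_interval` and the synthetic `demo_distn` are assumptions made for illustration.

```python
import numpy as np


def percentile_interval(boot_distn, alpha=0.05):
    """Return the (1 - alpha) percentile interval of a bootstrap distribution."""
    lower = np.quantile(boot_distn, alpha / 2)
    upper = np.quantile(boot_distn, 1 - alpha / 2)
    return lower, upper


# Synthetic stand-in for a bootstrap distribution of sample proportions
rng = np.random.default_rng(0)
demo_distn = rng.normal(loc=0.39, scale=0.01, size=5000)

lower, upper = percentile_interval(demo_distn, alpha=0.05)
print((lower, upper))
```

In the interval shown on the slide, the hypothesized proportion 0.35 lies below the lower bound of about 0.371, which agrees with the decision to reject $H_{0}$.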

Types of errors

|                    | Truly didn't commit crime | Truly committed crime |
|--------------------|---------------------------|-----------------------|
| Verdict not guilty | correct                   | they got away with it |
| Verdict guilty     | wrongful conviction       | correct               |

 

|                | actual $H_{0}$ | actual $H_{A}$ |
|----------------|----------------|----------------|
| chosen $H_{0}$ | correct        | false negative |
| chosen $H_{A}$ | false positive | correct        |

 

False positives are Type I errors; false negatives are Type II errors.
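One way to see what $\alpha$ controls is to simulate many studies in which $H_{0}$ is actually true and count how often the test rejects it anyway. The sketch below is a simulation under stated assumptions, not the course's method: it uses a one-sample proportion z-test with the standard error computed under $H_{0}$ instead of a bootstrap, and the sample size and number of simulations are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
alpha, p0, n, n_sims = 0.05, 0.35, 2000, 10_000

# Sample proportions from n_sims studies in which H0 (p = p0) is true
samp_props = rng.binomial(n, p0, size=n_sims) / n

# Right-tailed z-test for each simulated study
std_error = np.sqrt(p0 * (1 - p0) / n)
p_values = norm.sf((samp_props - p0) / std_error)

# Every rejection here is a false positive (Type I error)
false_positive_rate = np.mean(p_values <= alpha)
print(false_positive_rate)  # expected to land close to alpha
```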


Possible errors in our example

If $p \le \alpha$, we reject $H_{0}$:

  • We risk a false positive (Type I error): concluding that data scientists started coding as children at a higher rate than the hypothesized 35% when they actually didn't

If $p > \alpha$, we fail to reject $H_{0}$:

  • We risk a false negative (Type II error): failing to detect a higher rate when data scientists actually did start coding as children at a higher rate than 35%
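Unlike the Type I error rate, the Type II error rate depends on how far the true proportion is from the hypothesized one. The sketch below estimates it by simulation for two assumed true proportions; the function name `type_ii_rate`, the true values 0.37 and 0.40, and the sample size are hypothetical, and the test used is a plain proportion z-test, not the bootstrap approach from the slides.

```python
import numpy as np
from scipy.stats import norm


def type_ii_rate(p_true, p0=0.35, n=2000, alpha=0.05, n_sims=10_000, seed=2):
    """Estimate how often a right-tailed z-test fails to reject H0: p = p0
    when the true proportion is p_true (> p0)."""
    rng = np.random.default_rng(seed)
    samp_props = rng.binomial(n, p_true, size=n_sims) / n
    std_error = np.sqrt(p0 * (1 - p0) / n)
    p_values = norm.sf((samp_props - p0) / std_error)
    # Failing to reject while H_A is true is a false negative
    return np.mean(p_values > alpha)


print(type_ii_rate(0.37))  # small true difference: misses are common
print(type_ii_rate(0.40))  # larger true difference: misses are rare
```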

Let's practice!
