Hypothesis Testing in Python
James Chapman
Curriculum Manager, DataCamp
$H_{0}$: The mean compensation (in USD) is the same for those that coded first as a child and those that coded first as an adult
$H_{A}$: The mean compensation (in USD) is greater for those that coded first as a child compared to those that coded first as an adult
Use a right-tailed test
$\alpha = 0.1$
If $p \le \alpha$ then reject $H_{0}$.
from scipy.stats import norm
1 - norm.cdf(z_score)
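As a minimal sketch of the right-tailed calculation above, the snippet below uses a hypothetical `z_score` (chosen for illustration only, not computed from the survey data) and compares the resulting p-value with $\alpha = 0.1$. `norm.sf` is SciPy's survival function, which gives the same upper-tail area as `1 - norm.cdf`.

```python
from scipy.stats import norm

# Hypothetical z-score, used only for illustration (not from the course data)
z_score = 1.645

# Right-tailed p-value: area under the standard normal curve above z_score
p_value = 1 - norm.cdf(z_score)

# Equivalent upper-tail calculation using the survival function
p_value_sf = norm.sf(z_score)

# Decision rule: reject H0 when p <= alpha
alpha = 0.1
reject_h0 = p_value <= alpha
print(p_value, reject_h0)
```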
$SE(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}) \approx \sqrt{\dfrac{s_{\text{child}}^2}{n_{\text{child}}} + \dfrac{s_{\text{adult}}^2}{n_{\text{adult}}}}$
z-statistic: needed when using one sample statistic to estimate a population parameter
t-statistic: needed when using multiple sample statistics to estimate a population parameter (here, the sample standard deviations are used alongside the sample means)
import numpy as np
numerator = xbar_child - xbar_adult
denominator = np.sqrt(s_child ** 2 / n_child + s_adult ** 2 / n_adult)
t_stat = numerator / denominator
1.8699313316221844
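The same calculation can be sketched starting from raw samples rather than precomputed summary statistics. The helper `two_sample_t_stat` and the synthetic arrays below are assumptions made for illustration; they are not the survey data. As a sanity check, the result is compared with `scipy.stats.ttest_ind(..., equal_var=False)`, whose statistic uses the same unpooled standard error.

```python
import numpy as np
from scipy.stats import ttest_ind

def two_sample_t_stat(sample_a, sample_b):
    """t-statistic for a difference in means with an unpooled standard error."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    numerator = a.mean() - b.mean()
    # ddof=1 gives the sample variance s^2
    denominator = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return numerator / denominator

# Synthetic data, generated only to exercise the function
rng = np.random.default_rng(0)
child = rng.normal(loc=105.0, scale=20.0, size=200)
adult = rng.normal(loc=100.0, scale=25.0, size=300)

t_stat = two_sample_t_stat(child, adult)
reference = ttest_ind(child, adult, equal_var=False).statistic
print(t_stat, reference)
```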
degrees_of_freedom = n_child + n_adult - 2
2259
from scipy.stats import t
1 - t.cdf(t_stat, df=degrees_of_freedom)
0.030811302165157595
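To close the loop, the sketch below plugs in the test statistic and degrees of freedom reported above and applies the decision rule with $\alpha = 0.1$. It uses `t.sf`, SciPy's survival function, in place of `1 - t.cdf`; the two give the same upper-tail area.

```python
from scipy.stats import t

# Values reported above
t_stat = 1.8699313316221844
degrees_of_freedom = 2259
alpha = 0.1

# Right-tailed p-value from the t-distribution
p_value = t.sf(t_stat, df=degrees_of_freedom)

# Decision rule: reject H0 when p <= alpha
reject_h0 = p_value <= alpha
print(p_value, reject_h0)
```

Because the p-value (about 0.031) is below $\alpha = 0.1$, $H_{0}$ is rejected in favor of $H_{A}$.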