Hypothesis Testing in R
Richie Cotton
Data Evangelist at DataCamp
The samples are random subsets of larger populations.
Each observation (row) in the dataset is independent.
The sample is large enough to mitigate uncertainty and for the Central Limit Theorem to apply.
One sample: $n \ge 30$
$n$: sample size
Two samples: $n_{1} \ge 30, n_{2} \ge 30$
$n_{i}$: sample size for group $i$
Paired samples: number of rows (pairs) in your data $\ge 30$
ANOVA: $n_{i} \ge 30$ for all values of $i$
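The group-size conditions above can be checked directly from the data before running a test. The following is a minimal sketch in base R; the data frame `survey` and its columns `group` and `value` are hypothetical names invented for illustration.

```r
# Hypothetical data: a numeric response and a grouping variable
set.seed(42)
survey <- data.frame(
  group = rep(c("a", "b", "c"), times = c(45, 38, 27)),
  value = rnorm(110)
)

# Number of observations in each group
group_sizes <- table(survey$group)

# TRUE only if every group has at least 30 observations
sizes_ok <- all(group_sizes >= 30)

print(group_sizes)
print(sizes_ok)
```

Here group `c` has only 27 rows, so the check fails for this made-up dataset.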
One-sample proportion test: at least 10 successes and at least 10 failures in the sample.
$n \times \hat{p} \ge 10$
$n \times (1 - \hat{p}) \ge 10$
$n$: sample size
$\hat{p}$: proportion of successes in sample
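Since $n \times \hat{p}$ is simply the number of successes and $n \times (1 - \hat{p})$ the number of failures, the two conditions amount to counting. A small sketch in base R, assuming a hypothetical logical vector `outcomes` where `TRUE` marks a success:

```r
# Hypothetical sample: 12 successes and 88 failures
outcomes <- rep(c(TRUE, FALSE), times = c(12, 88))

n <- length(outcomes)   # sample size
p_hat <- mean(outcomes) # proportion of successes in the sample

# n * p_hat and n * (1 - p_hat), computed as exact counts
n_successes <- sum(outcomes)
n_failures <- sum(!outcomes)

# Both counts must reach 10 for the condition to hold
large_enough <- n_successes >= 10 && n_failures >= 10
print(large_enough)
```

Counting with `sum()` rather than multiplying `n` by `p_hat` avoids floating-point rounding when a count sits exactly on the threshold.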
Two-sample proportion test: at least 10 successes and at least 10 failures in each sample.
$n_{1} \times \hat{p}_{1} \ge 10$
$n_{2} \times \hat{p}_{2} \ge 10$
$n_{1} \times (1 - \hat{p}_{1}) \ge 10$
$n_{2} \times (1 - \hat{p}_{2}) \ge 10$
Chi-square test: at least 5 successes and at least 5 failures in each group.
$n_{i} \times \hat{p}_{i} \ge 5$ for all values of $i$
$n_{i} \times (1 - \hat{p}_{i}) \ge 5$ for all values of $i$
$n_{i}$: sample size for group $i$
$\hat{p}_{i}$: proportion of successes in sample group $i$
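The per-group conditions, with a threshold of 10 or of 5, can be read off a table of counts by group and outcome. A minimal sketch in base R follows; the data frame `responses` and the helper `meets_min_count()` are hypothetical names introduced here, not part of any package.

```r
# Hypothetical data: one row per observation, with group and outcome
responses <- data.frame(
  group = rep(c("a", "b", "c"), times = c(40, 40, 20)),
  success = c(
    rep(c(TRUE, FALSE), times = c(25, 15)), # group a
    rep(c(TRUE, FALSE), times = c(12, 28)), # group b
    rep(c(TRUE, FALSE), times = c(4, 16))   # group c
  )
)

# Rows are groups; columns are failures ("FALSE") and successes ("TRUE")
counts <- table(responses$group, responses$success)

# Hypothetical helper: TRUE if every cell reaches the minimum count
meets_min_count <- function(counts, min_count) {
  all(counts >= min_count)
}

# Threshold of 5 across all groups
print(meets_min_count(counts, 5))

# Threshold of 10 for groups a and b only
print(meets_min_count(counts[c("a", "b"), ], 10))
```

In this made-up data, group `c` has only 4 successes, so the threshold-of-5 check fails, while groups `a` and `b` on their own pass the threshold-of-10 check.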
If the bootstrap distribution doesn't look normal, assumptions likely aren't valid.
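One way to run this sanity check is to build a bootstrap distribution and inspect it visually. The sketch below uses base R only, which is an assumption made for self-containment rather than the tooling used elsewhere in the course. The sample `x`, the statistic (the mean), and the number of replicates are hypothetical choices.

```r
set.seed(123)

# Hypothetical right-skewed sample
x <- rexp(50, rate = 1)

# Resample with replacement and recompute the mean each time
n_reps <- 2000
boot_means <- replicate(
  n_reps,
  mean(sample(x, size = length(x), replace = TRUE))
)

# A roughly bell-shaped histogram supports the normality assumption
hist(boot_means, breaks = 30,
     main = "Bootstrap distribution of the mean",
     xlab = "Resampled mean")

# Points close to the reference line also indicate approximate normality
qqnorm(boot_means)
qqline(boot_means)
```

Strong skew or heavy tails in either plot would suggest revisiting the randomness, independence, and sample size assumptions before trusting the test.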