Hypothesis Testing in R
Richie Cotton
Data Evangelist at DataCamp
The significance level of a hypothesis test ($\alpha$) is the threshold point for "beyond a reasonable doubt".
Common values of $\alpha$ are 0.1, 0.05, and 0.01.
alpha <- 0.05
prop_child_samp <- stack_overflow %>%
  summarize(
    point_estimate = mean(age_first_code_cut == "child")
  ) %>%
  pull(point_estimate)
prop_child_hyp <- 0.35
std_error <- 0.0096028
z_score <- (prop_child_samp - prop_child_hyp) / std_error
p_value <- pnorm(z_score, lower.tail = FALSE)
3.818e-05
p_value <= alpha
TRUE
p_value is less than or equal to alpha, so reject $H_{0}$ and accept $H_{A}$.
The proportion of data scientists starting programming as children is greater than 35%.
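The test above can be sketched end to end without the stack_overflow dataset. This is a minimal illustration, not the course's exact computation: the vector started_as_child is simulated, hypothetical data, and the standard error is assumed to come from a bootstrap distribution of the sample proportion.

```r
# Hypothetical data: 1 = started coding as a child, 0 = did not.
# These values are simulated for illustration; they are not the survey data.
set.seed(42)
n_obs <- 2000
started_as_child <- rbinom(n_obs, size = 1, prob = 0.39)

alpha <- 0.05            # significance level, chosen before testing
prop_child_hyp <- 0.35   # hypothesized proportion under H0
prop_child_samp <- mean(started_as_child)

# Assumed approach: estimate the standard error as the standard
# deviation of bootstrapped sample proportions.
boot_props <- replicate(
  2000,
  mean(sample(started_as_child, replace = TRUE))
)
std_error <- sd(boot_props)

# Right-tailed z-test: is the proportion greater than the hypothesized value?
z_score <- (prop_child_samp - prop_child_hyp) / std_error
p_value <- pnorm(z_score, lower.tail = FALSE)

# Decision rule: reject H0 when the p-value is at or below alpha.
reject_null <- p_value <= alpha
```

Setting lower.tail = FALSE gives the area to the right of the z-score, which matches an alternative hypothesis of "greater than".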
For a significance level of 0.05, it's common to choose a confidence level of 1 - 0.05 = 0.95, that is, a 95% confidence interval.
conf_int <- first_code_boot_distn %>%
  summarize(
    lower = quantile(first_code_child_rate, 0.025),
    upper = quantile(first_code_child_rate, 0.975)
  )
# A tibble: 1 x 2
  lower upper
  <dbl> <dbl>
1 0.369 0.407
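The quantile bounds 0.025 and 0.975 follow from splitting $\alpha$ equally between the two tails. The sketch below derives them from alpha and checks whether the hypothesized proportion falls inside the interval printed above. The interval is two-sided while the test is right-tailed, so treat this as a consistency check rather than an equivalent test.

```r
alpha <- 0.05

# Equal-tailed interval: put alpha / 2 in each tail.
quantile_probs <- c(alpha / 2, 1 - alpha / 2)

# Interval bounds copied from the output above.
conf_int <- c(lower = 0.369, upper = 0.407)
prop_child_hyp <- 0.35

# Is the hypothesized proportion inside the confidence interval?
hyp_in_interval <- prop_child_hyp >= conf_int[["lower"]] &&
  prop_child_hyp <= conf_int[["upper"]]
hyp_in_interval  # FALSE: 0.35 lies below the lower bound of 0.369
```

Because 0.35 falls below the lower bound, the interval agrees with the decision to reject $H_{0}$.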
| | Truly didn't commit crime | Truly committed crime |
|---|---|---|
| Verdict not guilty | correct | they got away with it |
| Verdict guilty | wrongful conviction | correct |

| | actual $H_{0}$ | actual $H_{A}$ |
|---|---|---|
| chosen $H_{0}$ | correct | false negative |
| chosen $H_{A}$ | false positive | correct |
False positives are Type I errors; false negatives are Type II errors.
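A short simulation can make the link between $\alpha$ and Type I errors concrete: when $H_{0}$ is actually true, a test at significance level $\alpha$ should reject it in roughly a fraction $\alpha$ of samples. This sketch makes several assumptions: the sample size and number of simulations are arbitrary, and the standard error uses the analytic formula for a proportion under $H_{0}$ rather than a bootstrap.

```r
set.seed(123)
alpha <- 0.05
prop_hyp <- 0.35   # H0 is true in this simulation
n_obs <- 1000      # hypothetical sample size
n_sims <- 5000     # number of simulated samples

# Simulate sample proportions from a population where H0 holds.
prop_samps <- rbinom(n_sims, size = n_obs, prob = prop_hyp) / n_obs

# Analytic standard error of a proportion under H0 (an assumption here).
std_error <- sqrt(prop_hyp * (1 - prop_hyp) / n_obs)

# Right-tailed z-test for every simulated sample.
z_scores <- (prop_samps - prop_hyp) / std_error
p_values <- pnorm(z_scores, lower.tail = FALSE)

# Every rejection is a false positive, because H0 is true by construction.
false_positive_rate <- mean(p_values <= alpha)
false_positive_rate
```

The resulting rate should land close to alpha, with small deviations from simulation noise and the discreteness of the counts.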
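If $p \le \alpha$, we reject $H_{0}$: the possible error is a false positive (Type I error).
If $p \gt \alpha$, we fail to reject $H_{0}$: the possible error is a false negative (Type II error).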
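The decision rule and the error each decision risks can be written as a small helper. The function name decide_test and its return structure are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: map a p-value to a decision and the error it risks.
decide_test <- function(p_value, alpha = 0.05) {
  if (p_value <= alpha) {
    list(
      decision = "reject H0",
      possible_error = "false positive (Type I)"
    )
  } else {
    list(
      decision = "fail to reject H0",
      possible_error = "false negative (Type II)"
    )
  }
}

# Using the p-value from the example above.
result <- decide_test(3.818e-05, alpha = 0.05)
result$decision        # "reject H0"
result$possible_error  # "false positive (Type I)"
```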