Statistical significance

Hypothesis Testing in R

Richie Cotton

Data Evangelist at DataCamp

p-value recap

p-values quantify the strength of evidence against the null hypothesis.
Large p-value → fail to reject null hypothesis.
Small p-value → reject null hypothesis.
Where is the cutoff point?

Significance level

The significance level of a hypothesis test ($\alpha$) is the threshold point for "beyond a reasonable doubt".

Common values of $\alpha$ are 0.1, 0.05, and 0.01.
If $p \le \alpha$, reject $H_{0}$, else fail to reject $H_{0}$.
$\alpha$ should be set prior to conducting the hypothesis test.
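The decision rule above can be sketched as a small helper. The function name `decide` and the example p-values are hypothetical, chosen only to illustrate the comparison against $\alpha$.

```r
# Hypothetical helper illustrating the decision rule; not part of the course code.
decide <- function(p_value, alpha = 0.05) {
  if (p_value <= alpha) "reject H0" else "fail to reject H0"
}

alpha <- 0.05            # chosen before looking at the data

decide(0.03, alpha)      # 0.03 <= 0.05
decide(0.20, alpha)      # 0.20 >  0.05
```

Because the rule uses "less than or equal to", a p-value exactly equal to $\alpha$ also leads to rejection.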

Calculating the p-value

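library(dplyr)  # provides %>%, summarize(), and pull()

alpha <- 0.05

# Sample proportion of respondents who first coded as children
# (assumes the stack_overflow data frame is already loaded)
prop_child_samp <- stack_overflow %>%
  summarize(
    point_estimate = mean(age_first_code_cut == "child")
  ) %>%
  pull(point_estimate)
prop_child_hyp <- 0.35
std_error <- 0.0096028
z_score <- (prop_child_samp - prop_child_hyp) / std_error

# Right-tailed test: area under the standard normal curve above z_score
p_value <- pnorm(z_score, lower.tail = FALSE)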

3.818e-05

p_value <= alpha

TRUE

p_value is less than or equal to alpha, so reject $H_{0}$ in favor of $H_{A}$.

The proportion of data scientists starting programming as children is greater than 35%.
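The p-value for this right-tailed test is the area under the standard normal curve above the z-score. The sketch below checks that `lower.tail = FALSE` gives the same area as one minus the lower-tail probability. The z-score used here is a hypothetical value, not the one computed from the survey data.

```r
# Hypothetical z-score, for illustration only
z_score <- 3.9

# Upper-tail area requested directly
p_right <- pnorm(z_score, lower.tail = FALSE)

# Same area as the complement of the lower tail
p_right_alt <- 1 - pnorm(z_score)

# The two agree up to floating-point tolerance
all.equal(p_right, p_right_alt)
```

For very large z-scores, `lower.tail = FALSE` is the safer choice because the complement form loses precision when `pnorm(z_score)` is close to 1.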

Confidence intervals

For a significance level of 0.05, it's common to choose a confidence level of 1 - 0.05 = 0.95, giving a 95% confidence interval.

conf_int <- first_code_boot_distn %>%
  summarize(
    lower = quantile(first_code_child_rate, 0.025),
    upper = quantile(first_code_child_rate, 0.975)
  )

# A tibble: 1 x 2
  lower upper
  <dbl> <dbl>
1 0.369 0.407
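The quantile cutoffs 0.025 and 0.975 come from splitting $\alpha$ evenly between the two tails. Below is a minimal base R sketch of a bootstrap percentile interval built that way. The data are simulated stand-ins (the sample size, true rate, and the names `started_as_child` and `boot_props` are assumptions), so the resulting interval will not match the one shown above.

```r
set.seed(42)  # for reproducibility

# Simulated stand-in for the survey column: TRUE means "first coded as a child".
# The sample size and true rate are made-up values, not the course data.
started_as_child <- rbinom(2000, size = 1, prob = 0.39) == 1

# Bootstrap: resample with replacement and recompute the proportion each time.
boot_props <- replicate(
  1000,
  mean(sample(started_as_child, replace = TRUE))
)

# Split alpha evenly between the lower and upper tails.
alpha <- 0.05
conf_int <- quantile(boot_props, c(alpha / 2, 1 - alpha / 2))
conf_int
```

Changing `alpha` to 0.01 would give cutoffs of 0.005 and 0.995, and therefore a wider, 99% interval.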

Types of errors

	Truly didn't commit crime	Truly committed crime
Verdict not guilty	correct	they got away with it
Verdict guilty	wrongful conviction	correct

	actual $H_{0}$	actual $H_{A}$
chosen $H_{0}$	correct	false negative
chosen $H_{A}$	false positive	correct

False positives are Type I errors; false negatives are Type II errors.
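A simulation can make the Type I error rate concrete: when $H_{0}$ is true, a test run at significance level $\alpha$ should reject in roughly a fraction $\alpha$ of repeated samples. The sketch below assumes a hypothetical sample size and uses the analytic standard error under $H_{0}$ rather than a bootstrap estimate, to keep the example short.

```r
set.seed(123)

alpha <- 0.05
p_hyp <- 0.35      # proportion claimed by H0
n <- 1000          # hypothetical sample size
n_sims <- 5000     # number of simulated studies

# Simulate sample proportions in a world where H0 is exactly true.
p_hat <- rbinom(n_sims, size = n, prob = p_hyp) / n

# Right-tailed z-test using the standard error implied by H0.
std_error <- sqrt(p_hyp * (1 - p_hyp) / n)
p_values <- pnorm((p_hat - p_hyp) / std_error, lower.tail = FALSE)

# Every rejection here is a false positive, because H0 is true.
type_1_rate <- mean(p_values <= alpha)
type_1_rate
```

The simulated rate should land close to `alpha`, though not exactly on it, because of simulation noise and the discreteness of the binomial counts.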

Possible errors in our example

If $p \le \alpha$, we reject $H_{0}$:

A false positive (Type I) error could have occurred: we thought that data scientists started coding as children at a higher rate when in reality they did not.

If $p > \alpha$, we fail to reject $H_{0}$:

A false negative (Type II) error could have occurred: we thought that data scientists coded as children at the same rate as software engineers when in reality they coded as children at a higher rate.
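How often a false negative occurs depends on how far the true proportion lies from the hypothesized 35%. The sketch below estimates the Type II error rate for a few true proportions. The sample size, the candidate true rates, and the helper name `type_2_rate` are hypothetical, and the test uses the analytic standard error under $H_{0}$ as a simplifying assumption.

```r
set.seed(2024)

alpha <- 0.05
p_hyp <- 0.35      # proportion claimed by H0
n <- 1000          # hypothetical sample size
n_sims <- 5000     # simulated studies per true rate

# Hypothetical helper: share of simulated studies that fail to reject H0
# even though the true proportion is above p_hyp.
type_2_rate <- function(p_true) {
  p_hat <- rbinom(n_sims, size = n, prob = p_true) / n
  std_error <- sqrt(p_hyp * (1 - p_hyp) / n)
  p_values <- pnorm((p_hat - p_hyp) / std_error, lower.tail = FALSE)
  mean(p_values > alpha)
}

# Made-up true proportions, each larger than the hypothesized 0.35
true_rates <- c(0.36, 0.38, 0.40)
rates <- sapply(true_rates, type_2_rate)
rates
```

The estimated rates shrink as the true proportion moves further above 0.35, which is why small real effects are the ones most easily missed.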

Let's practice!
