Inference for Categorical Data in R
Andrew Bray
Assistant Professor of Statistics at Reed College
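library(infer)

# Build a null distribution of chi-squared statistics by permutation.
# gss_party is assumed to be loaded already.
null_spac <- gss_party %>%
  specify(natspac ~ party) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  calculate(stat = "Chisq")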
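library(ggplot2)

# Permutation null (black density) against the chi-squared density with df = 4 (blue).
# chi_obs_spac is the observed statistic, assumed to be computed already.
ggplot(null_spac, aes(x = stat)) +
  geom_density() +
  stat_function(
    fun = dchisq,
    args = list(df = 4),
    color = "blue"
  ) +
  geom_vline(xintercept = chi_obs_spac, color = "red")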
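What the permutation step is doing can be sketched in base R without infer. This is a minimal illustration, not the course's code: the data frame `toy`, the helper `chisq_stat()`, and the names `chi_obs_toy`, `null_stats`, and `p_perm` are all hypothetical, and the data are simulated rather than taken from the GSS.

```r
# Hypothetical toy data (simulated, NOT the GSS): two categorical variables
# drawn independently of each other.
set.seed(1)
toy <- data.frame(
  opinion = sample(c("TOO LITTLE", "ABOUT RIGHT", "TOO MUCH"), 150, replace = TRUE),
  party   = sample(c("D", "I", "R"), 150, replace = TRUE)
)

# Chi-squared statistic for a two-way table, computed from its definition.
chisq_stat <- function(x, y) {
  observed <- table(x, y)
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2 / expected)
}

# Observed statistic on the unshuffled data.
chi_obs_toy <- chisq_stat(toy$opinion, toy$party)

# Shuffling one variable breaks any association between the two,
# which is what the null hypothesis of independence describes.
null_stats <- replicate(1000, chisq_stat(toy$opinion, sample(toy$party)))

# Permutation p-value: share of shuffled statistics at least as large as the observed one.
p_perm <- mean(null_stats >= chi_obs_toy)
p_perm
```

A density plot of `null_stats` plays the same role as the black curve in the ggplot call, with `dchisq(x, df = 4)` as the curve it is compared against.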
gss_party %>%
  select(natarms, party) %>%
  table()
             party
natarms        D  I  R
  TOO LITTLE  17 20 24
  ABOUT RIGHT 14 28  8
  TOO MUCH    12 24  2
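The counts in this table are enough to work out the expected counts under independence and the degrees of freedom by hand. The sketch below assumes nothing beyond the nine counts shown; the object names `natarms_party`, `expected`, `chi_obs_arms`, and `df_arms` are made up for illustration and do not appear in the course code.

```r
# Counts copied from the natarms-by-party table.
natarms_party <- matrix(
  c(17, 20, 24,
    14, 28,  8,
    12, 24,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    natarms = c("TOO LITTLE", "ABOUT RIGHT", "TOO MUCH"),
    party   = c("D", "I", "R")
  )
)

# Expected count in each cell if natarms and party were independent:
# (row total * column total) / grand total.
expected <- outer(rowSums(natarms_party), colSums(natarms_party)) / sum(natarms_party)

# Chi-squared statistic: squared gaps between observed and expected counts,
# each scaled by the expected count, summed over all cells.
chi_obs_arms <- sum((natarms_party - expected)^2 / expected)

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
df_arms <- (nrow(natarms_party) - 1) * (ncol(natarms_party) - 1)

round(expected, 1)
df_arms
```

A 3-by-3 table gives (3 - 1) * (3 - 1) = 4 degrees of freedom, the same `df = 4` passed to `dchisq()` and `pchisq()`.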
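# pchisq() returns the area to the LEFT of the observed statistic.
pchisq(chi_obs_spac, df = 4)
X-squared
0.1430612
# The p-value is the area to the RIGHT, so subtract from 1.
1 - pchisq(chi_obs_spac, df = 4)
X-squared
0.8569388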
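The same upper-tail area can also be requested directly with the `lower.tail` argument of `pchisq()`. Because `chi_obs_spac` is not available here, the sketch below backs out a statistic from the printed lower-tail probability using `qchisq()`. That reconstruction, and the names `p_lower`, `chi_recovered`, `p_upper_a`, and `p_upper_b`, are assumptions made for illustration only.

```r
# Lower-tail probability as printed in the pchisq() output.
p_lower <- 0.1430612

# Statistic that gives this lower-tail probability under chi-squared with df = 4.
# A reconstruction for illustration, not the course's chi_obs_spac object.
chi_recovered <- qchisq(p_lower, df = 4)

# Upper-tail area (the p-value) computed two equivalent ways.
p_upper_a <- 1 - pchisq(chi_recovered, df = 4)
p_upper_b <- pchisq(chi_recovered, df = 4, lower.tail = FALSE)

c(p_upper_a, p_upper_b)
```

Both forms give the same result here. `lower.tail = FALSE` avoids the subtraction, which can matter for very small p-values where `1 - pchisq(...)` loses precision.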
Becomes a good approximation when: