Hypothesis Testing in R
Richie Cotton
Data Evangelist at DataCamp
A non-parametric test is a hypothesis test that does not assume a probability distribution for the test statistic.
There are two types of non-parametric tests: simulation-based and rank-based.
$H_{0}$: $\mu_{child} - \mu_{adult} = 0$
$H_{A}$: $\mu_{child} - \mu_{adult} > 0$
library(infer)
stack_overflow %>%
  t_test(
    converted_comp ~ age_first_code_cut,
    order = c("child", "adult"),
    alternative = "greater"
  )
# A tibble: 1 x 6
  statistic  t_df p_value alternative lower_ci upper_ci
      <dbl> <dbl>   <dbl> <chr>          <dbl>    <dbl>
1      2.40 2083. 0.00814 greater        8438.      Inf
null_distn <- stack_overflow %>%
  specify(converted_comp ~ age_first_code_cut) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 5000, type = "permute") %>%
  calculate(
    stat = "diff in means",
    order = c("child", "adult")
  )
obs_stat <- stack_overflow %>%
  specify(converted_comp ~ age_first_code_cut) %>%
  calculate(
    stat = "diff in means",
    order = c("child", "adult")
  )
get_p_value(
  null_distn, obs_stat,
  direction = "greater"
)
# A tibble: 1 x 1
  p_value
    <dbl>
1  0.0066
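To make the simulation-based p-value concrete, here is a minimal base R sketch of the same idea: shuffle the group labels, recompute the difference in means each time, and report the share of shuffled statistics at least as large as the observed one. The vectors `comp` and `group` and the helper `diff_in_means()` are hypothetical toy stand-ins, not the stack_overflow data, and this is not a description of how infer is implemented internally.

```r
set.seed(1)  # make the shuffles reproducible

# Hypothetical toy data (assumption: not taken from stack_overflow)
comp  <- c(12, 30, 45, 51, 8, 15, 22, 27, 33)
group <- rep(c("child", "adult"), times = c(4, 5))

# Hypothetical helper: mean of "child" minus mean of "adult"
diff_in_means <- function(values, labels) {
  mean(values[labels == "child"]) - mean(values[labels == "adult"])
}

# Observed statistic on the real labels
obs <- diff_in_means(comp, group)

# Null distribution: permute the labels, which breaks any link
# between group membership and the response
perm <- replicate(5000, diff_in_means(comp, sample(group)))

# One-sided ("greater") p-value: proportion of permuted statistics
# that are at least as large as the observed one
p_value <- mean(perm >= obs)
```

With only nine observations the estimate is coarse; the point is the mechanics, not the number.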
x <- c(1, 15, 3, 10, 6)
rank(x)
[1] 1 5 2 4 3
A Wilcoxon-Mann-Whitney test (also known as the Wilcoxon rank sum test) is (roughly) a t-test on the ranks of the numeric input.
wilcox.test(
  converted_comp ~ age_first_code_cut,
  data = stack_overflow,
  alternative = "greater",
  correct = FALSE
)
Wilcoxon rank sum test
data: converted_comp by age_first_code_cut
W = 967298, p-value <2e-16
alternative hypothesis: true location shift is greater than 0
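The link between ranks and the W statistic can be checked by hand. The sketch below uses two small hypothetical vectors (`child` and `adult`, not the stack_overflow columns): it ranks the pooled values, sums the ranks of the first group, subtracts the smallest possible rank sum, and compares the result with what `wilcox.test()` reports.

```r
# Hypothetical toy data (assumption: not taken from stack_overflow)
child <- c(12, 30, 45, 51)
adult <- c(8, 15, 22, 27, 33)

# Rank all observations together, ignoring group membership
ranks <- rank(c(child, adult))

# Rank sum of the first group minus its minimum possible value,
# n1 * (n1 + 1) / 2, gives the W statistic
n1 <- length(child)
w_manual <- sum(ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2

# W as reported by the built-in test
w_builtin <- unname(wilcox.test(child, adult)$statistic)
```

Both values agree here, which shows that the test only sees the ordering of the values and not their magnitudes.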
The Kruskal-Wallis test is to the Wilcoxon-Mann-Whitney test as ANOVA is to the t-test.
kruskal.test(
  converted_comp ~ job_sat,
  data = stack_overflow
)
Kruskal-Wallis rank sum test
data: converted_comp by job_sat
Kruskal-Wallis chi-squared = 81, df = 4, p-value <2e-16
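One way to see the analogy with ANOVA and the t-test is to run both rank tests on just two groups. In the sketch below, using hypothetical toy vectors rather than the stack_overflow data, the Kruskal-Wallis p-value matches the two-sided Wilcoxon-Mann-Whitney p-value, provided the latter uses the normal approximation without continuity correction (`exact = FALSE, correct = FALSE`).

```r
# Hypothetical toy data (assumption: not taken from stack_overflow)
child <- c(12, 30, 45, 51)
adult <- c(8, 15, 22, 27, 33)

# Kruskal-Wallis on two groups, passed as a list of samples
kw <- kruskal.test(list(child, adult))

# Two-sided Wilcoxon-Mann-Whitney with the normal approximation
# and no continuity correction, to match Kruskal-Wallis
wmw <- wilcox.test(child, adult, exact = FALSE, correct = FALSE)

# The two p-values agree up to floating-point error
same_p <- isTRUE(all.equal(kw$p.value, wmw$p.value))
```

With more than two groups, only the Kruskal-Wallis test applies, just as ANOVA extends the t-test beyond two groups.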