Non-parametric ANOVA and unpaired t-tests

Hypothesis Testing in R

Richie Cotton

Data Evangelist at DataCamp

Non-parametric tests

A non-parametric test is a hypothesis test that doesn't assume a probability distribution for the test statistic.

There are two types of non-parametric hypothesis test:

  1. Simulation-based.
  2. Rank-based.

t_test()

$H_{0}$: $\mu_{child} - \mu_{adult} = 0$     $H_{A}$: $\mu_{child} - \mu_{adult} > 0$

library(infer)
stack_overflow %>% 
  t_test(
    converted_comp ~ age_first_code_cut,
    order = c("child", "adult"),
    alternative = "greater"
  )
# A tibble: 1 x 6
  statistic  t_df p_value alternative lower_ci upper_ci
      <dbl> <dbl>   <dbl> <chr>          <dbl>    <dbl>
1      2.40 2083. 0.00814 greater        8438.      Inf
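
For readers without infer installed, a roughly equivalent call can be sketched with base R's t.test(), which runs a Welch (unequal variance) test by default. The data frame `sim` and its columns `comp` and `group` are simulated stand-ins invented for this sketch, not the course's stack_overflow dataset, so the numbers will not match the output above.

```r
set.seed(42)

# Simulated stand-in data (assumption: NOT the real stack_overflow dataset)
sim <- data.frame(
  comp  = c(rnorm(50, mean = 110, sd = 20), rnorm(60, mean = 100, sd = 30)),
  group = factor(
    rep(c("child", "adult"), times = c(50, 60)),
    levels = c("child", "adult")  # first level is subtracted from: child - adult
  )
)

# Right-tailed Welch t-test of H_A: mu_child - mu_adult > 0
res <- t.test(comp ~ group, data = sim, alternative = "greater")

res$statistic  # t statistic
res$parameter  # Welch-adjusted degrees of freedom
res$p.value    # one-sided p-value
```

The factor level order plays the role of the `order` argument in the infer call: the first level is the group whose mean comes first in the difference.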

Calculating the null distribution

Simulation-based pipeline
null_distn <- stack_overflow %>% 
  specify(converted_comp ~ age_first_code_cut) %>% 
  hypothesize(null = "independence") %>% 
  generate(reps = 5000, type = "permute") %>% 
  calculate(
    stat = "diff in means", 
    order = c("child", "adult")
  )
t-test, for comparison
library(infer)
stack_overflow %>% 
  t_test(
    converted_comp ~ age_first_code_cut,
    order = c("child", "adult"),
    alternative = "greater"
  )
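
To make the permutation step in the simulation-based pipeline above more concrete, here is a minimal base R sketch of the same idea: shuffle the group labels many times and recompute the difference in means each time. The vectors `comp` and `group` and the helper `diff_in_means()` are hypothetical names on simulated data; this is an illustration of the technique, not infer's internal implementation.

```r
set.seed(123)

# Simulated stand-in data (assumption: NOT the real stack_overflow dataset)
comp  <- c(rnorm(40, mean = 105, sd = 25), rnorm(60, mean = 100, sd = 25))
group <- rep(c("child", "adult"), times = c(40, 60))

# Difference in group means, in the order child - adult
diff_in_means <- function(values, labels) {
  mean(values[labels == "child"]) - mean(values[labels == "adult"])
}

# Shuffling the labels breaks any link between group and response,
# which is what the "independence" null hypothesis describes
null_stats <- replicate(5000, diff_in_means(comp, sample(group)))

length(null_stats)   # one simulated statistic per replicate
summary(null_stats)  # centered near zero under the null
```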

Calculating the observed statistic

Simulation-based pipeline
obs_stat <- stack_overflow %>% 
  specify(converted_comp ~ age_first_code_cut) %>% 
  calculate(
    stat = "diff in means", 
    order = c("child", "adult")
  )
t-test, for comparison
library(infer)
stack_overflow %>% 
  t_test(
    converted_comp ~ age_first_code_cut,
    order = c("child", "adult"),
    alternative = "greater"
  )

Get the p-value

Simulation-based pipeline
get_p_value(
  null_distn, obs_stat, 
  direction = "greater"
)
# A tibble: 1 x 1
  p_value
    <dbl>
1  0.0066
t-test, for comparison
library(infer)
stack_overflow %>% 
  t_test(
    converted_comp ~ age_first_code_cut,
    order = c("child", "adult"),
    alternative = "greater"
  )
# A tibble: 1 x 6
  statistic  t_df p_value alternative lower_ci upper_ci
      <dbl> <dbl>   <dbl> <chr>          <dbl>    <dbl>
1      2.40 2083. 0.00814 greater        8438.      Inf
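
The simulation-based p-value above can be read as a simple proportion: the share of simulated null statistics that are at least as extreme as the observed one. The sketch below uses made-up toy numbers (`null_stats` and `obs_stat` here are illustrative, not the course results) and assumes a right-tailed test, matching direction = "greater".

```r
# Toy null distribution and observed statistic (made-up numbers for illustration)
null_stats <- c(-3, -2, -1, 0, 0.5, 1, 1.5, 2, 2.5, 3)
obs_stat   <- 2

# Right-tailed p-value: proportion of null statistics >= the observed statistic
p_value <- mean(null_stats >= obs_stat)
p_value
```

With 5000 replicates instead of 10, the same calculation gives p-values in steps of 1/5000, which is why the simulated value need not exactly match the t-test's p-value.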

Ranks of vectors

x <- c(1, 15, 3, 10, 6)
rank(x)
[1] 1 5 2 4 3

The Wilcoxon–Mann–Whitney test (a.k.a. the Wilcoxon rank-sum test) is, roughly, a t-test on the ranks of the numeric input.
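
To show how ranks feed into the test, the sketch below computes the W statistic by hand for two small made-up samples and compares it with wilcox.test(). The vectors `x` and `y` are invented for illustration and contain no tied values. It relies on the standard definition of W as the rank sum of the first sample minus its minimum possible value.

```r
# Two small made-up samples with no tied values
x <- c(1.1, 2.3, 5.0)
y <- c(0.5, 1.9, 3.2, 4.1)

# Rank all observations together, then keep the ranks belonging to x
ranks_x <- rank(c(x, y))[seq_along(x)]

# W = rank sum of x minus the smallest rank sum x could possibly have
n_x      <- length(x)
w_manual <- sum(ranks_x) - n_x * (n_x + 1) / 2

# The built-in test reports the same statistic
w_builtin <- unname(wilcox.test(x, y)$statistic)

c(manual = w_manual, builtin = w_builtin)
```

Equivalently, W counts the number of (x, y) pairs in which the x value is larger, which is why only the ordering of the data matters and not the actual values.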

Wilcoxon–Mann–Whitney test

wilcox.test(
  converted_comp ~ age_first_code_cut,
  data = stack_overflow,
  alternative = "greater",
  correct = FALSE
) 
    Wilcoxon rank sum test

data:  converted_comp by age_first_code_cut
W = 967298, p-value <2e-16
alternative hypothesis: true location shift is greater than 0
1 Also known as the "Wilcoxon rank-sum test" and the "Mann-Whitney U test".

Kruskal–Wallis test

The Kruskal–Wallis test is to the Wilcoxon–Mann–Whitney test as ANOVA is to the t-test: it extends the rank-based comparison from two groups to three or more.

kruskal.test(
  converted_comp ~ job_sat,
  data = stack_overflow
)
    Kruskal-Wallis rank sum test

data:  converted_comp by job_sat
Kruskal-Wallis chi-square = 81, df = 4, p-value <2e-16
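
As a rough illustration of what the Kruskal–Wallis statistic measures, the sketch below computes it by hand from pooled ranks and checks the result against kruskal.test(). The data (`values`, `groups`) are made up and deliberately free of ties, since the textbook formula used here omits the tie correction that kruskal.test() applies when ties are present.

```r
# Made-up measurements for three groups (no tied values)
values <- c(2.9, 3.0, 2.5, 2.6, 3.2,
            3.8, 2.7, 4.0, 2.4,
            2.8, 3.4, 3.7, 2.2, 2.0)
groups <- factor(rep(c("a", "b", "c"), times = c(5, 4, 5)))

# Rank all observations together, then summarize ranks per group
n_total  <- length(values)
r        <- rank(values)
rank_sum <- tapply(r, groups, sum)
n_group  <- tapply(r, groups, length)

# Textbook H statistic (valid as written only when there are no ties)
h_manual <- 12 / (n_total * (n_total + 1)) * sum(rank_sum^2 / n_group) -
  3 * (n_total + 1)

# Compare with the built-in test
h_builtin <- unname(kruskal.test(values, groups)$statistic)
all.equal(h_manual, h_builtin)
```

Under the null hypothesis, H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom, which matches the df = 4 reported above for the five job satisfaction categories.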

Let's practice!
