Using the randomization distribution

Foundations of Inference in R

Jo Hardin

Instructor

Understanding the null distribution

ch1_3_infer.003.png

Understanding the null distribution

ch1_3_infer.004.png

Understanding the null distribution

ch1_3_infer.005.png

Understanding the null distribution

ch1_3_infer.006.png

Understanding the null distribution

ch1_3_infer.007.png

Are the data consistent with the null hypothesis?

table(soda)

        location
drink    East West
  cola     28   19
  orange    6    7

library(dplyr)

# Proportion of cola drinkers within each location
soda %>%
  group_by(location) %>%
  summarize(mean(drink == "cola"))

# A tibble: 2 × 2
  location `mean(drink == "cola")`
    <fctr>                   <dbl>
1     East               0.8235294
2     West               0.7307692
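
The slides do not show how `soda` is built. A minimal sketch, assuming a data frame with one row per respondent and the counts from the table above (the row order and the use of factors are assumptions, not taken from the course data), reproduces the two proportions in base R:

```r
# Hypothetical reconstruction of `soda` from the printed counts;
# the real course dataset may differ in row order and column types.
soda <- data.frame(
  drink = factor(c(rep("cola", 28), rep("orange", 6),    # East
                   rep("cola", 19), rep("orange", 7))),  # West
  location = factor(c(rep("East", 34), rep("West", 26)))
)

# Proportion of cola drinkers within each location
prop_cola <- tapply(soda$drink == "cola", soda$location, mean)
prop_cola
```

With this reconstruction, `table(soda)` should give the same cross-tabulation of `drink` by `location` as on the slide.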

Significance

ch1_3_infer.011.png

How extreme are the observed data?

# Observed difference in cola proportions (West minus East)
diff_orig <- soda %>%
  group_by(location) %>%
  summarize(prop_cola = mean(drink == "cola")) %>%
  summarize(diff(prop_cola)) %>%
  pull()
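
As a check on the direction and size of the observed statistic, the arithmetic can be worked out from the table counts. This assumes the groups are ordered East then West, as in the printed output, so that `diff()` subtracts East from West:

```latex
\hat{p}_{\text{West}} - \hat{p}_{\text{East}}
  = \frac{19}{26} - \frac{28}{34}
  \approx -0.0928
```

The negative sign means the observed cola proportion is lower in the West than in the East.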

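library(infer)

# Permute under independence and recompute the statistic each time;
# `order` must match the factor levels exactly ("West", "East")
soda_perm <- soda %>%
  specify(drink ~ location, success = "cola") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  calculate(stat = "diff in props",
            order = c("West", "East"))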

# Share of permuted differences at or below the observed one
soda_perm %>%
  summarize(proportion = mean(diff_orig >= stat))

# A tibble: 1 x 1
  proportion
       <dbl>
1      0.380
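
With 100 permutations, a proportion of 0.380 means that 38 of the permuted differences were at or below the observed one. The same idea can be sketched in base R without `infer`. This is an illustrative sketch, not the course's implementation: the data are rebuilt from the table counts, and the helper `diff_props`, the seed, and the choice of 1000 repetitions are all assumptions made for the example.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical reconstruction of the data from the counts in the table
drink    <- c(rep("cola", 28), rep("orange", 6),   # East
              rep("cola", 19), rep("orange", 7))   # West
location <- c(rep("East", 34), rep("West", 26))

# Difference in cola proportions, West minus East
diff_props <- function(drink, location) {
  mean(drink[location == "West"] == "cola") -
    mean(drink[location == "East"] == "cola")
}

diff_obs <- diff_props(drink, location)

# Shuffle the location labels to break any link with drink choice,
# then recompute the statistic for each shuffle
perm_stats <- replicate(1000, diff_props(drink, sample(location)))

# One-sided proportion: shuffled differences at or below the observed one
p_value <- mean(perm_stats <= diff_obs)
p_value
```

Because the shuffles are random, the resulting proportion varies from run to run and with the number of repetitions, so it will not necessarily equal the 0.380 shown on the slide.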

Let's practice!
