Foundations of Inference in R
Jo Hardin
Instructor






Generating a distribution of the statistic from the null population shows whether the observed data are inconsistent with the null hypothesis
Original data
| Location | Cola | Orange |
|---|---|---|
| East | 28 | 6 |
| West | 19 | 7 |
$\hat{p}_\text{east} = 28/(28 + 6) = 0.82$
$\hat{p}_\text{west} = 19/(19 + 7) = 0.73$
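The statistic tracked across the shuffles is the difference between these two proportions. Taking West minus East (an ordering chosen here to match the `order = c("west", "east")` argument used in the code later in the deck), the observed value works out to:

$\hat{p}_\text{west} - \hat{p}_\text{east} = 19/26 - 28/34 \approx -0.093$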
First shuffle, same as original
| Location | Cola | Orange |
|---|---|---|
| East | 28 | 6 |
| West | 19 | 7 |

Second shuffle
| Location | Cola | Orange |
|---|---|---|
| East | 27 | 7 |
| West | 20 | 6 |

Third shuffle
| Location | Cola | Orange |
|---|---|---|
| East | 26 | 8 |
| West | 21 | 5 |

Fourth shuffle
| Location | Cola | Orange |
|---|---|---|
| East | 25 | 9 |
| West | 22 | 4 |

Fifth shuffle
| Location | Cola | Orange |
|---|---|---|
| East | 29 | 5 |
| West | 18 | 8 |
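Each shuffled table above comes from reassigning the drink labels at random while keeping the locations fixed, so the row totals (34 East, 26 West) and column totals (47 cola, 13 orange) never change. A minimal base R sketch of one such shuffle follows. The `soda` data frame is rebuilt here from the counts in the original table; its column names and the `"orange"` label are assumptions, not taken from the course data, and the seed is arbitrary.

```r
# Rebuild a data frame matching the counts in the original table.
# Assumption: column names and the "orange" label mirror the course's `soda` data.
soda <- data.frame(
  location = rep(c("east", "west"), times = c(34, 26)),
  drink = c(rep(c("cola", "orange"), times = c(28, 6)),
            rep(c("cola", "orange"), times = c(19, 7)))
)

set.seed(1)  # arbitrary seed, only for reproducibility

# One shuffle: permute the drink labels, leaving the locations fixed.
shuffled <- soda
shuffled$drink <- sample(soda$drink)

# Row and column totals are unchanged; only the cell counts move.
shuffled_table <- table(shuffled$location, shuffled$drink)
print(shuffled_table)

# Difference in cola proportions (west minus east) for this shuffle.
prop_cola <- tapply(shuffled$drink == "cola", shuffled$location, mean)
shuffled_diff <- unname(prop_cola["west"] - prop_cola["east"])
print(shuffled_diff)
```

Rerunning with a different seed gives a different table, which is the variation the shuffled tables illustrate.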








library(dplyr)

# Observed difference in cola proportions (west minus east)
soda %>%
  group_by(location) %>%
  summarize(prop_cola = mean(drink == "cola")) %>%
  summarize(diff(prop_cola))
# A tibble: 1 x 1
  `diff(prop_cola)`
              <dbl>
1       -0.09276018
library(infer)

# One permuted dataset and its difference in proportions
soda %>%
  specify(drink ~ location, success = "cola") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1, type = "permute") %>%
  calculate(stat = "diff in props", order = c("west", "east"))
# A tibble: 1 x 2
  replicate        stat
      <int>       <dbl>
1         1 -0.02488688
# Five permuted datasets, one statistic per replicate
soda %>%
  specify(drink ~ location, success = "cola") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 5, type = "permute") %>%
  calculate(stat = "diff in props", order = c("west", "east"))
# A tibble: 5 x 2
  replicate        stat
      <int>       <dbl>
1         1  0.04298643
2         2 -0.09276018
3         3  0.11085973
4         4  0.17873303
5         5 -0.16063348
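Five replicates only hint at the null distribution. With many more shuffles, the observed difference can be placed within it. The sketch below uses base R rather than the `infer` pipeline, so it runs without extra packages. The rebuilt `soda` data frame, the helper `diff_in_props`, the seed, and the choice of 1000 replicates are all illustrative assumptions.

```r
# Assumption: `soda` is rebuilt from the counts in the original table.
soda <- data.frame(
  location = rep(c("east", "west"), times = c(34, 26)),
  drink = c(rep(c("cola", "orange"), times = c(28, 6)),
            rep(c("cola", "orange"), times = c(19, 7)))
)

# Hypothetical helper: cola proportion in west minus cola proportion in east.
diff_in_props <- function(drink, location) {
  prop_cola <- tapply(drink == "cola", location, mean)
  unname(prop_cola["west"] - prop_cola["east"])
}

# Observed statistic from the unshuffled data.
observed <- diff_in_props(soda$drink, soda$location)

set.seed(2)  # arbitrary seed, only for reproducibility

# Null distribution: shuffle the drink labels 1000 times, recompute each time.
null_stats <- replicate(1000, diff_in_props(sample(soda$drink), soda$location))

# Share of shuffled differences at or below the observed one.
share_at_or_below <- mean(null_stats <= observed)
print(observed)
print(share_at_or_below)
```

A large share suggests the observed difference is typical of what shuffling alone produces; a very small share would suggest the data are hard to reconcile with the null hypothesis.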
