Hypothesis tests and z-scores

Hypothesis Testing in R

Richie Cotton

Data Evangelist at DataCamp

A/B testing

Electronic Arts (EA) is a video game company.
In 2013, they released SimCity 5.
Their goal was to increase pre-orders of the game.
They used A/B testing to test different advertising scenarios.
This involves splitting users into control and treatment groups.

Electronic Arts building

¹ Image credit: "Electronic Arts" by majaX1 CC BY-NC-SA 2.0

Retail webpage A/B test

Control

SimCity webpage with banner that says "pre-order and get $20 off your next purchase"

Treatment

SimCity webpage without banner

A/B test results

The treatment group (no ad) got 43.4% more purchases than the control group (with ad).
The intuition that "showing an ad would increase sales" was completely wrong.
Was this result statistically significant, or could it have occurred by chance?
You need EA's data to determine this.
You'd use techniques from Sampling in R and this course to do so.

Stack Overflow Developer Survey 2020

library(dplyr)
glimpse(stack_overflow)

Rows: 2,261
Columns: 8
$ respondent         <dbl> 36, 47, 69, 125, 147, 152, 166, 170, 187, 196, 221,…
$ age_first_code_cut <chr> "adult", "child", "child", "adult", "adult", "adult…
$ converted_comp     <dbl> 77556, 74970, 594539, 2000000, 37816, 121980, 48644…
$ job_sat            <fct> Slightly satisfied, Very satisfied, Very satisfied,…
$ purple_link        <chr> "Hello, old friend", "Hello, old friend", "Hello, o…
$ age_cat            <chr> "At least 30", "At least 30", "Under 30", "At least…
$ age                <dbl> 34, 53, 25, 41, 28, 30, 28, 26, 43, 23, 24, 35, 37,…
$ hobbyist           <chr> "Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes", "Ye…

Hypothesizing about the mean

A hypothesis:

The mean annual compensation of the population of data scientists is $110,000.

The point estimate (sample statistic):

# Base R
mean_comp_samp <- mean(stack_overflow$converted_comp)

# Equivalent dplyr pipeline
mean_comp_samp <- stack_overflow %>% 
  summarize(mean_compensation = mean(converted_comp)) %>% 
  pull(mean_compensation)

119574.7

Generating a bootstrap distribution

# Step 3. Repeat steps 1 & 2 many times
so_boot_distn <- replicate(
  n = 5000,
  expr = {

    # Step 1. Resample
    stack_overflow %>%
      slice_sample(prop = 1, replace = TRUE) %>%

      # Step 2. Calculate point estimate
      summarize(mean_compensation = mean(converted_comp)) %>% 
      pull(mean_compensation)

  }
)

¹ Bootstrap distributions are taught in Chapter 4 of Sampling in R
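
The stack_overflow dataset is not bundled with R, so the resampling loop above can be tried on simulated data instead. The following is a minimal base-R sketch, assuming a made-up, right-skewed `comp` vector stands in for `converted_comp`; the numbers it produces are illustrative only and say nothing about the survey.

```r
# Hypothetical stand-in for stack_overflow$converted_comp (simulated, not real data)
set.seed(42)
comp <- rlnorm(2261, meanlog = 11, sdlog = 1)

# Repeat "resample with replacement, then take the mean" many times
boot_distn <- replicate(
  n = 5000,
  expr = mean(sample(comp, size = length(comp), replace = TRUE))
)

# The bootstrap distribution should be centered near the sample mean
mean(comp)
mean(boot_distn)
```

Each resample has the same size as the original sample, which mirrors `slice_sample(prop = 1, replace = TRUE)` in the dplyr version.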

Visualizing the bootstrap distribution

library(ggplot2)
tibble(resample_mean = so_boot_distn) %>%
  ggplot(aes(resample_mean)) +
  geom_histogram(binwidth = 1000)

Histogram of the bootstrap distribution: it is bell-shaped and ranges roughly from 110000 to 140000

Standard error

std_error <- sd(so_boot_distn)

5511.674
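
For a sample mean, the bootstrap standard error can be sanity-checked against the usual formula: the sample standard deviation divided by the square root of the sample size. A minimal sketch on simulated data follows (the `comp` vector is hypothetical, not the survey data); the two values should agree closely but not exactly.

```r
# Hypothetical compensation values (simulated, not the survey data)
set.seed(123)
comp <- rlnorm(2261, meanlog = 11, sdlog = 1)

# Bootstrap distribution of the sample mean
boot_means <- replicate(5000, mean(sample(comp, replace = TRUE)))

# Standard error two ways
se_boot <- sd(boot_means)                    # sd of the bootstrap distribution
se_formula <- sd(comp) / sqrt(length(comp))  # analytic approximation

c(bootstrap = se_boot, formula = se_formula)
```

The bootstrap approach has the advantage of also working for statistics that have no simple standard error formula.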

z-scores

$\text{standardized value} = \dfrac{\text{value} - \text{mean}}{\text{standard deviation}}$

$z = \dfrac{\text{sample stat} - \text{hypoth. param. value}}{\text{standard error}}$

$z = \dfrac{\$119,574.7 - \$110,000}{\$5511.67} = 1.737$

mean_comp_samp

119574.7

mean_comp_hyp <- 110000

std_error

5511.674

z_score <- (mean_comp_samp - mean_comp_hyp) / std_error

1.737171
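
The calculation above can be wrapped in a small helper so it can be reused for other statistics. `calc_z_score()` is a hypothetical name, not a function from any package, and the inputs are the rounded values printed on the slides, so the result agrees with the value above only to about three decimal places.

```r
# Hypothetical helper: standardize a sample statistic against a hypothesized value
calc_z_score <- function(sample_stat, hypoth_value, std_error) {
  (sample_stat - hypoth_value) / std_error
}

# Rounded values taken from the slides
z <- calc_z_score(
  sample_stat = 119574.7,
  hypoth_value = 110000,
  std_error = 5511.674
)
round(z, 3)
```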

Testing the hypothesis

Is 1.737171 a high or low number?
Answering this kind of question is the goal of the course!

Hypothesis testing use case:

Determine whether sample statistics are close to or far away from expected (or "hypothesized") values.

Standard normal (z) distribution

Standard normal distribution: a normal distribution with mean 0 and standard deviation 1.

tibble(x = seq(-4, 4, 0.01)) %>% 
  ggplot(aes(x)) +
  stat_function(fun = dnorm) +
  ylab("PDF(x)")

Density plot of the PDF for the standard normal distribution
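
One way to start judging whether a z-score of 1.737 is high is to ask how much of the standard normal distribution lies above it. The following is a minimal sketch using base R's `pnorm()`, the standard normal cumulative distribution function, with a simulation as a cross-check. Treat it only as orientation; turning such a tail area into a formal decision is what the rest of the course addresses.

```r
z_score <- 1.737171

# Proportion of the standard normal distribution above z_score
upper_tail <- pnorm(z_score, lower.tail = FALSE)
upper_tail

# Cross-check by simulation: share of standard normal draws above z_score
set.seed(1)
sim_tail <- mean(rnorm(1e6) > z_score)
sim_tail
```

Both values come out at roughly 0.04, so only about 4% of a standard normal distribution lies above this z-score.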

Let's practice!
