Bootstrap hypothesis tests

Statistical Thinking in Python (Part 2)

Justin Bois

Lecturer at the California Institute of Technology

Pipeline for hypothesis testing

  • Clearly state the null hypothesis
  • Define your test statistic
  • Generate many sets of simulated data assuming the null hypothesis is true
  • Compute the test statistic for each simulated data set
  • The p-value is the fraction of your simulated data sets for which the test statistic is at least as extreme as for the real data

Michelson and Newcomb: speed of light pioneers

ch3-3.008.png

Image credits: Michelson, public domain, Smithsonian; Newcomb, US Library of Congress

The data we have

ch3-3.011.png

1 Data: Michelson, 1880

Null hypothesis

  • The true mean speed of light in Michelson’s experiments was actually Newcomb's reported value

Shifting the Michelson data

import numpy as np

newcomb_value = 299860  # km/s
# Shift the data so that its mean equals Newcomb's value (the null hypothesis)
michelson_shifted = michelson_speed_of_light \
           - np.mean(michelson_speed_of_light) + newcomb_value

ch3-3.019.png
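
To see what the shift does, the following sketch checks that it moves the mean to Newcomb's value while leaving the spread of the data unchanged. The `speeds` array holds made-up stand-in values, not Michelson's measurements, and the names `mean_matches` and `spread_unchanged` are introduced only for this illustration.

```python
import numpy as np

newcomb_value = 299860  # km/s, as on the slide

# Made-up stand-in values (km/s), NOT Michelson's measurements
speeds = np.array([299850.0, 299740.0, 299900.0, 300070.0, 299930.0])

# Same shift as on the slide: subtract the sample mean, add Newcomb's value
shifted = speeds - np.mean(speeds) + newcomb_value

# The mean now equals Newcomb's value ...
mean_matches = np.isclose(np.mean(shifted), newcomb_value)
# ... while the standard deviation is the same as before the shift
spread_unchanged = np.isclose(np.std(shifted), np.std(speeds))

print(mean_matches, spread_unchanged)
```

Because only the location changes, resampling from the shifted data simulates measurements with the observed variability but with a true mean equal to Newcomb's value.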


Calculating the test statistic

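def diff_from_newcomb(data, newcomb_value=299860):
    """Mean of data minus Newcomb's value (km/s)."""
    return np.mean(data) - newcomb_value

# Observed test statistic, computed from the original (unshifted) data
diff_obs = diff_from_newcomb(michelson_speed_of_light)

diff_obs
-7.5999999999767169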

Computing the p-value

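# Draw 10,000 bootstrap replicates of the test statistic from the shifted data
bs_replicates = draw_bs_reps(michelson_shifted,
                             diff_from_newcomb, 10000)

# Fraction of replicates at least as extreme (as low) as the observed difference
p_value = np.sum(bs_replicates <= diff_obs) / 10000
p_value
0.16039999999999999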
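
The function `draw_bs_reps` is used above but not defined on these slides. One plausible definition, assuming it takes a one-dimensional array, a function that reduces an array to a single number, and the number of replicates to draw, might look like the sketch below. The helper name `bootstrap_replicate_1d` is likewise an assumption.

```python
import numpy as np

def bootstrap_replicate_1d(data, func):
    """Resample data with replacement and apply func to the resample."""
    bs_sample = np.random.choice(data, size=len(data))
    return func(bs_sample)

def draw_bs_reps(data, func, size=1):
    """Draw `size` bootstrap replicates of the statistic computed by func."""
    bs_replicates = np.empty(size)
    for i in range(size):
        bs_replicates[i] = bootstrap_replicate_1d(data, func)
    return bs_replicates

# Usage with made-up numbers (not Michelson's data)
np.random.seed(0)  # seeded only so the sketch is reproducible
example = np.array([1.0, 2.0, 3.0, 4.0])
reps = draw_bs_reps(example, np.mean, 1000)
print(reps.shape)
```

Each replicate is the statistic computed on one resample of the same size as the data, so every replicate of the mean lies between the smallest and largest data value.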

One sample test

- Compare one set of data to a single number

Two sample test

- Compare two sets of data
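
For contrast with the one-sample test on the Michelson data, a two-sample bootstrap test of equal means could be sketched as follows. This is an illustration under stated assumptions rather than the course's method: the samples `a` and `b` are made up, and shifting both samples to the pooled mean is one common way to simulate the null hypothesis that the two means are equal.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Made-up samples for illustration only
a = np.array([5.1, 4.9, 5.6, 5.8, 5.2, 5.4])
b = np.array([4.7, 5.0, 4.6, 5.1, 4.8, 4.5])

# Test statistic: difference of the two sample means
observed = np.mean(a) - np.mean(b)

# Null hypothesis: both samples have the same mean.
# Shift each sample so that its mean equals the mean of the pooled data.
pooled_mean = np.mean(np.concatenate((a, b)))
a_shifted = a - np.mean(a) + pooled_mean
b_shifted = b - np.mean(b) + pooled_mean

n_reps = 10_000
replicates = np.empty(n_reps)
for i in range(n_reps):
    # Resample each shifted sample with replacement
    a_rs = rng.choice(a_shifted, size=len(a_shifted))
    b_rs = rng.choice(b_shifted, size=len(b_shifted))
    replicates[i] = np.mean(a_rs) - np.mean(b_rs)

# One-sided p-value: fraction of replicates at least as large as observed
p_value = np.mean(replicates >= observed)
print(p_value)
```

The structure is the same as in the one-sample case; the only change is that the comparison target is a second data set instead of a fixed number.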

Let's practice!
