Pairs bootstrap

Statistical Thinking in Python (Part 2)

Justin Bois

Lecturer at the California Institute of Technology

Nonparametric inference

  • Make no assumptions about the model or probability distribution underlying the data

2008 US swing state election results

[Figure: 2008 US swing state election results (ch2-3.004.png)]

1 Data retrieved from Data.gov (https://www.data.gov/)

Pairs bootstrap for linear regression

  • Resample data in pairs
  • Compute slope and intercept from resampled data
  • Each slope and intercept is a bootstrap replicate
  • Compute confidence intervals from percentiles of bootstrap replicates
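
The four steps above can be sketched as one function. This is a minimal sketch, not necessarily the course's own implementation: the name `draw_bs_pairs_linreg`, the use of `np.random.default_rng`, and the synthetic `x_demo`/`y_demo` arrays (stand-ins for the election data, which is not reproduced here) are all assumptions made for illustration.

```python
import numpy as np

def draw_bs_pairs_linreg(x, y, size=1, rng=None):
    """Return `size` pairs-bootstrap replicates of slope and intercept.

    Hypothetical helper: the name and signature are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    y = np.asarray(y)
    inds = np.arange(len(x))

    bs_slope_reps = np.empty(size)
    bs_intercept_reps = np.empty(size)
    for i in range(size):
        # Resample indices with replacement so (x, y) pairs stay together
        bs_inds = rng.choice(inds, size=len(inds))
        # Each fit to a resampled data set gives one replicate
        bs_slope_reps[i], bs_intercept_reps[i] = np.polyfit(
            x[bs_inds], y[bs_inds], 1
        )
    return bs_slope_reps, bs_intercept_reps

# Synthetic stand-in data (assumption): a noisy line with slope 2, intercept 1
rng = np.random.default_rng(42)
x_demo = rng.uniform(0, 10, size=200)
y_demo = 2.0 * x_demo + 1.0 + rng.normal(0, 1, size=200)

slope_reps, intercept_reps = draw_bs_pairs_linreg(x_demo, y_demo, size=500, rng=rng)

# 95% confidence interval from the percentiles of the replicates
slope_ci = np.percentile(slope_reps, [2.5, 97.5])
```

With real data, `x_demo` and `y_demo` would be replaced by the measured arrays, for example `total_votes` and `dem_share`.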

Generating a pairs bootstrap sample

np.arange(7)
array([0, 1, 2, 3, 4, 5, 6])
inds = np.arange(len(total_votes))

bs_inds = np.random.choice(inds, len(inds))
bs_total_votes = total_votes[bs_inds]
bs_dem_share = dem_share[bs_inds]
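
Indexing both arrays with the same resampled indices is what keeps each x value matched to its own y value. The small check below illustrates this on made-up arrays; `x`, `y`, and the seeded generator are assumptions for the example, not part of the election data set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data in which every pair satisfies y == 2 * x
x = np.array([10, 20, 30, 40, 50])
y = 2 * x

inds = np.arange(len(x))
bs_inds = rng.choice(inds, len(inds))  # sample indices with replacement

# The same indices are applied to both arrays
bs_x = x[bs_inds]
bs_y = y[bs_inds]

# Every resampled pair still satisfies the original relationship
pairs_intact = bool(np.all(bs_y == 2 * bs_x))
```

Had `x` and `y` been resampled independently, the pairing would be scrambled and `pairs_intact` would generally be false.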

Computing a pairs bootstrap replicate

bs_slope, bs_intercept = np.polyfit(bs_total_votes, 
                                    bs_dem_share, 1)

bs_slope, bs_intercept
(3.9053605692223672e-05, 40.387910131803025)
np.polyfit(total_votes, dem_share, 1)  # fit of original
array([  4.03707170e-05,   4.01139120e+01])
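
`np.polyfit` returns coefficients ordered from the highest power down, which is why a degree-1 fit unpacks as slope first, intercept second. A quick sanity check on made-up, noise-free data (the arrays `x_line` and `y_line` are hypothetical) might look like this:

```python
import numpy as np

# Made-up points lying exactly on the line y = 3x + 5
x_line = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_line = 3.0 * x_line + 5.0

# Degree-1 fit: the first coefficient is the slope, the second the intercept
slope, intercept = np.polyfit(x_line, y_line, 1)

# Up to floating-point error, the fit recovers the line's parameters
fit_recovers_line = bool(np.isclose(slope, 3.0) and np.isclose(intercept, 5.0))
```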

2008 US swing state election results

[Figure: 2008 US swing state election results (ch2-3.022.png)]

1 Data retrieved from Data.gov (https://www.data.gov/)

Let's practice!
