Linear regressions and pairs bootstrap

Case Studies in Statistical Thinking

Justin Bois

Lecturer, Caltech

Bacterial growth

Images courtesy of Jin Park and Michael Elowitz, Caltech

Bacterial growth

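import numpy as np
import matplotlib.pyplot as plt

# Plot area versus time with a logarithmic y-axis
_ = plt.semilogy(t, bac_area, marker='.', linestyle='none')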
_ = plt.xlabel('time (hr)')
_ = plt.ylabel('area (sq. µm)')
plt.show()


Linear regression with np.polyfit()

slope, intercept = np.polyfit(t, bac_area, 1)

t_theor = np.array([0, 14])
bac_area_theor = slope * t_theor + intercept

_ = plt.plot(t, bac_area, marker='.', linestyle='none')
_ = plt.plot(t_theor, bac_area_theor)
_ = plt.xlabel('time (hr)')
_ = plt.ylabel('area (sq. µm)')
plt.show()

Regression of bacterial growth


Semilog-linear regression with np.polyfit()

slope, intercept = np.polyfit(t, np.log(bac_area), 1)

t_theor = np.array([0, 14])
bac_area_theor = np.exp(slope * t_theor + intercept)

_ = plt.semilogy(t, bac_area, marker='.', linestyle='none')
_ = plt.semilogy(t_theor, bac_area_theor)
_ = plt.xlabel('time (hr)')
_ = plt.ylabel('area (sq. µm)')
plt.show()
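
To see why a straight-line fit to the logarithm of the area recovers an exponential growth rate, here is a minimal sketch on synthetic, noise-free data. The names `a0_true`, `k_true`, `t_demo`, and `area_demo` are hypothetical and the values are made up for illustration; they are not from the bacterial data set.

```python
import numpy as np

# Hypothetical parameters, for illustration only (not from the course data)
a0_true = 2.0   # initial area (sq. µm)
k_true = 0.25   # growth rate (1/hr)

# Synthetic exponential growth: area = a0 * exp(k * t)
t_demo = np.linspace(0, 14, 50)
area_demo = a0_true * np.exp(k_true * t_demo)

# Taking logs gives ln(area) = k * t + ln(a0), which is linear in t,
# so a degree-1 polynomial fit returns (k, ln(a0)).
slope_demo, intercept_demo = np.polyfit(t_demo, np.log(area_demo), 1)

growth_rate = slope_demo
initial_area = np.exp(intercept_demo)
print(growth_rate, initial_area)
```

Under this exponential model, the slope of the semilog fit is the growth rate and the exponential of the intercept is the area at time zero.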

Regression of bacterial growth


Pairs bootstrap

  • Resample data in pairs
  • Compute slope and intercept from resampled data
  • Each slope and intercept is a bootstrap replicate
  • Compute confidence intervals from percentiles of bootstrap replicates
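
The steps above can be sketched with NumPy alone. This is one possible implementation under stated assumptions, not necessarily how `dcst.draw_bs_pairs_linreg()` is written; the helper name `pairs_bootstrap_linreg`, its `seed` argument, and the synthetic `x_demo` and `y_demo` arrays are hypothetical.

```python
import numpy as np


def pairs_bootstrap_linreg(x, y, size=1, seed=None):
    """Hypothetical helper: pairs bootstrap replicates of slope and intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")

    rng = np.random.default_rng(seed)
    inds = np.arange(len(x))
    slope_reps = np.empty(size)
    int_reps = np.empty(size)

    for i in range(size):
        # Resample indices with replacement so each (x, y) pair stays together
        bs_inds = rng.choice(inds, size=len(inds))
        # Each fitted slope and intercept is one bootstrap replicate
        slope_reps[i], int_reps[i] = np.polyfit(x[bs_inds], y[bs_inds], 1)

    return slope_reps, int_reps


# Synthetic data for illustration only: y = 1.5 x + 0.5 plus Gaussian noise
rng = np.random.default_rng(0)
x_demo = np.linspace(0, 10, 40)
y_demo = 1.5 * x_demo + 0.5 + rng.normal(scale=0.3, size=x_demo.size)

slope_reps, int_reps = pairs_bootstrap_linreg(x_demo, y_demo, size=1000, seed=1)

# 95% confidence interval from percentiles of the bootstrap replicates
slope_conf_int = np.percentile(slope_reps, [2.5, 97.5])
print(slope_conf_int)
```

Resampling indices rather than the x and y arrays separately is what keeps each measurement paired with its own time point.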

Pairs bootstrap

import dc_stat_think as dcst

# Draw 10000 pairs bootstrap reps
slope_reps, int_reps = dcst.draw_bs_pairs_linreg(
  x_data, y_data, size=10000
)

# Compute 95% confidence interval of slope
slope_conf_int = np.percentile(slope_reps, [2.5, 97.5])

Let's practice!
