Bootstrap confidence intervals

Case Studies in Statistical Thinking

Justin Bois

Lecturer, Caltech

EDA is the first step

"Exploratory data analysis can never be the whole story, but nothing else can serve as a foundation stone, as the first step."

--John Tukey

Active bout length ECDFs

1 Data courtesy of Avni Gandhi, Grigorios Oikonomou, and David Prober, Caltech

Optimal parameter value

  • Optimal parameter value: The value of the parameter of a probability distribution that best describes the data

  • Optimal parameter for the Exponential distribution: Computed from the mean of the data
np.mean(nuclear_incident_times)
87.140350877192986

1 Data source: Wheatley, Sovacool, and Sornette, Nuclear Events Database
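The link between the mean of the data and the optimal Exponential parameter can be checked with a small sketch. It assumes synthetic data: `waiting_times` is a hypothetical stand-in drawn from an Exponential distribution, not the nuclear incident data, and the comparison against the ECDF is only one informal way to judge the fit.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical stand-in for real data: exponential waiting times
# with a true mean of 90 (arbitrary units)
waiting_times = rng.exponential(scale=90.0, size=500)

# For an Exponential distribution, the optimal (maximum likelihood)
# value of the mean parameter is the sample mean
mean_estimate = np.mean(waiting_times)

# ECDF of the data and the Exponential CDF using the estimated mean:
# F(t) = 1 - exp(-t / mean)
t = np.sort(waiting_times)
ecdf = np.arange(1, len(t) + 1) / len(t)
cdf = 1.0 - np.exp(-t / mean_estimate)

# Largest vertical gap between the ECDF and the fitted CDF;
# a small gap suggests the distribution describes the data well
max_gap = np.max(np.abs(ecdf - cdf))
print(mean_estimate, max_gap)
```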

Bootstrap sample

A resampled array of the data

# Resample nuclear_incident_times with replacement
bs_sample = np.random.choice(
  nuclear_incident_times,
  replace=True,
  size=len(nuclear_incident_times)
)
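As a self-contained illustration of the same resampling step, here is a minimal sketch on a made-up array. The `data` values are hypothetical, and the sketch uses NumPy's `default_rng` generator rather than the `np.random.choice` call shown above; either draws a sample of the same size as the data, with replacement.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical small data set, used only for illustration
data = np.array([12.0, 45.0, 3.0, 88.0, 150.0, 27.0])

# Draw as many values as the data set has, with replacement,
# so some values may appear more than once and others not at all
bs_sample = rng.choice(data, size=len(data), replace=True)

print(bs_sample)
```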

Bootstrap replicates

Bootstrap replicate: A statistic computed from a bootstrap sample

dcst.draw_bs_reps()

Function to draw bootstrap replicates from a dataset

# Draw 10000 replicates of the mean from
# nuclear_incident_times
bs_reps = dcst.draw_bs_reps(
  nuclear_incident_times, np.mean, size=10000
)
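To make the idea behind such a function concrete, here is a minimal sketch of what drawing bootstrap replicates might look like. `draw_bs_reps_sketch` is a hypothetical stand-in written for this example, not the source of `dcst.draw_bs_reps`, and the data are synthetic.

```python
import numpy as np

def draw_bs_reps_sketch(data, func, size=1, rng=None):
    """Hypothetical stand-in for a replicate-drawing helper."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data)
    reps = np.empty(size)
    for i in range(size):
        # One bootstrap sample: same length as the data, with replacement
        bs_sample = rng.choice(data, size=len(data), replace=True)
        # One bootstrap replicate: the statistic of that sample
        reps[i] = func(bs_sample)
    return reps

rng = np.random.default_rng(seed=1)

# Synthetic data standing in for a real data set
data = rng.exponential(scale=90.0, size=60)

bs_reps = draw_bs_reps_sketch(data, np.mean, size=1000, rng=rng)
print(bs_reps[:5])
```

Passing the statistic as a function keeps the helper general: the same loop works for `np.median`, `np.std`, or any other function that maps an array to a single number.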

The bootstrap confidence interval

If we repeated the measurements over and over again, p% of the observed values of the statistic would lie within the p% confidence interval

np.percentile(bs_reps, [2.5, 97.5])
array([  73.31505848,  102.39181287])
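The full path from data to a 95% interval can be sketched end to end. This example assumes synthetic data in place of `nuclear_incident_times` and draws all replicates in one vectorized step by resampling indices; it is one possible way to do the computation, not the course's implementation.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Synthetic data standing in for a real data set
data = rng.exponential(scale=90.0, size=60)

# Each row of idx holds the indices of one bootstrap sample
# (drawn with replacement from 0 .. len(data) - 1)
idx = rng.integers(0, len(data), size=(10000, len(data)))

# 10,000 bootstrap replicates of the mean, one per row
bs_reps = data[idx].mean(axis=1)

# 95% confidence interval: the 2.5th and 97.5th percentiles
conf_int = np.percentile(bs_reps, [2.5, 97.5])
print(data.mean(), conf_int)
```

The percentiles `[2.5, 97.5]` leave 2.5% of the replicates in each tail; for a 99% interval the corresponding choice would be `[0.5, 99.5]`.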

Let's practice!
