Model Uncertainty and Sample Distributions

Introduction to Linear Modeling in Python

Jason Vestuto

Data Scientist

Population Unavailable

Histogram plot of day count versus temperature bins, Daily high temperatures in August

Sample as Population Model

2 histograms, one for sample one for population, both of normalized day count versus temperature bins, with sample bin bars being about the same height as population bin bars

Sample Statistic

Plot of histogram for one sample, count of days versus temperature bins

Bootstrap Resampling

Plot of 3 histograms, one for each of three samples, each with a center offset from the others, plotted on axes of day count versus temperature bins

Resample Distribution

Plot of histogram of means, on axes sample counts versus mean temperature bins

Bootstrap in Code

import numpy as np

# Use sample as model for population
# (august_daily_highs_for_2017, num_resamples, and resample_size are assumed to be defined already)
population_model = august_daily_highs_for_2017
# Preallocate storage for the mean of each resample
bootstrap_means = np.zeros(num_resamples)
# Simulate repeated data acquisitions by resampling the "model"
for nr in range(num_resamples):
    bootstrap_sample = np.random.choice(population_model, size=resample_size, replace=True)
    bootstrap_means[nr] = np.mean(bootstrap_sample)
# Compute the mean of the bootstrap resample distribution
estimate_temperature = np.mean(bootstrap_means)
# Compute standard deviation of the bootstrap resample distribution
estimate_uncertainty = np.std(bootstrap_means)
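
The same procedure can be tried end to end on synthetic data. This is a minimal sketch, not the course's data: `synthetic_highs` is a made-up stand-in for `august_daily_highs_for_2017`, and the seed, the distribution parameters, and the values of `num_resamples` and `resample_size` are all assumptions chosen for illustration. It also draws every resample in one call instead of looping.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # assumed seed, for reproducibility only

# Hypothetical stand-in for the sample: 31 synthetic daily highs (not real data)
synthetic_highs = rng.normal(loc=95.0, scale=5.0, size=31)

num_resamples = 1000                   # assumed number of bootstrap resamples
resample_size = len(synthetic_highs)   # resample at the original sample size

# Draw all resamples at once: one row per resample, sampled with replacement
resamples = rng.choice(
    synthetic_highs, size=(num_resamples, resample_size), replace=True
)

# One mean per resample gives the bootstrap resample distribution
bootstrap_means = resamples.mean(axis=1)

# Center and spread of that distribution: the estimate and its uncertainty
estimate_temperature = np.mean(bootstrap_means)
estimate_uncertainty = np.std(bootstrap_means)

print(f"estimate: {estimate_temperature:.2f} +/- {estimate_uncertainty:.2f}")
```

The estimate should land very close to the plain mean of the sample, while the standard deviation of the resample means is what the sample mean alone cannot provide: a measure of how much that estimate would vary across repeated data acquisitions.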

Replacement

import numpy as np

# Define the sample of notes
sample = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
# replace=True: repeats are allowed
bootstrap_sample = np.random.choice(sample, size=4, replace=True)
print(bootstrap_sample)
['C' 'C' 'F' 'G']

Replacement

# replace=False: no repeats, so size cannot exceed len(sample)
bootstrap_sample = np.random.choice(sample, size=4, replace=False)
print(bootstrap_sample)
['C' 'G' 'A' 'F']
# replace=True: the resample may be longer than the original sample
bootstrap_sample = np.random.choice(sample, size=16, replace=True)
print(bootstrap_sample)
['C' 'C' 'F' 'G' 'C' 'G' 'A' 'E' 'F' 'D' 'G' 'B' 'B' 'A' 'E' 'C']
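
The size limit for sampling without replacement can be checked directly. The sketch below reuses the seven-note `sample`; the names `no_repeats`, `all_unique`, and `oversize_failed` are illustrative only. It draws all seven notes with `replace=False`, which yields a shuffle with no repeats, and then requests 16 notes with `replace=False`, which NumPy rejects with a `ValueError`.

```python
import numpy as np

sample = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# Without replacement every draw is unique, so at most len(sample) notes can be drawn
no_repeats = np.random.choice(sample, size=len(sample), replace=False)
all_unique = len(set(no_repeats)) == len(sample)

# Requesting more notes than exist fails when replace=False
try:
    np.random.choice(sample, size=16, replace=False)
    oversize_failed = False
except ValueError:
    oversize_failed = True

print(all_unique, oversize_failed)
```

This is why the bootstrap uses `replace=True`: the resample can match or exceed the original sample size and still differ from it.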

Let's practice!
