Model Uncertainty and Sample Distributions

Introduction to Linear Modeling in Python

Jason Vestuto

Data Scientist

Population Unavailable

Histogram plot of day count versus temperature bins, Daily high temperatures in August

Sample as Population Model

2 histograms, one for sample one for population, both of normalized day count versus temperature bins, with sample bin bars being about the same height as population bin bars

Sample Statistic

Plot of histogram for one sample, count of days versus temperature bins

Bootstrap Resampling

Plot of 3 histograms, one for each of three samples, each with its center offset from the others, plotted on axes of day count versus temperature bins

Resample Distribution

Plot of histogram of means, on axes sample counts versus mean temperature bins

Bootstrap in Code

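import numpy as np

# Use the sample as a model for the population
# (august_daily_highs_for_2017 is assumed to be an array that is already loaded)
population_model = august_daily_highs_for_2017
num_resamples = 1000                    # illustrative value, not given on the slide
resample_size = len(population_model)   # assumed: same size as the original sample
bootstrap_means = np.zeros(num_resamples)
# Simulate repeated data acquisitions by resampling the "model"
for nr in range(num_resamples):
    bootstrap_sample = np.random.choice(population_model, size=resample_size, replace=True)
    bootstrap_means[nr] = np.mean(bootstrap_sample)
# Compute the mean of the bootstrap resample distribution
estimate_temperature = np.mean(bootstrap_means)
# Compute the standard deviation of the bootstrap resample distribution
estimate_uncertainty = np.std(bootstrap_means)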
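
The slide code depends on data and settings defined elsewhere in the course. As a minimal, self-contained sketch of the same procedure, the version below swaps in synthetic temperatures for the real August data. The helper name `bootstrap_mean`, the synthetic values, the seeds, and the default of 1000 resamples are all assumptions made for illustration, not part of the original slides.

```python
import numpy as np

def bootstrap_mean(sample, num_resamples=1000, seed=0):
    """Return (mean, std) of the bootstrap distribution of the sample mean."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    resample_size = len(sample)  # assumed: resamples match the sample size
    bootstrap_means = np.empty(num_resamples)
    for nr in range(num_resamples):
        # Draw with replacement, treating the sample as the population model
        bootstrap_sample = rng.choice(sample, size=resample_size, replace=True)
        bootstrap_means[nr] = np.mean(bootstrap_sample)
    return np.mean(bootstrap_means), np.std(bootstrap_means)

# Hypothetical stand-in for the real data: 31 synthetic daily highs
data_rng = np.random.default_rng(42)
synthetic_highs = data_rng.normal(loc=90.0, scale=5.0, size=31)

estimate, uncertainty = bootstrap_mean(synthetic_highs)
print(f"estimate = {estimate:.2f}, uncertainty = {uncertainty:.2f}")
```

The estimate should land very close to the plain sample mean, while the uncertainty approximates the standard error of that mean.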

Replacement

import numpy as np

# Define the sample of notes
sample = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
# replace=True: repeats are allowed
bootstrap_sample = np.random.choice(sample, size=4, replace=True)
print(bootstrap_sample)
['C' 'C' 'F' 'G']

Replacement

# replace=False: no repeats are allowed
bootstrap_sample = np.random.choice(sample, size=4, replace=False)
print(bootstrap_sample)
['C' 'G' 'A' 'F']
# replace=True: the resample may be longer than the original sample
bootstrap_sample = np.random.choice(sample, size=16, replace=True)
print(bootstrap_sample)
['C' 'C' 'F' 'G' 'C' 'G' 'A' 'E' 'F' 'D' 'G' 'B' 'B' 'A' 'E' 'C']
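
The contrast between the two settings can be checked directly. The sketch below is an illustration added here, not part of the original slides. It shows that asking for more items than the sample holds fails when `replace=False`, and that a full-length draw without replacement is just a shuffle of the sample. The variable names `oversize_allowed` and `no_repeat` are made up for this example.

```python
import numpy as np

sample = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# Without replacement, a resample cannot be larger than the sample itself
try:
    np.random.choice(sample, size=16, replace=False)
    oversize_allowed = True
except ValueError as err:
    oversize_allowed = False
    print("ValueError:", err)

# Without replacement, a full-length draw contains every note exactly once
no_repeat = np.random.choice(sample, size=len(sample), replace=False)
print(sorted(no_repeat.tolist()) == sample)
```

This is why bootstrap resampling uses `replace=True`: without repeats, a resample the same size as the sample would always have the same mean as the original.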

Let's practice!
