Introduction to Linear Modeling in Python
Jason Vestuto
Data Scientist
Question: Is our effect due to a relationship or due to random chance?
Answer: check the Null Hypothesis.
import numpy as np
# Group into early and late times
group_short = sample_distances[times < 5]
group_long = sample_distances[times > 5]
# Resample distributions
resample_short = np.random.choice(group_short, size=500, replace=True)
resample_long = np.random.choice(group_long, size=500, replace=True)
# Test Statistic
test_statistic = resample_long - resample_short
# Effect size as mean of test statistic distribution
effect_size = np.mean(test_statistic)
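The steps above assume that `sample_distances` and `times` already exist. A minimal, self-contained sketch of the same grouping and resampling follows. The synthetic data is an assumption made purely for illustration: the linear trend, noise level, sample count, and seed are not values from the course.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical data: distance grows roughly linearly with time, plus noise
times = np.random.uniform(0, 10, size=1000)
sample_distances = 2.0 * times + np.random.normal(0, 5.0, size=1000)

# Group into early and late times
group_short = sample_distances[times < 5]
group_long = sample_distances[times > 5]

# Resample each group with replacement
resample_short = np.random.choice(group_short, size=500, replace=True)
resample_long = np.random.choice(group_long, size=500, replace=True)

# Test statistic: element-wise difference between the two resamples
test_statistic = resample_long - resample_short

# Effect size as the mean of the test statistic distribution
effect_size = np.mean(test_statistic)
print(f"effect size: {effect_size:.2f}")
```

With a built-in trend like this one, the effect size should come out clearly positive, because later times correspond to longer distances.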
# Concatenate and Shuffle
shuffle_bucket = np.concatenate((group_short, group_long))
np.random.shuffle(shuffle_bucket)
# Split in the middle
slice_index = len(shuffle_bucket)//2
shuffled_half1 = shuffle_bucket[0:slice_index]
shuffled_half2 = shuffle_bucket[slice_index:]
# Resample shuffled populations
shuffled_sample1 = np.random.choice(shuffled_half1, size=500, replace=True)
shuffled_sample2 = np.random.choice(shuffled_half2, size=500, replace=True)
# Recompute effect size
shuffled_test_statistic = shuffled_sample2 - shuffled_sample1
effect_size = np.mean(shuffled_test_statistic)
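A single shuffle gives only one draw of the effect size under the null hypothesis. One common way to turn this into a check is to repeat the shuffle many times and see how often the shuffled effect size is at least as large as the observed one. The sketch below is an illustrative extension, not code from the slides: the synthetic data, the helper names `resampled_effect` and `shuffled_effect_size`, the 1000 repetitions, and the p-value step are all assumptions.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only for reproducibility

# Hypothetical data with a real time-distance relationship
times = np.random.uniform(0, 10, size=1000)
sample_distances = 2.0 * times + np.random.normal(0, 5.0, size=1000)
group_short = sample_distances[times < 5]
group_long = sample_distances[times > 5]

def resampled_effect(group_a, group_b, size=500):
    """Mean difference between resamples of group_b and group_a."""
    sample_a = np.random.choice(group_a, size=size, replace=True)
    sample_b = np.random.choice(group_b, size=size, replace=True)
    return np.mean(sample_b - sample_a)

def shuffled_effect_size(group_a, group_b, size=500):
    """Effect size after mixing both groups and splitting them in the middle."""
    shuffle_bucket = np.concatenate((group_a, group_b))  # new array, inputs untouched
    np.random.shuffle(shuffle_bucket)
    slice_index = len(shuffle_bucket) // 2
    return resampled_effect(shuffle_bucket[:slice_index],
                            shuffle_bucket[slice_index:], size)

# Observed effect size from the unshuffled groups
observed = resampled_effect(group_short, group_long)

# Distribution of effect sizes when group labels carry no information
null_effects = np.array([shuffled_effect_size(group_short, group_long)
                         for _ in range(1000)])

# Fraction of shuffled effect sizes at least as large as the observed one
p_value = np.mean(null_effects >= observed)
print(f"observed: {observed:.2f}, p-value: {p_value:.3f}")
```

If shuffling destroys the effect, the shuffled effect sizes cluster near zero and the observed value sits far outside them, which suggests the effect is unlikely to be due to random chance alone.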