Blocking and randomization

Performing Experiments in Python

Luke Hayden

Instructor

Making comparisons

Compare like with like

Only variable of interest should differ between groups

Remove sources of variation

See variation of interest

Random sampling

Simple way to assign units to treatments

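import pandas as pd
from scipy import stats

# df is assumed to be an existing DataFrame with "Sample" and "value" columns
seed = 1916

# Draw 30 rows at random from each sample; the fixed seed makes the draw reproducible
subset_A = df[df.Sample == "A"].sample(n=30, random_state=seed)
subset_B = df[df.Sample == "B"].sample(n=30, random_state=seed)

# Independent two-sample t-test on the sampled values
t_result = stats.ttest_ind(subset_A.value, subset_B.value)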
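The same idea can be used to assign untreated units to treatments at random. The following is a minimal sketch, assuming a hypothetical DataFrame `plots` of 20 units with a made-up `plot_id` column; neither name comes from the course data.

```python
import pandas as pd

seed = 1916

# Hypothetical experimental units: 20 plots with no treatment assigned yet
plots = pd.DataFrame({"plot_id": range(1, 21)})

# Shuffle the rows reproducibly, then give the first half A and the second half B
shuffled = plots.sample(frac=1, random_state=seed).reset_index(drop=True)
shuffled["treatment"] = ["A"] * 10 + ["B"] * 10

# Each treatment should receive exactly 10 plots
print(shuffled.treatment.value_counts())
```

Shuffling before splitting means the order in which plots were recorded cannot influence which treatment they receive.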

Other sources of variation

Example

Two potato varieties: Roosters & Records
Two fertilizers: A & B
Variety could be a confounder

Density plot of potato production in relation to potato variety and fertilizer type
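To illustrate how variety could act as a confounder, here is a small simulation sketch. All numbers are invented for illustration: yield depends only on variety, the fertilizers have no effect at all, and fertilizer A is applied mostly to Roosters. The helper `make_plots` and the column names are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(1916)

def make_plots(variety, fertilizer, n, mean_yield):
    # Yield depends only on variety; fertilizer has no effect by construction
    return pd.DataFrame({
        "Variety": variety,
        "Fertilizer": fertilizer,
        "Yield": rng.normal(mean_yield, 2, n),
    })

# Unbalanced design: A is mostly applied to Roosters, B mostly to Records
df = pd.concat([
    make_plots("Roosters", "A", 25, 50),
    make_plots("Records", "A", 5, 40),
    make_plots("Roosters", "B", 5, 50),
    make_plots("Records", "B", 25, 40),
], ignore_index=True)

# Comparing fertilizers while ignoring variety
yield_A = df[df.Fertilizer == "A"].Yield
yield_B = df[df.Fertilizer == "B"].Yield
result = stats.ttest_ind(yield_A, yield_B)
print(result.pvalue)
```

Under these assumptions the test reports a significant fertilizer difference even though none exists, because the variety difference is mixed into the comparison.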

Blocking

Solution to confounding
Control for confounding by balancing treatments with respect to the other variable

Example

Equal proportions of each variety treated with each fertilizer

Design

Variety	Fertilizer A	Fertilizer B
Records	10	10
Roosters	10	10

Implementing a blocked design

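import pandas as pd

# df is assumed to have a "Variety" column; seed is the one used for random sampling
seed = 1916

# Sample the same number of plots from each variety (one block per variety)
block1 = df[df.Variety == "Roosters"].sample(n=15, random_state=seed)
block2 = df[df.Variety == "Records"].sample(n=15, random_state=seed)

# Plots receiving fertilizer A: balanced across both varieties
fertAtreatment = pd.concat([block1, block2])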
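The snippet above builds only the fertilizer A group. As a rough sketch of a complete blocked assignment, the example below randomizes both fertilizers within each variety and then checks the balance with `pd.crosstab`. The `plots` DataFrame and its 20 plots per variety are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

seed = 1916
rng = np.random.default_rng(seed)

# Hypothetical field plan: 20 plots of each variety
plots = pd.DataFrame({
    "plot_id": range(40),
    "Variety": ["Records"] * 20 + ["Roosters"] * 20,
})

# Within each variety (block), shuffle an equal number of A and B labels
blocks = []
for variety, block in plots.groupby("Variety"):
    labels = rng.permutation(["A", "B"] * (len(block) // 2))
    blocks.append(block.assign(Fertilizer=labels))
plots = pd.concat(blocks)

# Cross-tabulate to confirm every variety/fertilizer cell has the same count
design = pd.crosstab(plots.Variety, plots.Fertilizer)
print(design)
```

Because randomization happens inside each block, every variety contributes equally to both fertilizer groups, matching the 10/10 layout of the design table.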

Paired samples

Special case

Control for individual variation
Increase statistical power by reducing noise

Example

Yield of 5 fields before/after change of fertilizer

2017 yield (tons/hectare)	2018 yield (tons/hectare)
60.2	63.2
12	15.6
13.8	14.8
91.8	96.7
50	53

Implementing a paired t-test

from scipy import stats

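# Yields (tons/hectare) of the same 5 fields in consecutive years
yields2017 = [60.2, 12, 13.8, 91.8, 50]
yields2018 = [63.2, 15.6, 14.8, 96.7, 53]

# Paired (related-samples) t-test: each field is compared with itself
ttest = stats.ttest_rel(yields2017, yields2018)

# Index 1 of the result is the p-value
print(ttest[1])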

p-value:

0.007894143467973484
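To see why pairing reduces noise, note that a paired t-test is equivalent to a one-sample t-test on the per-field differences. The sketch below recomputes the p-value by hand from the five yield pairs and, for contrast, runs an unpaired test on the same numbers. The variable names `before` and `after` are illustrative.

```python
import numpy as np
from scipy import stats

before = np.array([60.2, 12, 13.8, 91.8, 50])
after = np.array([63.2, 15.6, 14.8, 96.7, 53])

# Per-field differences remove the large field-to-field variation
diff = before - after
n = len(diff)

# One-sample t statistic on the differences, with n - 1 degrees of freedom
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Library results: paired test versus a test that ignores the pairing
paired = stats.ttest_rel(before, after)
unpaired = stats.ttest_ind(before, after)

print(p_manual, paired.pvalue, unpaired.pvalue)
```

The manual and paired p-values agree, while the unpaired test on the same data is far from significant: the spread between fields swamps the small year-to-year change unless each field serves as its own control.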

Let's practice!
