A/B Testing in Python
Moe Lotfy, PhD
Principal Data Science Manager


Quantitative categorization
Stable/robust against the unimportant differences
Sensitive to the important changes
Measurable within logging limitations
Non-gameable

checkout.groupby('gender')['purchased'].mean()
gender
F    0.908056
M    0.780009
Name: purchased, dtype: float64
checkout[(checkout['browser']=='chrome')|(checkout['browser']=='safari')]\
    .groupby('gender')['order_value'].mean()
gender
F    29.814161
M    30.383431
Name: order_value, dtype: float64
checkout.groupby('browser')[['order_value', 'purchased']].mean()
         order_value  purchased
browser                        
chrome     30.016625   0.839088
firefox    29.887491   0.851725
safari     30.119808   0.844337
A/B Testing in Python