Customer Analytics and A/B Testing in Python
Ryan Grossman
Data Scientist, EDO
import pandas as pd

# Demographic information for our test groups
test_demographics = pd.read_csv('test_demographics.csv')
# Results for our A/B test
# group column: 'C' for control | 'V' for variant
test_results = pd.read_csv('ab_test_results.csv')
test_results.head(n=5)
uid date purchase sku price group
0 90554036 2018-02-27 0 NaN NaN C
1 90554036 2018-02-28 0 NaN NaN C
2 90554036 2018-03-01 0 NaN NaN C
3 90554036 2018-03-02 0 NaN NaN C
4 90554036 2018-03-03 0 NaN NaN C
# Group our data by test vs. control
test_results_grpd = test_results.groupby(
    by=['group'], as_index=False)
# Count the uid entries in each group
# (count() tallies non-null rows; nunique() would give distinct users)
test_results_grpd.uid.count()
group uid
0 C 48236
1 V 49867
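The head() output above shows the same uid on several dates, so a row count and a distinct-user count can differ. A minimal sketch of the difference on made-up data follows; the toy values and the names `toy_results`, `row_counts`, and `user_counts` are hypothetical and not taken from the course dataset.

```python
import pandas as pd

# Hypothetical toy data: user 1 appears on three days, users 2 and 3 once each
toy_results = pd.DataFrame({
    'uid':   [1, 1, 1, 2, 3],
    'group': ['C', 'C', 'C', 'C', 'V'],
})

toy_grpd = toy_results.groupby('group')['uid']

# count() tallies non-null rows per group
row_counts = toy_grpd.count()

# nunique() tallies distinct users per group
user_counts = toy_grpd.nunique()

print(row_counts)
print(user_counts)
```

Which of the two is the right denominator depends on whether the unit of analysis is a paywall view (row) or a user.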
# Group our test data by demographic breakout
test_results_demo = test_results.merge(
    test_demographics, how='inner', on='uid')
test_results_grpd = test_results_demo.groupby(
    by=['country', 'gender', 'device', 'group'],
    as_index=False)
test_results_grpd.uid.count()
country gender device group uid
BRA F and C 5070
BRA F and V 4136
BRA F iOS C 3359
BRA F iOS V 2817
...
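An inner merge multiplies rows if the demographics table ever holds more than one row for a uid, which would inflate the counts in a breakout like the one above. One way to guard against that, sketched here under the assumption that each user should have exactly one demographics row, is the `validate` argument of `merge`. The toy frames below are hypothetical stand-ins for the course data.

```python
import pandas as pd

# Hypothetical stand-ins: several result rows per user, one demographics row per user
toy_results = pd.DataFrame({
    'uid': [1, 1, 2],
    'purchase': [0, 1, 0],
    'group': ['C', 'C', 'V'],
})
toy_demographics = pd.DataFrame({
    'uid': [1, 2],
    'country': ['BRA', 'BRA'],
    'gender': ['F', 'M'],
    'device': ['iOS', 'and'],
})

# many_to_one: the left side may repeat a uid, the right side may not.
# pandas raises pandas.errors.MergeError if the right side has duplicate uids.
toy_merged = toy_results.merge(
    toy_demographics, how='inner', on='uid', validate='many_to_one')

# The row count is unchanged because every uid matched exactly one demographics row
rows_before, rows_after = len(toy_results), len(toy_merged)
print(rows_before, rows_after)
```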
# Find the count of paywall viewers and purchases in each group
test_results_summary = test_results_demo.groupby(
    by=['group'], as_index=False
).agg({'purchase': ['count', 'sum']})
# Calculate our paywall conversion rate by group
test_results_summary['conv'] = (
    test_results_summary.purchase['sum'] /
    test_results_summary.purchase['count'])
test_results_summary
  group purchase            conv
           count   sum
0     C    48236  1657  0.034351
1     V    49867  2094  0.041984
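To judge whether the gap between the two conversion rates is larger than chance alone would produce, one common option is a two-sided two-proportion z-test. The sketch below is an illustration that uses only the counts from the summary table above; it is not necessarily the test this course applies, and the function name `two_proportion_ztest` is hypothetical.

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in proportions, using a pooled standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: P(|Z| >= |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts from the summary table: control (C) first, variant (V) second
z_stat, p_val = two_proportion_ztest(1657, 48236, 2094, 49867)

# Relative lift of the variant's conversion rate over control
lift = (2094 / 49867) / (1657 / 48236) - 1

print(z_stat, p_val, lift)
```

A pooled standard error is used because the null hypothesis assumes both groups share one conversion rate. Note that this treats every row as an independent trial, which is only an approximation when the same user appears on several days.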
than the one we observed
Low p-values