Non-parametric statistical tests

A/B Testing in Python

Moe Lotfy, PhD

Principal Data Science Manager

Parametric test assumptions

Random sampling
- Data is randomly sampled from the population.
- Investigate the data collection/sampling process.
Independence
- Each observation/data point is independent.
- Not accounting for dependencies inflates error rates.
Normality
- Data is normally distributed, or
- The sample size is large "enough" for the sampling distribution to be approximately normal:
  - Two-sample t-test: n >= 30 in each group.
  - Two-sample proportions test: >= 10 successes and >= 10 failures in each group.
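
The sample-size rules of thumb above can be checked in a few lines. This is a minimal sketch: the helper names `t_test_sample_size_ok` and `proportions_sample_size_ok`, the simulated data, and the counts are all hypothetical and not part of the course material. A Shapiro-Wilk test from SciPy is included as one possible normality check.

```python
import numpy as np
from scipy import stats

def t_test_sample_size_ok(group_a, group_b, min_n=30):
    """Rule of thumb: at least min_n observations in each group."""
    return len(group_a) >= min_n and len(group_b) >= min_n

def proportions_sample_size_ok(successes, totals, min_count=10):
    """Rule of thumb: at least min_count successes and failures per group."""
    return all(s >= min_count and (n - s) >= min_count
               for s, n in zip(successes, totals))

# Hypothetical skewed samples of 25 observations per group
rng = np.random.default_rng(0)
small_a = rng.exponential(scale=40, size=25)
small_b = rng.exponential(scale=40, size=25)

print(t_test_sample_size_ok(small_a, small_b))           # False: 25 < 30
print(proportions_sample_size_ok([8, 40], [100, 100]))   # False: only 8 successes

# Shapiro-Wilk test: a small p-value suggests the data is not normal
shapiro_stat, shapiro_p = stats.shapiro(small_a)
print(shapiro_stat, shapiro_p)
```

When either check fails, a non-parametric test is usually the safer choice.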

Mann-Whitney U test

Non-parametric test for statistical significance
Determines if two independent samples have the same parent distribution
Rank sum test
Unpaired data
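
To see why this is called a rank sum test, the sketch below pools two small samples, ranks them, and derives U from the sum of one group's ranks. The values are made up for illustration. The result is compared with `scipy.stats.mannwhitneyu`, which in recent SciPy versions reports U for the first sample (older versions may report the smaller of the two U values).

```python
import numpy as np
from scipy import stats

# Hypothetical time-on-page samples (seconds) for two unpaired groups
x = np.array([38.0, 52.5, 47.1, 60.3, 44.8])
y = np.array([35.2, 41.0, 39.9, 45.5])

# Rank all observations together (ties would get average ranks)
ranks = stats.rankdata(np.concatenate([x, y]))

# Sum the ranks belonging to x, then subtract the smallest possible rank sum
rank_sum_x = ranks[:len(x)].sum()
u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
print(rank_sum_x, u_x)  # 31.0 16.0

# Compare with SciPy's implementation
result = stats.mannwhitneyu(x, y, alternative='two-sided')
print(result.statistic, result.pvalue)
```

U counts the (x, y) pairs in which the x value is larger, so it depends only on the ordering of the data, not on its distribution.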

Mann-Whitney U test in Python

# Calculate the mean and count of time on page by variant
# (a list, unlike a set, keeps the output columns in a fixed order)
print(checkout.groupby('checkout_page')['time_on_page'].agg(['mean', 'count']))

                    mean  count
checkout_page                  
A              44.668527   3000
B              42.723772   3000
C              42.223772   3000

import numpy as np

# Set random seed for repeatability (pandas .sample() draws from NumPy's global state)
np.random.seed(40)
# Take a random sample of size 25 from each variant
ToP_samp_A = checkout[checkout['checkout_page'] == 'A'].sample(25)['time_on_page']
ToP_samp_B = checkout[checkout['checkout_page'] == 'B'].sample(25)['time_on_page']

Mann-Whitney U test in Python

import pingouin

# Run a Mann-Whitney U test
mwu_test = pingouin.mwu(x=ToP_samp_A,
                        y=ToP_samp_B,
                        alternative='two-sided')
# Print the test results
print(mwu_test)

     U-val alternative     p-val     RBC    CLES
MWU  441.0   two-sided  0.013007 -0.4112  0.7056
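
The effect-size columns can be related to U with simple arithmetic. The sketch below assumes CLES = U / (n_A * n_B) and RBC = 1 - 2 * CLES. These formulas reproduce the values printed above, but the sign convention for RBC may differ between pingouin versions, so treat them as an assumption rather than the library's definition.

```python
# Values taken from the printed test results and the sample sizes used
u_val = 441.0
n_a, n_b = 25, 25

# Common language effect size: share of (A, B) pairs where the A value is larger
cles = u_val / (n_a * n_b)

# Rank-biserial correlation under the assumed sign convention
rbc = 1 - 2 * cles

print(round(cles, 4), round(rbc, 4))  # 0.7056 -0.4112
```

Read this way, a time on page drawn from sample A exceeds one drawn from sample B in roughly 71% of pairs, and the p-value of about 0.013 is below the conventional 0.05 threshold.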

Chi-square test of independence

Free from parametric test assumptions
Tests whether two or more categorical variables are independent
- Null hypothesis: The variables are independent.
- Alternative hypothesis: The variables are not independent.
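
Under the null hypothesis, the expected count in each cell is (row total * column total) / grand total, and the statistic sums (observed - expected)^2 / expected over all cells. The sketch below works this out by hand on a hypothetical 2x2 table and compares it with `scipy.stats.chi2_contingency`. The counts are invented for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 table: rows are groups, columns are [signup, no signup]
observed = np.array([[30, 70],
                     [20, 80]])

# Expected counts if group and outcome were independent
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Chi-square statistic, degrees of freedom, and p-value
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = stats.chi2.sf(chi2_manual, dof)
print(round(chi2_manual, 4), dof)  # 2.6667 1

# SciPy gives the same result when the continuity correction is disabled
chi2, p, dof_scipy, expected_scipy = stats.chi2_contingency(observed, correction=False)
print(chi2, p)
```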

Chi-square test in Python

Homepage signup rates A/B test

Null: There is no significant difference in signup rates between landing page designs C and D

Alternative: There is a significant difference in signup rates between landing page designs C and D

# Calculate the number of users in groups C and D
n_C = homepage[homepage['landing_page'] == 'C']['user_id'].nunique()
n_D = homepage[homepage['landing_page'] == 'D']['user_id'].nunique()

# Compute unique signups in each group
signup_C = homepage[homepage['landing_page'] == 'C'].groupby('user_id')['signup'].max().sum()
no_signup_C = n_C - signup_C
signup_D = homepage[homepage['landing_page'] == 'D'].groupby('user_id')['signup'].max().sum()
no_signup_D = n_D - signup_D

Chi-square test in Python

# Create the signups table
table = [[signup_C, no_signup_C], [signup_D, no_signup_D]]
print('Group C signup rate:',round(signup_C/n_C,3))
print('Group D signup rate:',round(signup_D/n_D,3))

from scipy import stats

# Calculate p-value (index 1 of the result) without Yates' continuity correction
print('p-value=',stats.chi2_contingency(table,correction=False)[1])

Group C signup rate: 0.064
Group D signup rate: 0.048
p-value= 0.009165

Let's practice!
