Non-parametric ANOVA and unpaired t-tests

Hypothesis Testing in Python

James Chapman

Curriculum Manager, DataCamp

Wilcoxon-Mann-Whitney test

  • Also known as the Mann-Whitney U test
  • A t-test on the ranks of the numeric input
  • Works on unpaired data
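
To make the "ranks" idea concrete, here is a minimal sketch of how the U statistic can be computed from rank sums in plain Python. The helper names (average_ranks, mann_whitney_u) and the toy numbers are invented for illustration. This is not pingouin's implementation, and it computes only the statistic, not the p-value.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, U for y) from the rank sum of x in the pooled sample."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Toy, made-up samples (not from the Stack Overflow data).
x = [3, 5, 8]
y = [1, 2, 4, 6]
u_x, u_y = mann_whitney_u(x, y)
print(u_x, u_y)  # 9.0 3.0
```

U for x counts the (x, y) pairs in which the x value is larger, with ties counted as one half, which is why the test needs only the ordering of the data and not the values themselves.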

Wilcoxon-Mann-Whitney test setup

age_vs_comp = stack_overflow[['converted_comp', 'age_first_code_cut']]
age_vs_comp_wide = age_vs_comp.pivot(columns='age_first_code_cut',
                                     values='converted_comp')
age_first_code_cut      adult     child
0                     77556.0       NaN
1                         NaN   74970.0
2                         NaN  594539.0
...                       ...       ...
2258                      NaN   97284.0
2259                      NaN   72000.0
2260                      NaN  180000.0

[2261 rows x 2 columns]
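
The pivot step can be checked on a small example. The sketch below uses a four-row toy DataFrame (made-up values, same column names as above) to show that pivoting without an index gives one column per group, with exactly one non-missing value per row. It also shows a boolean-mask selection that should yield the same two samples without the NaN padding.

```python
import pandas as pd

# Toy data with the same column names as the slides; the values are invented.
toy = pd.DataFrame({
    'converted_comp': [100.0, 200.0, 300.0, 400.0],
    'age_first_code_cut': ['adult', 'child', 'child', 'adult'],
})

# Long-to-wide: one column per group, NaN where a row belongs to the other group.
toy_wide = toy.pivot(columns='age_first_code_cut', values='converted_comp')
print(toy_wide)

# Each group's sample is the non-missing part of its column.
child = toy_wide['child'].dropna()
adult = toy_wide['adult'].dropna()

# Alternative without pivoting: select each group with a boolean mask.
child_direct = toy.loc[toy['age_first_code_cut'] == 'child', 'converted_comp']
adult_direct = toy.loc[toy['age_first_code_cut'] == 'adult', 'converted_comp']
```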

Wilcoxon-Mann-Whitney test

import pingouin

alpha = 0.01
# Alternative hypothesis: 'child' compensation tends to be greater than 'adult'
pingouin.mwu(x=age_vs_comp_wide['child'],
             y=age_vs_comp_wide['adult'],
             alternative='greater')
        U-val alternative         p-val       RBC      CLES
MWU  744365.5     greater  1.902723e-19 -0.222516  0.611258
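
The slide sets alpha but does not spell out the decision. The sketch below applies the usual rule (reject the null hypothesis when the p-value is below alpha) to the numbers printed above. It also checks that, in this output, RBC equals 1 - 2 * CLES. That relationship is inferred from the printed values only, and the sign convention may differ between pingouin versions.

```python
# Values copied from the pingouin.mwu output shown above.
alpha = 0.01
p_val = 1.902723e-19
cles = 0.611258
rbc = -0.222516

# Decision rule: reject the null hypothesis when p-val < alpha.
reject_null = p_val < alpha
print(reject_null)  # True

# Observed relationship in this particular output (an inference, not a documented rule).
rbc_from_cles = 1 - 2 * cles
print(round(rbc_from_cles, 6))  # -0.222516
```

Since the p-value is far below 0.01, the null hypothesis is rejected in favor of the alternative that compensation tends to be greater in the child group.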

Kruskal-Wallis test

The Kruskal-Wallis test is to the Wilcoxon-Mann-Whitney test as ANOVA is to the t-test: it extends the rank-based comparison from two groups to three or more

alpha = 0.01
# Compare converted_comp across the job_sat groups
pingouin.kruskal(data=stack_overflow,
                 dv='converted_comp',
                 between='job_sat')
          Source  ddof1          H         p-unc
Kruskal  job_sat      4  72.814939  5.772915e-15
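
As with the two-group test, the H statistic depends only on ranks. The sketch below computes H in plain Python for three made-up groups. The function name kruskal_h is hypothetical, the code assumes all values are distinct, and it omits the tie correction that library implementations typically apply, so treat it as an illustration rather than a replacement for pingouin.kruskal.

```python
def kruskal_h(groups):
    """Return (H, degrees of freedom) for groups of distinct numeric values."""
    pooled = sorted(v for g in groups for v in g)
    # Assumption: no ties, so each value maps to a unique 1-based rank.
    rank_of = {v: i + 1 for i, v in enumerate(pooled)}
    n_total = len(pooled)
    # Sum over groups of (rank sum squared) / (group size).
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    return h, len(groups) - 1


# Toy, made-up groups that do not overlap at all.
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, ddof = kruskal_h(toy_groups)
print(round(h, 4), ddof)  # 7.2 2
```

The degrees of freedom are the number of groups minus one, which matches ddof1 = 4 in the output above for the five job_sat categories.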

Let's practice!
