Non-parametric tests

Foundations of Inference in Python

Paul Savala

Assistant Professor of Mathematics

Non-parametric tests

  • May still have assumptions (e.g. independent observations)
  • Don't require normality
  • Apply to a broad range of data
  • Sometimes less powerful than parametric tests
  • Useful for ranked (ordinal) data (e.g. star ratings)

Four histograms, each with differently shaped data.

Parametric tests

  • Independent sample t-test
  • ANOVA
  • Paired sample t-test
  • Pearson's R

Non-parametric tests

  • Wilcoxon-Mann-Whitney U test
  • Kruskal-Wallis test
  • Mood's median test
  • Kendall's tau
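
To make the contrast between the two lists concrete, here is a minimal sketch that runs an independent sample t-test and its non-parametric counterpart, the Wilcoxon-Mann-Whitney U test, on the same data. The two samples (`group_a`, `group_b`) are hypothetical, generated from a skewed distribution purely for illustration; they are not from the course data.

```python
import numpy as np
from scipy import stats

# Hypothetical, clearly non-normal (right-skewed) samples
rng = np.random.default_rng(0)
group_a = rng.exponential(scale=1.0, size=50)
group_b = rng.exponential(scale=1.0, size=50) + 2.0  # shifted upward by 2

# Parametric: independent sample t-test (assumes normality)
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric: Wilcoxon-Mann-Whitney U test (no normality assumption)
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')

print(t_p < 0.05, u_p < 0.05)
```

Both tests can be run on any two samples, but only the U test's p-value is trustworthy when the normality assumption is doubtful.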

Mood's median test

Tests whether two or more samples come from populations with the same median

Five rows of a DataFrame with different universities, and two columns showing university scores.

Scores are likely not normally distributed

from scipy import stats
s, p_value, m, table = stats.median_test(df['thew_score'], df['arw_score'])

Mood's median test

print(p_value < 0.05)
True
  • Conclusion: The median scores differ
  • t-tests assume normality (Mood's median test does not)
  • Valid inference only when data and assumptions match
  • Use the right tool for the job!
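
As a rough illustration of what `stats.median_test` returns, the sketch below uses two small hypothetical score arrays (`scores_a`, `scores_b`) instead of the university DataFrame, which is not reproduced here. The test pools the samples, finds the grand median, and counts how many values in each sample fall above or below it.

```python
import numpy as np
from scipy import stats

# Hypothetical scores: every value in scores_a is below every value in scores_b
scores_a = np.arange(1, 21)   # 1, 2, ..., 20
scores_b = np.arange(21, 41)  # 21, 22, ..., 40

# Returns the test statistic, p-value, grand median, and contingency table
stat, p_value, grand_median, table = stats.median_test(scores_a, scores_b)

print(grand_median)    # median of the pooled data
print(table)           # row 0: counts above the grand median, row 1: counts at or below
print(p_value < 0.05)  # True here, since the two samples do not overlap
```

Because the samples do not overlap at all, every value of one sample sits above the grand median and every value of the other sits below it, so the test rejects equal medians.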

Kendall's tau

  • Values between -1 and 1
  • $\tau = -1$: Complete disagreement
  • $\tau = 0$: No correlation
  • $\tau = 1$: Complete agreement

Five rows of a DataFrame with different universities, showing university rankings for each university.

tau, p_value = stats.kendalltau(
    df['thew_rank'], 
    df['arw_rank'])

print(round(tau, 3), p_value < 0.05)
0.651 True
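
The endpoints of the $\tau$ scale can be checked on a toy example. The rankings below are hypothetical (five items ranked by two imaginary judges), chosen so the result can be verified by hand; with no ties, $\tau$ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.

```python
from scipy import stats

# Hypothetical rankings of five items
rank_a = [1, 2, 3, 4, 5]
rank_same = [1, 2, 3, 4, 5]      # identical ordering
rank_reversed = [5, 4, 3, 2, 1]  # exactly opposite ordering
rank_swapped = [1, 3, 2, 4, 5]   # one adjacent pair swapped

tau_same, _ = stats.kendalltau(rank_a, rank_same)
tau_reversed, _ = stats.kendalltau(rank_a, rank_reversed)
tau_swapped, _ = stats.kendalltau(rank_a, rank_swapped)

# Complete agreement, complete disagreement, and 9 concordant vs 1 discordant pair
print(tau_same, tau_reversed, tau_swapped)
```

In the swapped case, 9 of the 10 pairs agree and 1 disagrees, so $\tau = (9 - 1) / 10 = 0.8$.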

Let's practice!
