Evaluating distribution choices

Monte Carlo Simulations in Python

Izzy Weber

Curriculum Manager, DataCamp

Choosing variable probability distributions

  1. Gain an intuitive understanding of data and available probability distributions
  2. Use Maximum Likelihood Estimation (MLE) to compare candidate distributions
  3. Use the Kolmogorov-Smirnov (KS) test to evaluate the goodness of fit of candidate distributions
    • Quantifies the largest distance between the empirical distribution function of the data and the cumulative distribution function of the candidate distribution
    • Calculate it with scipy.stats.kstest()
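
To make the distance in step 3 concrete, here is a minimal sketch that computes the KS statistic by hand and compares it with scipy.stats.kstest(). The sample is synthetic (drawn from a normal distribution with an arbitrary mean, spread, and seed), not the course data, and the variable names are illustrative only.

```python
import numpy as np
import scipy.stats as st

# Synthetic sample (an assumption for illustration, not the course data)
rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=13, size=400)

# Fit the candidate distribution by maximum likelihood
loc, scale = st.norm.fit(sample)

# Evaluate the fitted CDF at the sorted observations
x = np.sort(sample)
n = len(x)
cdf = st.norm.cdf(x, loc=loc, scale=scale)

# Largest gap between the empirical step function and the fitted CDF,
# checked just after and just before each step
d_plus = np.max(np.arange(1, n + 1) / n - cdf)
d_minus = np.max(cdf - np.arange(0, n) / n)
manual_stat = max(d_plus, d_minus)

# The same quantity from scipy
result = st.kstest(sample, "norm", args=(loc, scale))
print(manual_stat, result.statistic)
```

A smaller statistic means the fitted distribution tracks the data more closely.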

Evaluating choice of distribution: age

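import scipy.stats as st

# dia is assumed to be the DataFrame loaded earlier in the course
results = []
list_of_dists = ["laplace", "norm", "expon"]

for i in list_of_dists:
    dist = getattr(st, i)                           # look up the distribution by name
    param = dist.fit(dia["age"])                    # MLE estimate of its parameters
    result = st.kstest(dia["age"], i, args=param)   # KS test against the fitted distribution
    results.append(result)
    print(result)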

Results for Laplace, normal, and exponential distributions in that order:

KstestResult(statistic=0.09511179937112832, pvalue=0.0006239579389182981)
KstestResult(statistic=0.0615913626181368, pvalue=0.06703225234359811)
KstestResult(statistic=0.2536037941921312, pvalue=1.5202547969084796e-25)
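
One way to read these three results is to rank the candidates by KS statistic and check which ones are not rejected at a chosen significance level. The sketch below does that with the values printed above; the dictionary name and the 0.05 threshold are assumptions made for illustration.

```python
# (statistic, pvalue) pairs copied from the output above
ks_age = {
    "laplace": (0.09511179937112832, 0.0006239579389182981),
    "norm": (0.0615913626181368, 0.06703225234359811),
    "expon": (0.2536037941921312, 1.5202547969084796e-25),
}

# Smallest statistic = closest fit to the empirical distribution
best = min(ks_age, key=lambda name: ks_age[name][0])

# Candidates whose fit is not rejected at the assumed 5% level
alpha = 0.05
not_rejected = [name for name, (_, p) in ks_age.items() if p >= alpha]

print(best)          # norm
print(not_rejected)  # ['norm']
```

For age, the normal distribution has both the smallest statistic and the only p-value above 0.05.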

Evaluating choice of distribution: tc (total serum cholesterol)

import scipy.stats as st

results = []
list_of_dists = ["laplace", "norm", "expon"]

for i in list_of_dists:
    dist = getattr(st, i)
    param = dist.fit(dia["tc"])
    result = st.kstest(dia["tc"], i, args=param)
    results.append(result)
    print(result)

Results for Laplace, normal, and exponential distributions in that order:

KstestResult(statistic=0.06435779928393615, pvalue=0.04915329841106708)
KstestResult(statistic=0.051165295747227724, pvalue=0.19085587687385897)
KstestResult(statistic=0.3318461436889846, pvalue=7.018486943525e-44)
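
Since the same loop is run for each variable, it can be wrapped in a small helper that also reports the log-likelihood of each MLE fit, tying steps 2 and 3 together. This is a sketch only: compare_fits is a hypothetical name, and the data is synthetic because the course DataFrame is not reproduced here.

```python
import numpy as np
import scipy.stats as st

def compare_fits(data, dist_names):
    """Fit each named scipy.stats distribution by MLE and score the fit."""
    rows = []
    for name in dist_names:
        dist = getattr(st, name)
        params = dist.fit(data)                       # MLE parameters
        loglik = np.sum(dist.logpdf(data, *params))   # higher is better
        ks = st.kstest(data, name, args=params)       # lower statistic is better
        rows.append({"dist": name, "loglik": loglik,
                     "statistic": ks.statistic, "pvalue": ks.pvalue})
    # Best KS fit first
    return sorted(rows, key=lambda row: row["statistic"])

# Synthetic stand-in for a column such as dia["tc"] (assumed values)
rng = np.random.default_rng(42)
synthetic = rng.normal(loc=190, scale=35, size=400)

table = compare_fits(synthetic, ["laplace", "norm", "expon"])
for row in table:
    print(f"{row['dist']:8s} loglik={row['loglik']:.1f} "
          f"D={row['statistic']:.4f} p={row['pvalue']:.3g}")
```

Because the parameters are estimated from the same data being tested, the p-values reported by kstest tend to be optimistic, so the statistic is often the safer basis for ranking candidates.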

Let's practice!
