Assumptions and normal distributions

Performing Experiments in Python

Luke Hayden

Instructor

Summary stats

Mean

  • Sum divided by count

Median

  • Half of the values fall above the median and half fall below it

Mode

  • Value that occurs most often

Standard deviation

  • Measure of variability
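
The four definitions above can be checked on a small made-up sample. This is a minimal sketch using only Python's standard `statistics` module; the `values` list is hypothetical and not taken from the course data.

```python
import statistics

# Hypothetical sample, chosen only for illustration
values = [2, 3, 3, 5, 7, 10]

mean = sum(values) / len(values)      # sum divided by count
median = statistics.median(values)    # average of the two middle values here
mode = statistics.mode(values)        # value that occurs most often
sd = statistics.stdev(values)         # sample standard deviation (n - 1 denominator)

print(mean, median, mode, round(sd, 2))
```

Note that `statistics.stdev` computes the sample standard deviation; `statistics.pstdev` would give the population version.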

Normal distribution

Density plot of a perfect normal distribution
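
A few properties of a perfect normal distribution can be verified numerically. The sketch below uses `statistics.NormalDist` from the standard library (Python 3.8+) with an assumed mean of 0 and standard deviation of 1; it is an illustration, not the code used to draw the plot.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1 (assumed for illustration)
nd = NormalDist(mu=0, sigma=1)

# The density is symmetric about the mean and peaks there
peak = nd.pdf(0)
left, right = nd.pdf(-1.5), nd.pdf(1.5)

# The median (50th percentile) coincides with the mean
median = nd.inv_cdf(0.5)

# Share of values within one standard deviation of the mean (about 68%)
within_one_sd = nd.cdf(1) - nd.cdf(-1)

print(peak, left, right, median, within_one_sd)
```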

Sample distribution

import plotnine as p9

print(p9.ggplot(countrydata) + p9.aes(x='Life_exp') + p9.geom_density(alpha=0.5))

Density plot of life expectancy per country

Accessing summary stats

Mean

print(countrydata.Life_exp.mean())
73.68201058201058

Median

print(countrydata.Life_exp.median())
76.0

Mode

print(countrydata.Life_exp.mode()[0])
78.4
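
In the output above, the mean (73.68) is below the median (76.0), which is below the mode (78.4). That ordering is typical of a left-skewed distribution, where a tail of low values pulls the mean down. A minimal sketch of the same check, using a hypothetical toy Series rather than the course's `countrydata`:

```python
import pandas as pd

# Hypothetical values with a tail of low observations (not the course data)
life_exp = pd.Series([52, 61, 70, 74, 76, 78, 78, 80, 81])

mean = life_exp.mean()
median = life_exp.median()
mode = life_exp.mode()[0]   # mode() returns a Series because ties are possible
skew = life_exp.skew()      # negative for a left-skewed sample

print(mean, median, mode, skew)
```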

Normal distribution

Density plot of a perfect normal distribution

Q-Q (quantile-quantile) plot

Normal probability plot

Use

  • Does the distribution of the data fit the expected (normal) distribution?
  • Graphical method to assess normality

Basis

  • Compare quantiles of the data with the theoretical quantiles predicted under the expected distribution

Quantile-quantile plot for a perfect normal distribution
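
The comparison of quantiles can be sketched by hand. The example below is a minimal illustration using only the standard library; the `sample` list is hypothetical, and the plotting positions `(i - 0.5) / n` are one common convention assumed here (libraries may use a slightly different formula).

```python
from statistics import NormalDist

# Hypothetical sample (not the course data)
sample = [4.1, 5.0, 5.6, 6.2, 4.7, 5.3, 5.9, 4.4]

ordered = sorted(sample)
n = len(ordered)

# Assumed plotting positions (i - 0.5) / n; other conventions exist
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Quantiles a standard normal distribution predicts at those positions
theoretical = [NormalDist().inv_cdf(p) for p in probs]

# Each (theoretical, observed) pair is one point on the Q-Q plot
for t, v in zip(theoretical, ordered):
    print(f"{t:6.2f}  {v:4.1f}")
```

If the data are close to normal, these points fall near a straight line.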

Creating a Q-Q plot

import pandas as pd
import plotnine as p9
from scipy import stats

# probplot returns ((theoretical quantiles, ordered values), (slope, intercept, r))
tq = stats.probplot(countrydata.Life_exp, dist="norm")

# Use the ordered values returned by probplot so both columns stay aligned
df = pd.DataFrame(data={'Theoretical Quantiles': tq[0][0],
                        'Ordered Values': tq[0][1]})

print(p9.ggplot(df) + p9.aes('Theoretical Quantiles', 'Ordered Values') + p9.geom_point())
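
Besides the quantiles, `stats.probplot` (with its default `fit=True`) fits a least-squares line through the points and returns its slope, intercept, and correlation coefficient `r`. An `r` close to 1 means the points lie near a straight line. The sketch below compares two synthetic samples; the sample sizes, parameters, and seed are assumptions chosen for illustration, not part of the course data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic samples (assumptions for illustration, not the course data)
normal_sample = rng.normal(loc=70, scale=8, size=500)
skewed_sample = rng.exponential(scale=8, size=500)

# With the default fit=True, probplot also returns (slope, intercept, r)
_, (_, _, r_normal) = stats.probplot(normal_sample, dist="norm")
_, (_, _, r_skewed) = stats.probplot(skewed_sample, dist="norm")

print(r_normal, r_skewed)
```

The normal sample should give the higher `r`, although `r` alone is only a rough guide and is best read alongside the plot itself.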

Q-Q plot for sample

Distribution

Density plot of life expectancy per country

Q-Q plot

Q-Q plot of life expectancy per country

Let's practice!
