Standard errors and the Central Limit Theorem

Sampling in Python

James Chapman

Curriculum Manager, DataCamp

Sampling distribution of mean cup points

[Figure: four histograms of the approximate sampling distribution of mean cup points, for sample sizes 5, 20, 80, and 320.]

Averages of independent samples have approximately normal distributions.

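As the sample size increases, the distribution of the averages gets closer to being normally distributed, and the sampling distribution gets narrower.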
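A minimal sketch of this behaviour, assuming a synthetic right-skewed population in place of the coffee data. The population, the seed, and the helper names `sample_means` and `skewness` are illustrative and not part of the course material.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for a real population: strongly right-skewed values.
population = rng.exponential(scale=2.0, size=50_000)


def sample_means(population, sample_size, n_replicates=2000):
    """Approximate the sampling distribution of the mean by repeated sampling."""
    return np.array([
        rng.choice(population, size=sample_size, replace=False).mean()
        for _ in range(n_replicates)
    ])


def skewness(values):
    """Third standardized moment; 0 for a perfectly symmetric distribution."""
    centered = values - values.mean()
    return (centered ** 3).mean() / values.std() ** 3


spreads = {}
skews = {}
for n in [5, 20, 80, 320]:
    means = sample_means(population, n)
    spreads[n] = means.std()   # width of the sampling distribution
    skews[n] = skewness(means)  # how far the shape is from symmetric
    print(f"n={n:>3}  std of sample means={spreads[n]:.3f}  skewness={skews[n]:.3f}")
```

Even though the population here is far from normal, the printed spread shrinks as the sample size grows, and the skewness should drift toward zero, which is what the histograms illustrate.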

coffee_ratings['total_cup_points'].mean()

82.15120328849028

Use np.mean() on each approximate sampling distribution: the result stays close to the population mean at every sample size.
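One way this could look in practice, sketched on made-up data. The `coffee_like` DataFrame below is a synthetic stand-in (normally distributed scores with an arbitrary mean and spread), not the real `coffee_ratings` dataset, and the replicate count of 1000 is an assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in for coffee_ratings: synthetic scores, not the real data.
coffee_like = pd.DataFrame(
    {"total_cup_points": rng.normal(loc=82.0, scale=2.7, size=1300)}
)
population_mean = coffee_like["total_cup_points"].mean()

sampling_means = {}
for sample_size in [5, 20, 80, 320]:
    # 1000 replicates of "draw a simple random sample, record its mean".
    mean_cup_points = [
        coffee_like.sample(n=sample_size, random_state=seed)["total_cup_points"].mean()
        for seed in range(1000)
    ]
    # Mean of the approximate sampling distribution for this sample size.
    sampling_means[sample_size] = np.mean(mean_cup_points)
    print(sample_size, sampling_means[sample_size])

print("population mean:", population_mean)
```

The four printed means should all land near the population mean, regardless of sample size; only the spread around that mean changes.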

coffee_ratings['total_cup_points'].std(ddof=0)

2.685858187306438
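The `ddof=0` argument matters here because pandas defaults to `ddof=1` (the sample standard deviation, dividing by N - 1), while the population standard deviation divides by N. A small illustration on a made-up series, chosen only so the arithmetic is easy to check by hand:

```python
import numpy as np
import pandas as pd

# Small made-up series, used only to make the arithmetic easy to follow.
values = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

population_std = values.std(ddof=0)    # divides by N
sample_std = values.std()              # pandas default is ddof=1, divides by N - 1
numpy_std = np.std(values.to_numpy())  # NumPy default is ddof=0

print(population_std, sample_std, numpy_std)
```

The squared deviations from the mean of 5 sum to 32, so the population value is sqrt(32 / 8) = 2.0, while the sample value is sqrt(32 / 7), which is slightly larger.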

Sample size	Std dev of sample means	Population std dev / sqrt(sample size)	Estimate
5	`1.1886358227738543`	`2.685858187306438 / sqrt(5)`	`1.201`
20	`0.5940321141669805`	`2.685858187306438 / sqrt(20)`	`0.601`
80	`0.2934024263916487`	`2.685858187306438 / sqrt(80)`	`0.300`
320	`0.13095083089190876`	`2.685858187306438 / sqrt(320)`	`0.150`
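The standard deviation of a sampling distribution is called the standard error. The arithmetic in the table can be reproduced with a short sketch that uses only the population standard deviation reported above; the variable names are illustrative.

```python
import math

population_std = 2.685858187306438  # value reported above for the full dataset

# Estimated standard error of the mean: population std dev / sqrt(sample size).
standard_errors = {
    n: population_std / math.sqrt(n) for n in [5, 20, 80, 320]
}
for n, se in standard_errors.items():
    print(f"n={n:>3}  population std / sqrt(n) = {se:.3f}")
```

Each fourfold increase in sample size halves the estimate, because the sample size enters through its square root.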
