Standard errors and the Central Limit Theorem

Sampling in Python

James Chapman

Curriculum Manager, DataCamp

Sampling distribution of mean cup points

[Figure: four histograms of the approximate sampling distribution of mean cup points, one panel each for sample sizes 5, 20, 80, and 320.]

Consequences of the central limit theorem

 

Averages of independent samples have approximately normal distributions.

 

As the sample size increases,

  • The distribution of the averages gets closer to being normally distributed

  • The sampling distribution gets narrower
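
Both consequences can be checked with a small simulation. The sketch below is only illustrative: it uses a synthetic right-skewed population (an assumption, not the coffee_ratings data), and the helper names approx_sampling_distribution and skewness are hypothetical. It draws 1000 samples at each sample size and reports the standard deviation and skewness of the sample means.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in population (not the coffee_ratings data):
# a right-skewed exponential distribution, chosen to be clearly non-normal.
population = rng.exponential(scale=2.0, size=10_000)


def approx_sampling_distribution(pop, sample_size, n_replicates=1000):
    """Means of n_replicates simple random samples of the given size."""
    return np.array([
        rng.choice(pop, size=sample_size, replace=False).mean()
        for _ in range(n_replicates)
    ])


def skewness(x):
    """Skewness of x; 0 for a perfectly symmetric distribution."""
    z = (x - x.mean()) / x.std(ddof=0)
    return float(np.mean(z ** 3))


widths, skews = {}, {}
for n in [5, 20, 80, 320]:
    sample_means = approx_sampling_distribution(population, n)
    widths[n] = float(np.std(sample_means, ddof=1))
    skews[n] = skewness(sample_means)
    print(f"n={n:>3}  std dev={widths[n]:.3f}  skewness={skews[n]:.3f}")
```

Under these assumptions, the standard deviation should shrink at each step, and the skewness at the largest sample size should be much closer to zero than at the smallest.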

Population & sampling distribution means

coffee_ratings['total_cup_points'].mean()
82.15120328849028

Use np.mean() on each approximate sampling distribution:

Sample size    Mean of sample means
5              82.18420719999999
20             82.1558634
80             82.14510154999999
320            82.154017925
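
A minimal sketch of how such a table could be produced. The population here is synthetic (normally distributed values standing in for coffee_ratings['total_cup_points'], an assumption), and the variable names are illustrative rather than taken from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for coffee_ratings['total_cup_points'] (synthetic values).
population = rng.normal(loc=82.0, scale=2.7, size=10_000)
population_mean = float(population.mean())

mean_of_sample_means = {}
for n in [5, 20, 80, 320]:
    # Approximate sampling distribution: 1000 sample means at this sample size.
    sampling_distribution = [
        rng.choice(population, size=n, replace=False).mean()
        for _ in range(1000)
    ]
    mean_of_sample_means[n] = float(np.mean(sampling_distribution))

print(population_mean)
print(mean_of_sample_means)
```

Whatever the sample size, the mean of the sample means should land close to the population mean.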

Population & sampling distribution standard deviations

coffee_ratings['total_cup_points'].std(ddof=0)
2.685858187306438

 

  • Specify ddof=0 when calling .std() on populations (pandas defaults to ddof=1)
  • Specify ddof=1 when calling np.std() on samples or sampling distributions (NumPy defaults to ddof=0)
Sample size    Std dev of sample means
5              1.1886358227738543
20             0.5940321141669805
80             0.2934024263916487
320            0.13095083089190876
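
To see what ddof changes, here is a small sketch on made-up numbers (the values array is purely illustrative). With ddof=0 the sum of squared deviations is divided by N, and with ddof=1 it is divided by N - 1.

```python
import numpy as np

# Made-up values, used only to show the effect of ddof.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Population formula: divide the sum of squared deviations by N.
population_std = float(np.std(values, ddof=0))  # ddof=0 is NumPy's default

# Sample formula: divide the sum of squared deviations by N - 1.
sample_std = float(np.std(values, ddof=1))

print(population_std)  # 2.0
print(sample_std)      # slightly larger, roughly 2.138
```

The sample version is always a little larger, and the gap shrinks as the number of values grows.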

Population standard deviation over square root of sample size

Sample size    Std dev of sample means    Calculation                       Result
5              1.1886358227738543         2.685858187306438 / sqrt(5)       1.201
20             0.5940321141669805         2.685858187306438 / sqrt(20)      0.601
80             0.2934024263916487         2.685858187306438 / sqrt(80)      0.300
320            0.13095083089190876        2.685858187306438 / sqrt(320)     0.150
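
The Result column can be reproduced with a few lines of Python. This sketch reuses only the numbers shown on these slides; the dictionary names observed and predicted are illustrative.

```python
import math

# Population standard deviation of total_cup_points, as reported earlier.
population_std = 2.685858187306438

# Standard deviations of the approximate sampling distributions, from the table.
observed = {
    5: 1.1886358227738543,
    20: 0.5940321141669805,
    80: 0.2934024263916487,
    320: 0.13095083089190876,
}

# Prediction: population standard deviation divided by sqrt(sample size).
predicted = {n: population_std / math.sqrt(n) for n in observed}

for n in observed:
    print(f"n={n:>3}  observed={observed[n]:.3f}  predicted={predicted[n]:.3f}")
```

The predicted values track the observed ones but do not match them exactly, since each observed value comes from an approximate sampling distribution.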

Standard error

  • Standard deviation of the sampling distribution
  • Important tool in understanding sampling variability

Let's practice!
