Introduction to Statistics in Python
Maggie Matsui
Content Developer, DataCamp
Expected value: mean of a probability distribution
Expected value of a fair die roll = $(1 \times \frac{1}{6}) + (2 \times \frac{1}{6}) +(3 \times \frac{1}{6}) +(4 \times \frac{1}{6}) +(5 \times \frac{1}{6}) +(6 \times \frac{1}{6}) = 3.5$
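The sum above can be checked numerically. This is a minimal sketch using NumPy; the names `outcomes`, `probs`, and `expected_value` are illustrative and do not come from the slides.

```python
import numpy as np

# Faces of a fair six-sided die and the probability of each face
outcomes = np.arange(1, 7)      # 1, 2, 3, 4, 5, 6
probs = np.full(6, 1 / 6)       # every face is equally likely

# Expected value: each outcome weighted by its probability, then summed
expected_value = np.sum(outcomes * probs)
print(expected_value)           # approximately 3.5, up to floating-point rounding
```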
$$P(\text{die roll} \le 2) = ~?$$
$$P(\text{die roll} \le 2) = 1/3$$
Expected value of an uneven die roll = $(1 \times \frac{1}{6}) +(2 \times 0) +(3 \times \frac{1}{3}) +(4 \times \frac{1}{6}) +(5 \times \frac{1}{6}) +(6 \times \frac{1}{6}) \approx 3.67$
$$P(\text{uneven die roll} \le 2) = ~?$$
$$P(\text{uneven die roll} \le 2) = 1/6$$
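The uneven die can be worked through the same way. The sketch below assumes the probabilities given in the expected-value formula (the 2 never comes up and the 3 is twice as likely as the other faces); the variable names are illustrative.

```python
import numpy as np

outcomes = np.arange(1, 7)
# Probabilities of the uneven die: P(2) = 0 and P(3) = 1/3
uneven_probs = np.array([1 / 6, 0, 1 / 3, 1 / 6, 1 / 6, 1 / 6])

# Expected value: probability-weighted sum of the outcomes (about 3.67)
expected_value = np.sum(outcomes * uneven_probs)

# P(roll <= 2): add up the probabilities of the outcomes 1 and 2
p_at_most_2 = uneven_probs[outcomes <= 2].sum()

print(expected_value, p_at_most_2)
```

Only the outcome 1 contributes to the probability of rolling 2 or less, because the 2 has probability zero.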
Describe probabilities for discrete outcomes
Discrete uniform distribution
print(die)
   number      prob
0       1  0.166667
1       2  0.166667
2       3  0.166667
3       4  0.166667
4       5  0.166667
5       6  0.166667
np.mean(die['number'])
3.5
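The slides print `die` without showing how it was built. One possible construction, assuming a pandas DataFrame with the two columns shown in the output, is sketched below; it is not necessarily how the course created it.

```python
import numpy as np
import pandas as pd

# Assumed construction of the `die` DataFrame: one row per face,
# each with probability 1/6
die = pd.DataFrame({
    "number": np.arange(1, 7),
    "prob": np.full(6, 1 / 6),
})

# For a uniform distribution, the plain mean of the faces...
theoretical_mean = np.mean(die["number"])

# ...agrees with the probability-weighted sum (the expected value)
expected_value = (die["number"] * die["prob"]).sum()

print(theoretical_mean, expected_value)
```

The two values agree only because every face is equally likely. For an uneven die, the probability-weighted sum is the one to use.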
rolls_10 = die.sample(10, replace = True)
rolls_10
   number      prob
0       1  0.166667
0       1  0.166667
4       5  0.166667
1       2  0.166667
0       1  0.166667
0       1  0.166667
5       6  0.166667
5       6  0.166667
...
rolls_10['number'].hist(bins=np.linspace(1,7,7))
plt.show()
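The `bins=np.linspace(1,7,7)` argument places bin edges at 1, 2, ..., 7, so each face of the die gets its own bar. The sketch below counts rolls with `np.histogram` using the same edges, without drawing a plot. The rolls are generated with a NumPy random generator and an arbitrary seed, which is an assumption for illustration rather than the slides' `rolls_10`.

```python
import numpy as np

# Illustrative rolls: 10 draws from faces 1..6 (upper bound is exclusive)
rng = np.random.default_rng(seed=42)   # arbitrary seed, for repeatability only
rolls = rng.integers(1, 7, size=10)

# Same bin edges as in the hist() call: 1, 2, 3, 4, 5, 6, 7
bin_edges = np.linspace(1, 7, 7)

# Each bin is [k, k + 1), except the last, which also includes its right edge,
# so the six bins correspond to the six faces
counts, _ = np.histogram(rolls, bins=bin_edges)

for face, count in zip(range(1, 7), counts):
    print(face, count)
```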
np.mean(rolls_10['number'])
3.0
np.mean(die['number'])
3.5
rolls_100 = die.sample(100, replace = True)
np.mean(rolls_100['number'])
3.4
np.mean(die['number'])
3.5
rolls_1000 = die.sample(1000, replace = True)
np.mean(rolls_1000['number'])
3.48
np.mean(die['number'])
3.5
As the size of your sample increases, the sample mean will approach the expected value.
| Sample size | Mean |
|---|---|
| 10 | 3.00 |
| 100 | 3.40 |
| 1000 | 3.48 |
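A comparison like the one in the table can be reproduced with a short loop. This sketch assumes a `die` DataFrame with `number` and `prob` columns and uses a fixed `random_state`, chosen arbitrarily, so that reruns give the same samples. The exact means depend on the random draws and will generally differ from the values in the table.

```python
import numpy as np
import pandas as pd

# Assumed construction of the fair die
die = pd.DataFrame({
    "number": np.arange(1, 7),
    "prob": np.full(6, 1 / 6),
})

sample_means = {}
for n in (10, 100, 1000):
    # Sample n rolls with replacement, as in die.sample(10, replace = True)
    rolls = die.sample(n, replace=True, random_state=0)
    sample_means[n] = np.mean(rolls["number"])
    print(n, sample_means[n])

# Theoretical mean for comparison
print(np.mean(die["number"]))
```

Larger samples tend to land closer to 3.5, although any single small sample can still fall close to it or far from it by chance.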