Introduction to Statistics in Python
Maggie Matsui
Content Developer, DataCamp




Expected value: mean of a probability distribution
Expected value of a fair die roll = $(1 \times \frac{1}{6}) + (2 \times \frac{1}{6}) + (3 \times \frac{1}{6}) + (4 \times \frac{1}{6}) + (5 \times \frac{1}{6}) + (6 \times \frac{1}{6}) = 3.5$

$$P(\text{die roll} \le 2) = ~?$$

$$P(\text{die roll} \le 2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$$
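
The two calculations above can be checked with a few lines of Python. This is a minimal sketch rather than course code: the names `fair_die`, `expected_value`, and `prob_at_most_2` are illustrative, and `fractions.Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value: sum of each outcome multiplied by its probability.
expected_value = sum(face * prob for face, prob in fair_die.items())

# P(die roll <= 2): add the probabilities of the outcomes 1 and 2.
prob_at_most_2 = sum(prob for face, prob in fair_die.items() if face <= 2)

print(float(expected_value))  # 3.5
print(prob_at_most_2)         # 1/3
```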


Expected value of an uneven die roll = $(1 \times \frac{1}{6}) + (2 \times 0) + (3 \times \frac{1}{3}) + (4 \times \frac{1}{6}) + (5 \times \frac{1}{6}) + (6 \times \frac{1}{6}) = \frac{11}{3} \approx 3.67$

$$P(\text{uneven die roll} \le 2) = ~?$$

$$P(\text{uneven die roll} \le 2) = \frac{1}{6} + 0 = \frac{1}{6}$$
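
The same approach works for the uneven die, where the 2 can never come up and the 3 is twice as likely as the other faces. The sketch below is illustrative only; the dictionary `uneven_die` and the other variable names are assumptions, with the probabilities taken from the expected-value formula above.

```python
from fractions import Fraction

# Uneven die: face 2 has probability 0 and face 3 has probability 1/3.
uneven_die = {
    1: Fraction(1, 6),
    2: Fraction(0),
    3: Fraction(1, 3),
    4: Fraction(1, 6),
    5: Fraction(1, 6),
    6: Fraction(1, 6),
}

# A valid probability distribution must sum to 1.
total_probability = sum(uneven_die.values())

# Expected value: sum of each outcome multiplied by its probability.
uneven_expected_value = sum(face * prob for face, prob in uneven_die.items())

# P(uneven die roll <= 2): only face 1 contributes, since P(2) = 0.
uneven_prob_at_most_2 = sum(
    prob for face, prob in uneven_die.items() if face <= 2
)

print(total_probability)                       # 1
print(round(float(uneven_expected_value), 2))  # 3.67
print(uneven_prob_at_most_2)                   # 1/6
```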

Describe probabilities for discrete outcomes

Discrete uniform distribution
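
The slides use a DataFrame named `die` without showing how it is created. One plausible way to build it, assuming the column names `number` and `prob` seen in the printed output, is sketched here; the actual course setup may differ.

```python
import numpy as np
import pandas as pd

# One row per face; the scalar 1/6 is broadcast to every row,
# which is what makes the distribution uniform.
die = pd.DataFrame({"number": np.arange(1, 7), "prob": 1 / 6})

print(die)

# With equal probabilities, the expected value is the plain mean of the faces.
print(np.mean(die["number"]))  # 3.5
```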

print(die)
   number      prob
0       1  0.166667
1       2  0.166667
2       3  0.166667
3       4  0.166667
4       5  0.166667
5       6  0.166667
np.mean(die['number'])
3.5
rolls_10 = die.sample(10, replace=True)
rolls_10
   number      prob
0       1  0.166667
0       1  0.166667
4       5  0.166667
1       2  0.166667
0       1  0.166667
0       1  0.166667
5       6  0.166667
5       6  0.166667
...
rolls_10['number'].hist(bins=np.linspace(1, 7, 7))
plt.show()
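
The histogram shows how often each face appeared in the sample. The same information can be read as numbers by comparing observed relative frequencies with the theoretical probability of 1/6. This is a sketch under assumptions: `die_demo`, `rolls_demo`, and `observed` are illustrative names, and `random_state=42` is an arbitrary seed added so that the sample is reproducible.

```python
import numpy as np
import pandas as pd

# Rebuild the fair die so this example is self-contained.
die_demo = pd.DataFrame({"number": np.arange(1, 7), "prob": 1 / 6})

# Draw 10 rolls with replacement; the seed is arbitrary.
rolls_demo = die_demo.sample(10, replace=True, random_state=42)

# Relative frequency of each face in the sample.
# Faces that were never rolled are filled in with 0.
observed = (
    rolls_demo["number"]
    .value_counts(normalize=True)
    .reindex(range(1, 7), fill_value=0)
)

print(observed)
```

With only 10 rolls, the observed frequencies can differ noticeably from 1/6, which is why the sample mean can sit some distance from 3.5.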


np.mean(rolls_10['number']) = 3.0

np.mean(die['number']) = 3.5

np.mean(rolls_100['number']) = 3.4

np.mean(die['number']) = 3.5

np.mean(rolls_1000['number']) = 3.48

np.mean(die['number']) = 3.5
As the sample size increases, the sample mean tends to get closer to the expected value. This is the law of large numbers.

| Sample size | Sample mean |
|---|---|
| 10 | 3.00 |
| 100 | 3.40 |
| 1000 | 3.48 |
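
The pattern in the table can be reproduced with a short simulation. This is a minimal sketch, not the code behind the numbers above: `die_sim` and `sample_means` are illustrative names, the seeds are arbitrary, and the extra sample size of 100,000 is added only to make the convergence more visible. The exact means depend on the seed, so none are quoted here.

```python
import numpy as np
import pandas as pd

# Fair die, built the same way as the `die` DataFrame in the slides.
die_sim = pd.DataFrame({"number": np.arange(1, 7), "prob": 1 / 6})
expected = np.mean(die_sim["number"])  # 3.5

sample_means = {}
for n in (10, 100, 1000, 100_000):
    # Roll the die n times; the seed (here simply n) is arbitrary.
    rolls = die_sim.sample(n, replace=True, random_state=n)
    sample_means[n] = np.mean(rolls["number"])
    gap = abs(sample_means[n] - expected)
    print(f"n={n:>6}  sample mean={sample_means[n]:.3f}  gap={gap:.3f}")
```

The gap does not have to shrink at every step, since a small sample can land close to 3.5 by chance, but large samples rarely stray far from it.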