Random number generators and hacker statistics

Statistical Thinking in Python (Part 1)

Justin Bois

Teaching Professor at the California Institute of Technology

Hacker statistics

  • Uses simulated repeated measurements to compute probabilities.


Simulating coin flips


Bernoulli trials

  • An experiment that has exactly two possible outcomes: "success" (True) and "failure" (False).
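
As a minimal sketch of the idea above, a Bernoulli trial can be simulated by drawing a uniform random number and calling it a success when the draw falls below the success probability. The helper name `perform_bernoulli_trials` and its parameters `n` and `p` are illustrative assumptions, not something defined on these slides.

```python
import numpy as np

def perform_bernoulli_trials(rng, n, p):
    """Run n Bernoulli trials with success probability p; return the number of successes."""
    # A trial counts as a success when a uniform draw on [0, 1) is below p
    successes = rng.random(size=n) < p
    return int(np.sum(successes))

rng = np.random.default_rng()

# Example: 100 fair coin flips, counting heads as successes
n_successes = perform_bernoulli_trials(rng, n=100, p=0.5)
print(n_successes)
```

The count printed changes from run to run because the generator is not seeded here.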

The np.random module

import numpy as np
rng = np.random.default_rng()

rng
Generator(PCG64) at 0x7F9433D38120

Random number seed

  • Integer fed into the random number generating algorithm
  • Manually seed the random number generator only if you need reproducibility
  • Specified using rng = np.random.default_rng(seed)
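
To illustrate what reproducibility means here, the sketch below creates two generators with the same seed and checks that they produce identical draws. The variable names (`rng_a`, `rng_b`, `rng_unseeded`) are illustrative only.

```python
import numpy as np

# Two generators built from the same seed produce the same sequence
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draws_a = rng_a.random(size=4)
draws_b = rng_b.random(size=4)
print(np.array_equal(draws_a, draws_b))  # True

# Without a seed, NumPy pulls fresh entropy from the operating system,
# so these draws will generally differ from run to run
rng_unseeded = np.random.default_rng()
print(rng_unseeded.random(size=4))
```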

Simulating 4 coin flips

rng = np.random.default_rng(42)

random_numbers = rng.random(size=4)
random_numbers
array([0.77395605, 0.43887844, 0.85859792, 0.69736803])
heads = random_numbers < 0.5
heads
array([False,  True, False, False])
np.sum(heads)
1

Simulating 4 coin flips

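n_all_heads = 0  # Initialize number of 4-heads trials
for _ in range(10000):
    # Flip 4 coins with the Generator; a draw below 0.5 counts as heads
    heads = rng.random(size=4) < 0.5
    n_heads = np.sum(heads)
    if n_heads == 4:
        n_all_heads += 1

# Fraction of trials with 4 heads; the exact value varies from run to run
# (the theoretical probability is 0.5**4 = 0.0625)
n_all_heads / 10000
0.0607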
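
The same simulation can be sketched without an explicit Python loop by drawing all the flips at once as a 2D array, one row per trial. This is an alternative formulation, not the approach shown on the slide; `n_trials`, `all_heads`, and `prob_estimate` are illustrative names.

```python
import numpy as np

rng = np.random.default_rng()
n_trials = 10000

# Each row holds one trial of 4 coin flips; True means heads
heads = rng.random(size=(n_trials, 4)) < 0.5

# A trial is "all heads" when every flip in its row is heads
all_heads = heads.all(axis=1)

# The mean of a Boolean array is the fraction of True values
prob_estimate = all_heads.mean()
print(prob_estimate)
```

The estimate should land near the theoretical value of 0.0625, with some run-to-run scatter.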

Hacker stats probabilities

  • Determine how to simulate the data
  • Simulate many, many times
  • The probability is approximately the fraction of trials with the outcome of interest
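
The word "approximately" in the last step can be made concrete: the more trials you simulate, the closer the fraction tends to get to the true probability. The sketch below applies the three steps to the 4-heads example for increasing numbers of trials. The helper `estimate_probability` is a hypothetical name introduced for illustration.

```python
import numpy as np

def estimate_probability(rng, n_trials):
    """Estimate P(4 heads in 4 fair flips) from n_trials simulated trials."""
    # Step 1: decide how to simulate the data (4 uniform draws per trial)
    flips = rng.random(size=(n_trials, 4)) < 0.5
    # Steps 2 and 3: over many trials, take the fraction with the outcome of interest
    return np.mean(np.sum(flips, axis=1) == 4)

rng = np.random.default_rng()
exact = 0.5 ** 4  # 0.0625 for four independent fair flips

for n_trials in (100, 10_000, 1_000_000):
    estimate = estimate_probability(rng, n_trials)
    print(n_trials, estimate, abs(estimate - exact))
```

The printed errors are random, so they need not shrink at every step, but they typically get smaller as the number of trials grows.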

Let's practice!
