More probability distributions

Introduction to Statistics in Python

Maggie Matsui

Content Developer, DataCamp

Exponential distribution

  • Probability of time between Poisson events

  • Examples

    • Probability of > 1 day between adoptions
    • Probability of < 10 minutes between restaurant arrivals
    • Probability of 6-8 months between earthquakes
  • Also uses lambda (rate)

  • Continuous (time)

Introduction to Statistics in Python

Customer service requests

  • On average, one customer service ticket is created every 2 minutes
    • $\lambda$ = 0.5 customer service tickets created each minute

Exponential distribution with lambda = 0.5

Introduction to Statistics in Python

Lambda in exponential distribution

3 Exponential distributions with lambda = 0.5, lambda = 1, and lambda = 1.5

Introduction to Statistics in Python

Expected value of exponential distribution

In terms of rate (Poisson):

  • $\lambda$ = $0.5 \text{ requests}$ per minute

 

In terms of time between events (exponential):

  • $1/\lambda$ = 1 request per $2 \text{ minutes}$
  • $1/0.5$ = $2$
Introduction to Statistics in Python

How long until a new request is created?

 

from scipy.stats import expon
  • scale = $1/\lambda$ = $1/0.5$ = $2$

$P(\text{wait} < \text{1 min})$ =

expon.cdf(1, scale=2)
0.3934693402873666

$P(\text{wait} > \text{4 min})$ =

1- expon.cdf(4, scale=2)
0.1353352832366127

$P(\text{1 min} < \text{wait} < \text{4 min})$ =

expon.cdf(4, scale=2) - expon.cdf(1, scale=2)
0.4711953764760207
Introduction to Statistics in Python

(Student's) t-distribution

  • Similar shape as the normal distribution

t-distribution and normal distribution plotted on same axes

Introduction to Statistics in Python

Degrees of freedom

  • Has parameter degrees of freedom (df) which affects the thickness of the tails
    • Lower df = thicker tails, higher standard deviation
    • Higher df = closer to normal distribution

3 t distributions with df = 1, df = 5, and df = 10

Introduction to Statistics in Python

Log-normal distribution

  • Variable whose logarithm is normally distributed

  • Examples:

    • Length of chess games
    • Adult blood pressure
    • Number of hospitalizations in the 2003 SARS outbreak

3 log-normal distributions with sd = 0.5, sd = 1, and sd = 1.5

Introduction to Statistics in Python

Let's practice!

Introduction to Statistics in Python

Preparing Video For Download...