Prior belief

Bayesian Data Analysis in Python

Michal Oleszak

Machine Learning Engineer

Prior distribution

  • The prior distribution reflects what we know about the parameter before observing any data:
    • no knowledge   →   uniform distribution (all values equally likely)
    • a posterior from an earlier analysis   →   used as the prior and updated with new data

 

  • Any probability distribution can serve as a prior, which lets us include external information in the model:
    • expert opinion
    • common knowledge
    • previous research
    • subjective belief
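
As a minimal sketch of how such beliefs can be written down for a probability parameter, the snippet below draws from two Beta priors. The Beta(5, 15) parameters and the variable names are illustrative assumptions, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(42)

# No prior knowledge: Beta(1, 1) is the uniform distribution on [0, 1].
uniform_prior = rng.beta(1, 1, size=10_000)

# Hypothetical informative prior: a belief that the parameter is near 0.25.
# Beta(5, 15) has mean 5 / (5 + 15) = 0.25; these values are assumed for illustration.
informative_prior = rng.beta(5, 15, size=10_000)

print("uniform prior mean:", uniform_prior.mean())
print("informative prior mean:", informative_prior.mean())
```

Both arrays can be passed to sns.kdeplot() to compare the shapes of the two priors.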

Prior's impact

Two density plots overlaid onto each other, one for the prior distribution, another for the posterior. The prior is uniform, the posterior is bell-shaped, slightly taller.

Two density plots overlaid onto each other, one for the prior distribution, another for the posterior. The prior is uniform, the posterior is bell-shaped, slightly taller and skewed to the right.

Two density plots overlaid onto each other, one for the prior distribution, another for the posterior. The prior is uniform, the posterior is bell-shaped, much taller.

Two density plots overlaid onto each other, one for the prior distribution, another for the posterior. The prior is uniform, the posterior is bell-shaped, very much taller.


Prior distribution

  • The prior distribution is chosen before we see the data.
  • Prior choice can impact posterior results (especially with little data).
  • To avoid cherry-picking, prior choices should be:
    • clearly stated,
    • explainable: based on previous research, sensible assumptions, expert opinion, etc.
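
To illustrate why the prior matters most with little data, here is a small sketch for coin tossing. It relies on the fact that a Beta(a, b) prior gives a Beta(#heads + a, #tosses - #heads + b) posterior. The helper name posterior_mean, the two priors, and the toss counts are assumptions made up for this example.

```python
def posterior_mean(num_heads, num_tosses, a, b):
    """Mean of the Beta(num_heads + a, num_tosses - num_heads + b) posterior."""
    return (num_heads + a) / (num_tosses + a + b)

# Two illustrative priors (assumed values): uniform Beta(1, 1) and
# "heads less likely" Beta(5, 15), whose prior mean is 0.25.
priors = {"uniform": (1, 1), "heads less likely": (5, 15)}

results = {}
for num_tosses in (10, 1000):
    num_heads = num_tosses // 2  # made-up data: half of the tosses are heads
    for name, (a, b) in priors.items():
        results[(num_tosses, name)] = posterior_mean(num_heads, num_tosses, a, b)
        print(num_tosses, name, round(results[(num_tosses, name)], 3))
```

With 10 tosses the two posterior means differ clearly (0.5 versus about 0.333); with 1000 tosses they nearly coincide (0.5 versus about 0.495).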

Choosing the right prior

Our prior belief: heads less likely

A bell-shaped, but skewed to the left density plot, peaking at around 0.25.

Some choices are better than others!

 

A bell-shaped, but skewed to the left density plot, peaking at around 0.25, slightly narrower than the previous plot.


Conjugate priors

  • Some priors, multiplied with specific likelihoods, yield known posteriors.
  • They are known as conjugate priors.
  • In the case of coin tossing:
    • if we choose a prior Beta(a, b),
    • then the posterior is Beta(#heads + a, #tosses - #heads + b)
  • We can sample from the posterior using numpy.
  • get_heads_prob() from Chapter 1:
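    import numpy as np

    def get_heads_prob(tosses):
      # tosses: array of 0s (tails) and 1s (heads), so the sum counts heads
      num_heads = np.sum(tosses)
      # prior: Beta(1, 1), so the posterior is Beta(#heads + 1, #tails + 1)
      return np.random.beta(num_heads + 1, len(tosses) - num_heads + 1, 1000)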
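
The function above hard-codes the Beta(1, 1) prior. A possible generalization, sketched here under the conjugacy rule stated on this slide, takes the prior parameters as arguments. The name get_heads_prob_with_prior, its parameters, and the toss data are hypothetical and not part of the course code.

```python
import numpy as np

def get_heads_prob_with_prior(tosses, a=1, b=1, num_draws=1000):
    """Draw from the Beta(#heads + a, #tails + b) posterior (hypothetical helper)."""
    tosses = np.asarray(tosses)
    num_heads = tosses.sum()
    num_tails = len(tosses) - num_heads
    return np.random.beta(num_heads + a, num_tails + b, num_draws)

# Made-up data: 1 = heads, 0 = tails (2 heads in 8 tosses).
tosses = [1, 0, 0, 1, 0, 0, 0, 0]

# Assumed prior Beta(5, 15), so the posterior is Beta(7, 21).
draws = get_heads_prob_with_prior(tosses, a=5, b=15)
print(draws.mean())
```

Calling it with the defaults a=1 and b=1 samples from the same posterior as get_heads_prob().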
    

Two ways to get the posterior

Simulation

  • If the posterior is a known distribution, we can sample from it using numpy:
    draws = np.random.beta(2, 4, 1000)
    
  • Outcome: an array of 1000 posterior draws:
    array([0.05941031, ..., 0.70015975])
    
  • Can be plotted with
    sns.kdeplot(draws)
    

Calculation

  • If the posterior is not a known distribution, we can calculate it using grid approximation.
  • Outcome: posterior probability for each grid element:
           head_prob  posterior_prob
    0           0.00        0.009901
    1           0.01        0.003624
               ...           ...
    10199       0.99        0.003624
    10200       1.00        0.009901
    
  • Can be plotted with
    sns.lineplot(x=df["head_prob"], y=df["posterior_prob"])
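
For reference, grid approximation can be sketched in a few lines of numpy. This is a simplified one-dimensional version with a uniform prior and made-up data (2 heads in 10 tosses); it is not necessarily how the table above was produced.

```python
import numpy as np

# Made-up data: 2 heads in 10 tosses.
num_heads, num_tosses = 2, 10

# Grid of candidate values for the probability of heads.
head_prob = np.linspace(0, 1, 101)

# Uniform prior: every grid value is equally likely.
prior = np.ones_like(head_prob) / len(head_prob)

# Binomial likelihood up to a constant factor; the binomial coefficient
# is the same for every grid value and cancels when normalizing.
likelihood = head_prob**num_heads * (1 - head_prob) ** (num_tosses - num_heads)

# Bayes' rule on the grid: prior times likelihood, scaled to sum to 1.
posterior_prob = prior * likelihood
posterior_prob /= posterior_prob.sum()

# Grid value with the highest posterior probability.
best = head_prob[np.argmax(posterior_prob)]
print(best)
```

The head_prob and posterior_prob arrays can be placed in a DataFrame with those column names and plotted as a line.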
    

Let's practice working with priors!
