How much is an avocado?

Bayesian Data Analysis in Python

Michal Oleszak

Machine Learning Engineer

The Avocado, Inc.

Avocado fruits.

Case study: estimating price elasticity

Goal: estimate price elasticity of avocados and optimize the price

(price elasticity = impact of the change in price on the sales volume)

  1. Fit a Bayesian regression model.
  2. Inspect the model to verify its correctness.
  3. Predict sales volume for different prices.
  4. Propose the profit-maximizing price and the associated uncertainty.
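
Steps 3 and 4 can be sketched with NumPy alone. This is a minimal illustration, not the course's solution: the posterior draws are simulated rather than taken from a fitted model, and the linear form `volume = intercept + slope * price`, the `unit_cost` value, and the price grid are all assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical posterior draws (simulated here; in the real workflow
# they would come from the fitted model's trace).
n_draws = 2000
intercept_draws = rng.normal(loc=400.0, scale=10.0, size=n_draws)
slope_draws = rng.normal(loc=-100.0, scale=8.0, size=n_draws)

unit_cost = 0.50                      # assumed cost per unit, illustration only
prices = np.linspace(0.6, 3.0, 49)    # candidate prices on a 0.05 grid

# Predicted volume and profit for every (draw, price) pair.
volume = intercept_draws[:, None] + slope_draws[:, None] * prices[None, :]
volume = np.clip(volume, 0.0, None)   # sales volume cannot be negative
profit = (prices[None, :] - unit_cost) * volume

# Price that maximizes the expected (posterior mean) profit.
best_price = prices[np.argmax(profit.mean(axis=0))]

# Optimal price within each draw: its spread expresses the uncertainty.
best_price_per_draw = prices[np.argmax(profit, axis=1)]
low, high = np.percentile(best_price_per_draw, [2.5, 97.5])

print(f"Profit-maximizing price: {best_price:.2f}")
print(f"95% interval for the optimal price: [{low:.2f}, {high:.2f}]")
```

Working with the full set of draws, rather than a single point estimate of the coefficients, is what lets the proposed price come with an uncertainty interval.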

Avocado data

print(avocado)
           date  price      volume  type_organic
0    2015-01-04   0.95  313.242777             0
1    2015-01-11   1.01  290.635427             0
2    2015-01-18   1.03  290.434588             0
3    2015-01-25   1.04  284.703108             0
..          ...    ...         ...           ...
334  2018-03-04   1.52   16.344308             1
335  2018-03-11   1.52   16.642349             1
336  2018-03-18   1.54   16.758042             1
337  2018-03-25   1.55   15.599672             1

Data source: https://www.kaggle.com/neuromusic/avocado-prices

Priors in pymc3

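import pymc3 as pm

# Assumes `bikes` is a pandas DataFrame with the columns named in the formula.
formula = "num_bikes ~ temp + work_day + wind_speed"

with pm.Model() as model:
    # Build a linear regression from the formula, using the default priors
    pm.GLM.from_formula(formula, data=bikes)
    # Draw 1000 posterior samples per chain after 500 tuning steps
    trace = pm.sample(draws=1000, tune=500)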

Priors in pymc3

import pymc3 as pm

formula = "num_bikes ~ temp + work_day + wind_speed"

with pm.Model() as model:
    # Override the default prior for the wind_speed coefficient only
    priors = {"wind_speed": pm.Normal.dist(mu=-5)}
    pm.GLM.from_formula(formula, data=bikes, priors=priors)
    trace = pm.sample(draws=1000, tune=500)
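
To see what a prior such as `pm.Normal.dist(mu=-5)` implies before any data are seen, it can help to look at the range it covers. The sketch below uses only the standard library and assumes the prior's standard deviation is 1, which is what a normal prior falls back to in pymc3 when only `mu` is given; treat that value as an assumption and adjust `sigma` if the prior is specified differently.

```python
from statistics import NormalDist

# Assumption: standard deviation of 1 when only mu is specified.
prior = NormalDist(mu=-5.0, sigma=1.0)

# Central 95% interval of the prior.
lower = prior.inv_cdf(0.025)
upper = prior.inv_cdf(0.975)

# Prior probability that the wind_speed effect is positive.
prob_positive = 1.0 - prior.cdf(0.0)

print(f"Central 95% prior interval: [{lower:.2f}, {upper:.2f}]")
print(f"Prior probability of a positive effect: {prob_positive:.1e}")
```

Under this assumption the prior places nearly all of its mass between roughly -7 and -3, so it encodes a fairly strong belief that more wind means fewer bikes.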

Extracting draws from trace

temp_draws = trace.get_values("temp")

print(temp_draws)
array([6.8705346, 6.7421152, 6.7393061, ..., 5.966574 , 6.1274128, 6.7149277])
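
Once the draws are available as a NumPy array, posterior summaries are plain array operations. A minimal sketch follows; `temp_draws` here is simulated as a stand-in for `trace.get_values("temp")`, so the numbers it prints are not the model's results, and the threshold of 6.0 is an arbitrary value chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Stand-in for trace.get_values("temp"): simulated draws, not real output.
temp_draws = rng.normal(loc=6.5, scale=0.4, size=4000)

# Point estimate and spread of the posterior.
posterior_mean = temp_draws.mean()
posterior_sd = temp_draws.std(ddof=1)

# Equal-tailed 95% credible interval.
ci_low, ci_high = np.percentile(temp_draws, [2.5, 97.5])

# Posterior probability that the coefficient exceeds a threshold.
prob_above_6 = (temp_draws > 6.0).mean()

print(f"mean = {posterior_mean:.2f}, sd = {posterior_sd:.2f}")
print(f"95% credible interval: [{ci_low:.2f}, {ci_high:.2f}]")
print(f"P(temp effect > 6) = {prob_above_6:.2f}")
```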

What you will need

Model fitting:

  • pm.Model()
  • pm.GLM.from_formula()
  • pm.sample()
  • pm.Normal()

Visualization:

  • pm.forestplot()
  • pm.traceplot()

Making predictions:

  • pm.fast_sample_posterior_predictive()

Inference:

  • az.hdi()
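
To make clear what a highest density interval is, here is a simplified NumPy sketch of the idea: the shortest interval that contains a given share of the draws. The helper `hdi` below is hypothetical and written for illustration; ArviZ's own `az.hdi()` may differ in details, and this version assumes a unimodal posterior. The draws are simulated from a skewed distribution to show how the HDI differs from an equal-tailed interval.

```python
import numpy as np


def hdi(draws, prob=0.94):
    """Shortest interval containing `prob` of the draws (unimodal case)."""
    sorted_draws = np.sort(np.asarray(draws, dtype=float))
    n = sorted_draws.size
    n_included = int(np.ceil(prob * n))
    if n_included < 2 or n_included > n:
        raise ValueError("prob and sample size give no valid interval")
    # Width of every window that spans n_included consecutive sorted draws.
    widths = sorted_draws[n_included - 1:] - sorted_draws[: n - n_included + 1]
    start = int(np.argmin(widths))
    return sorted_draws[start], sorted_draws[start + n_included - 1]


rng = np.random.default_rng(seed=2)
skewed_draws = rng.exponential(scale=1.0, size=5000)  # simulated, right-skewed

hdi_low, hdi_high = hdi(skewed_draws, prob=0.94)
et_low, et_high = np.percentile(skewed_draws, [3, 97])

print(f"94% HDI:          [{hdi_low:.2f}, {hdi_high:.2f}]")
print(f"94% equal-tailed: [{et_low:.2f}, {et_high:.2f}]")
```

For a skewed posterior the HDI is narrower than the equal-tailed interval because it shifts toward the region where the draws are densest.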

Let's put what you've learned into practice!
