Bayesian Linear Regression

Bayesian Regression Modeling with rstanarm

Jake Thompson

Psychometrician, ATLAS, University of Kansas

Why use Bayesian methods?

P-values describe the probability of the data under a null hypothesis, not the probability of parameter values
Posterior distribution: combination of likelihood and prior
- Sample the posterior distribution
- Summarize the sample
- Use the summary to make inferences about parameter values
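
The three steps above can be sketched in base R with a toy example. This is an illustration only: the data (7 successes in 10 trials) and the flat Beta(1, 1) prior are made up, and the posterior is sampled directly from its known conjugate form rather than by MCMC as Stan does.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical data and prior (not from the course)
successes <- 7
trials <- 10

# 1. Sample the posterior: Beta(1 + successes, 1 + failures)
draws <- rbeta(4000, 1 + successes, 1 + trials - successes)

# 2. Summarize the sample
post_mean <- mean(draws)
post_interval <- quantile(draws, c(0.025, 0.975))

# 3. Use the summary to make inferences about the parameter,
#    e.g. the posterior probability that the success rate exceeds 0.5
prob_above_half <- mean(draws > 0.5)

post_mean
post_interval
prob_above_half
```

Unlike a p-value, prob_above_half is a direct probability statement about the parameter, given the data and the prior.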

The rstanarm package

Interface to the Stan probabilistic programming language
rstanarm provides high level access to Stan
Stan itself allows for custom model definitions; rstanarm wraps pre-compiled regression models in familiar R formula syntax

library(rstanarm)

stan_model <- stan_glm(kid_score ~ mom_iq, data = kidiq)

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
 Gradient evaluation took 0.000408 seconds
 1000 transitions using 10 leapfrog steps per transition would take
 4.08 seconds.
 Adjust your expectations accordingly!

 Iteration:    1 / 2000 [  0%]  (Warmup)
 Iteration:  200 / 2000 [ 10%]  (Warmup)
 Iteration:  400 / 2000 [ 20%]  (Warmup)
 Iteration:  600 / 2000 [ 30%]  (Warmup)
 Iteration:  800 / 2000 [ 40%]  (Warmup)
 Iteration: 1000 / 2000 [ 50%]  (Warmup)
 Iteration: 1001 / 2000 [ 50%]  (Sampling)
 Iteration: 1200 / 2000 [ 60%]  (Sampling)
 Iteration: 1400 / 2000 [ 70%]  (Sampling)
 Iteration: 1600 / 2000 [ 80%]  (Sampling)
 Iteration: 1800 / 2000 [ 90%]  (Sampling)
 Iteration: 2000 / 2000 [100%]  (Sampling)

  Elapsed Time: 0.37735 seconds (Warm-up)
                0.252244 seconds (Sampling)
                0.629594 seconds (Total)
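
The log shows a single chain of 2000 iterations, the first 1000 of which are warmup. With rstanarm's default of 4 chains, this accounts for the posterior sample size of 4000 that the summary reports. A minimal sketch of the arithmetic, followed by the same settings written out explicitly (the seed value is an arbitrary choice, and the model is only fit if rstanarm is installed):

```r
chains <- 4     # rstanarm's default number of chains
iter <- 2000    # total iterations per chain, as in the log
warmup <- 1000  # warmup iterations are discarded

posterior_sample_size <- chains * (iter - warmup)
posterior_sample_size

# Same model with the sampler settings spelled out.
if (requireNamespace("rstanarm", quietly = TRUE)) {
  library(rstanarm)
  stan_model <- stan_glm(kid_score ~ mom_iq, data = kidiq,
                         chains = chains, iter = iter, warmup = warmup,
                         seed = 123,    # arbitrary, for reproducibility
                         refresh = 0)   # suppress the progress log
}
```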

summary(stan_model)

 Model Info:
  function:     stan_glm
  family:       gaussian [identity]
  formula:      kid_score ~ mom_iq
  algorithm:    sampling
  priors:       see help('prior_summary')
  sample:       4000 (posterior sample size)
  observations: 434
  predictors:   2

 Estimates:
                 mean    sd      2.5%    25%     50%     75%     97.5%
 (Intercept)      25.7     6.0    13.8    21.6    25.7    30.0    37.0
 mom_iq            0.6     0.1     0.5     0.6     0.6     0.7     0.7
 sigma            18.3     0.6    17.1    17.9    18.3    18.7    19.5
 mean_PPD         86.8     1.2    84.3    85.9    86.8    87.6    89.2
 log-posterior -1885.4     1.2 -1888.5 -1886.0 -1885.1 -1884.5 -1884.0

 Diagnostics:
               mcse Rhat n_eff
 (Intercept)   0.1  1.0  4000 
 mom_iq        0.0  1.0  4000 
 sigma         0.0  1.0  3827 
 mean_PPD      0.0  1.0  4000 
 log-posterior 0.0  1.0  1896 

 For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor
 on split chains (at convergence Rhat=1).

rstanarm summary: Estimates

 Estimates:
                 mean    sd      2.5%    25%     50%     75%     97.5%
 (Intercept)      25.7     6.0    13.8    21.6    25.7    30.0    37.0
 mom_iq            0.6     0.1     0.5     0.6     0.6     0.7     0.7
 sigma            18.3     0.6    17.1    17.9    18.3    18.7    19.5
 mean_PPD         86.8     1.2    84.3    85.9    86.8    87.6    89.2
 log-posterior -1885.4     1.2 -1888.5 -1886.0 -1885.1 -1884.5 -1884.0

sigma: standard deviation of the errors (residuals)
mean_PPD: mean of the posterior predictive distribution of the outcome
log-posterior: log of the posterior density, analogous to a log-likelihood
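
Each row of the Estimates table is a summary of posterior draws. As a rough sketch, the table can be approximated by hand: summarize_draws is a hypothetical helper written for this example (not part of rstanarm), and the rstanarm part only runs if the package is installed. Results will differ slightly from the table because sampling is random.

```r
# Hypothetical helper: summarize one vector of posterior draws
summarize_draws <- function(x) {
  c(mean = mean(x), sd = sd(x),
    quantile(x, c(0.025, 0.25, 0.5, 0.75, 0.975)))
}

if (requireNamespace("rstanarm", quietly = TRUE)) {
  library(rstanarm)
  stan_model <- stan_glm(kid_score ~ mom_iq, data = kidiq, refresh = 0)

  # One row per posterior draw; columns (Intercept), mom_iq, sigma
  draws <- as.data.frame(stan_model)
  print(round(sapply(draws, summarize_draws), 1))

  # mean_PPD: average predicted kid_score within each posterior draw
  ppd <- posterior_predict(stan_model)  # draws x observations matrix
  print(round(summarize_draws(rowMeans(ppd)), 1))
}
```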

rstanarm summary: Diagnostics

Diagnostics:
             mcse Rhat n_eff
(Intercept)   0.1  1.0  4000 
mom_iq        0.0  1.0  4000 
sigma         0.0  1.0  3827 
mean_PPD      0.0  1.0  4000 
log-posterior 0.0  1.0  1896 

For each parameter, mcse is Monte Carlo standard error,
n_eff is a crude measure of effective sample size, and
Rhat is the potential scale reduction factor on split chains
 (at convergence Rhat=1).
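
The mcse column can be checked against the other columns, since the Monte Carlo standard error of a posterior mean is roughly the posterior sd divided by the square root of n_eff. A small sketch using the rounded values printed in the summary (so the result is approximate):

```r
# Posterior sd and n_eff copied from the summary output
post_sd <- c("(Intercept)" = 6.0, mom_iq = 0.1, sigma = 0.6,
             mean_PPD = 1.2, "log-posterior" = 1.2)
n_eff <- c(4000, 4000, 3827, 4000, 1896)

# Monte Carlo standard error of each posterior mean
mcse <- post_sd / sqrt(n_eff)
round(mcse, 1)
```

Rounded to one decimal place, only the intercept has a nonzero mcse, which matches the Diagnostics table.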

Rhat: compares the variance between chains to the variance within chains
Values close to 1 (conventionally less than 1.1) suggest the chains have converged
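
To make the idea behind Rhat concrete, here is a simplified base R version of the classic potential scale reduction factor. It is a sketch only: rstanarm reports a split-chain version, and the function name rhat and the simulated chains are assumptions made for this example.

```r
# draws: matrix with one column per chain, one row per iteration
rhat <- function(draws) {
  n <- nrow(draws)
  W <- mean(apply(draws, 2, var))   # within-chain variance
  B <- n * var(colMeans(draws))     # between-chain variance
  var_hat <- (n - 1) / n * W + B / n
  sqrt(var_hat / W)
}

set.seed(1)  # arbitrary seed
# Four simulated chains sampling the same distribution
mixed <- matrix(rnorm(4000), ncol = 4)
# Same chains, but the fourth is stuck around a different value
stuck <- mixed + rep(c(0, 0, 0, 3), each = 1000)

rhat(mixed)  # close to 1
rhat(stuck)  # well above 1.1
```

When all chains explore the same region, the between-chain term adds almost nothing and Rhat stays near 1; a chain stuck elsewhere inflates it.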

Let's practice!
