Welcome!

Bayesian Regression Modeling with rstanarm

Jake Thompson

Psychometrician, ATLAS, University of Kansas

Overview

  1. Introduction to Bayesian regression
  2. Customizing Bayesian regression models
  3. Evaluating Bayesian regression models
  4. Presenting and using Bayesian regression models

A review of frequentist regression

  • Frequentist regression using ordinary least squares
  • The kidiq data
kidiq
# A tibble: 434 x 4
    kid_score mom_hs mom_iq mom_age
        <int>  <int>  <dbl>   <int>
  1        65      1  121.       27
  2        98      1   89.4      25
  3        85      1  115.       27
  4        83      1   99.4      25
  5       115      1   92.7      27
 # ... with 430 more rows
  • Predict child's IQ score from the mother's IQ score
lm_model <- lm(kid_score ~ mom_iq, data = kidiq)

summary(lm_model)
 Call:
 lm(formula = kid_score ~ mom_iq, data = kidiq)
 Residuals:
     Min      1Q  Median      3Q     Max 
 -56.753 -12.074   2.217  11.710  47.691 
 Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
 (Intercept) 25.79978    5.91741    4.36 1.63e-05 ***
 mom_iq       0.60997    0.05852   10.42  < 2e-16 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 Residual standard error: 18.27 on 432 degrees of freedom
 Multiple R-squared:  0.201,  Adjusted R-squared:  0.1991 
 F-statistic: 108.6 on 1 and 432 DF,  p-value: < 2.2e-16
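
The fitted coefficients define a line, so a predicted score is just the intercept plus the slope times the mother's IQ. The sketch below works through that arithmetic using the estimates printed above; the value mom_iq = 100 is a hypothetical input chosen only for illustration.

```r
# Coefficients copied from the summary(lm_model) output above
intercept <- 25.79978
slope     <- 0.60997

# Hypothetical mother's IQ, chosen only for illustration
mom_iq_example <- 100

# Predicted child score: intercept + slope * mom_iq
predicted_kid_score <- intercept + slope * mom_iq_example
predicted_kid_score
```

With the fitted model object available, predict(lm_model, newdata = data.frame(mom_iq = 100)) should give the same value up to rounding of the coefficients.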

Examining model coefficients

  • Use the broom package to focus just on the coefficients
library(broom)

tidy(lm_model)
            term   estimate  std.error statistic      p.value
   1 (Intercept) 25.7997778 5.91741208  4.359977 1.627847e-05
   2      mom_iq  0.6099746 0.05852092 10.423188 7.661950e-23
  • Be cautious about what the p-value actually represents
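
A p-value is the probability of seeing data at least this extreme if the null hypothesis were true; it is not the probability that the null hypothesis is true. One way to see this is a small simulation, sketched below with made-up data (not kidiq): when x and y are generated independently, roughly 5% of the slopes still come out "significant" at the 0.05 level. The seed, sample size, and number of simulations are arbitrary assumptions.

```r
# Simulate regressions where the true slope is zero, so the null hypothesis holds
set.seed(123)    # arbitrary seed, for reproducibility only
n_sims <- 1000   # number of simulated data sets (illustrative choice)
n_obs  <- 100    # observations per data set (illustrative choice)

p_values <- replicate(n_sims, {
  x <- rnorm(n_obs)
  y <- rnorm(n_obs)  # y is generated independently of x
  fit <- lm(y ~ x)
  summary(fit)$coefficients["x", "Pr(>|t|)"]  # p-value for the slope
})

# Share of "significant" slopes even though no relationship exists
false_positive_rate <- mean(p_values < 0.05)
false_positive_rate
```

The share should land close to 0.05, which is what the significance threshold controls: the rate of false alarms under the null, not the chance that any particular finding is real.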

Comparing Frequentist and Bayesian probabilities

  • What's the probability a woman has cancer, given a positive mammogram?
    • P(+M | C) = 0.9
    • P(C) = 0.004
    • P(+M) = (0.9 x 0.004) + (0.1 x 0.996) ≈ 0.1
  • What is P(C | +M)?
    • P(+M | C) x P(C) / P(+M) = (0.9 x 0.004) / 0.1 = 0.036
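
The mammogram calculation is an application of Bayes' theorem, and it can be reproduced in a few lines of R. This is a minimal sketch using the numbers from the slide; the variable names are illustrative, and the 0.1 false-positive rate is read off the second term of the P(+M) sum.

```r
# Quantities from the slide
p_pos_given_c     <- 0.9    # P(+M | C): positive mammogram given cancer
p_c               <- 0.004  # P(C): prior probability of cancer
p_pos_given_not_c <- 0.1    # P(+M | not C): false-positive rate

# Law of total probability: overall chance of a positive mammogram
p_pos <- p_pos_given_c * p_c + p_pos_given_not_c * (1 - p_c)

# Bayes' theorem: probability of cancer given a positive mammogram
p_c_given_pos <- p_pos_given_c * p_c / p_pos

p_pos
p_c_given_pos
```

Without rounding, P(+M) is 0.1032 and the posterior is about 0.035; the slide's 0.036 comes from rounding P(+M) to 0.1 first. Either way, a positive result leaves the probability of cancer below 4%.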

Spotify data

songs
 # A tibble: 215 x 7
    track_name    artist_name song_age valence tempo popularity duration_ms
    <chr>         <chr>          <int>   <dbl> <dbl>      <int>       <int>
  1 Crazy In Love Beyoncé         5351   70.1   99.3         72      235933
  2 Naughty Girl  Beyoncé         5351   64.3  100.0         59      208600
  3 Baby Boy      Beyoncé         5351   77.4   91.0         57      244867
  4 Hip Hop Star  Beyoncé         5351   96.8  167.          39      222533
  5 Be With You   Beyoncé         5351   75.6   74.9         42      260160
  6 Me, Myself a… Beyoncé         5351   55.5   83.6         54      301173
  7 Yes           Beyoncé         5351   56.2  112.          43      259093
  8 Signs         Beyoncé         5351   39.8   74.3         41      298533
  9 Speechless    Beyoncé         5351    9.92 113.          41      360440
 # ... with 206 more rows

Let's practice!
