Introduction to linear regression

A/B Testing in R

Lauryn Burleigh

Data Scientist

Regression

  • A/B design - time to eat Cheese or Pepperoni pizza
  • Regression - factors that impact eating time
    • Dependent variable - time to eat
    • Independent variables - enjoyment, hunger, visual appeal

Positive correlation of points with time to eat pizza on the y-axis and enjoyment on the x-axis.
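
A minimal sketch of how this regression might be set up in R. It assumes a data frame named pizza with columns Time and Enjoy, the names used in the plotting code later in these slides. The values are simulated for illustration only and are not the course data.

```r
# Simulated stand-in for the pizza data (hypothetical values).
set.seed(42)
pizza <- data.frame(Enjoy = runif(50, min = 0, max = 20))
pizza$Time <- 5 + 0.1 * pizza$Enjoy + rnorm(50, sd = 0.5)

# Time is the dependent variable; Enjoy is the independent variable.
model <- lm(Time ~ Enjoy, data = pizza)

# Fitted coefficients: the intercept and the slope for Enjoy.
print(coef(model))
```

Further independent variables such as hunger or visual appeal would be added on the right-hand side of the formula, separated by +.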

Regression line

  • Regression line - prediction of y
  • y = β₀ + β₁X₁ + ε, with fitted line ŷ = β₀ + β₁X₁
    • β₀ - y-intercept
    • β₁ - slope
    • ε - error (observed y minus predicted ŷ)
  • Reduce error
    • Multiple regression - 2+ independent variables
    • y = β₀ + β₁X₁ + β₂X₂ + ε

Positive correlation of points with time to eat pizza on the y-axis and enjoyment on the x-axis with a red line of best fit and purple indication of the y-intercept, indication of the slope, and indication of a residual.
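
The pieces of the regression line can be read off a fitted model. The sketch below is illustrative only: the data are simulated, and Hunger is an assumed column name standing in for a second independent variable. It extracts the intercept and slope, computes the residuals by hand, and compares the leftover error of a simple and a multiple regression.

```r
# Simulated data (hypothetical values, not the course data).
set.seed(1)
n <- 100
pizza <- data.frame(
  Enjoy  = runif(n, 0, 20),
  Hunger = runif(n, 0, 10)  # assumed column name
)
pizza$Time <- 5 + 0.1 * pizza$Enjoy + 0.3 * pizza$Hunger + rnorm(n, sd = 0.5)

# Simple regression: one independent variable.
simple <- lm(Time ~ Enjoy, data = pizza)
b0 <- coef(simple)[["(Intercept)"]]  # y-intercept
b1 <- coef(simple)[["Enjoy"]]        # slope

# Residual = observed y minus predicted y-hat.
y_hat <- b0 + b1 * pizza$Enjoy
res <- pizza$Time - y_hat
stopifnot(isTRUE(all.equal(res, unname(residuals(simple)))))

# Multiple regression: a second independent variable is added.
multiple <- lm(Time ~ Enjoy + Hunger, data = pizza)

# Residual sum of squares for each model. On the data used for
# fitting, adding a predictor cannot increase this value.
rss_simple <- sum(residuals(simple)^2)
rss_multiple <- sum(residuals(multiple)^2)
print(c(simple = rss_simple, multiple = rss_multiple))
```

A lower residual sum of squares on the fitting data does not by itself show that the extra variable is meaningful, so the choice of variables still needs to be justified.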

Predicting data

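library(ggplot2)

# Predicted time at Enjoy = 15: intercept 5.32 plus slope 0.08 times 15
yhat <- 5.32 + 0.08 * 15

# Scatter plot with the predicted time as a horizontal line
# and the enjoyment score of 15 as a vertical line
ggplot(pizza, aes(x = Enjoy, y = Time)) +
  geom_point() +
  geom_hline(yintercept = yhat) +
  geom_vline(xintercept = 15)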

Positive correlation of points with time to eat pizza on the y-axis and enjoyment on the x-axis with a horizontal line at time of 6.5 and vertical line at enjoyment of 15.
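
The prediction is the regression line evaluated at a chosen x. The sketch below first repeats the arithmetic with the coefficients shown on the slide, then shows that predict() gives the same answer as the hand calculation for a fitted model. The fitted model uses simulated data (hypothetical values), so its coefficients will differ slightly from 5.32 and 0.08.

```r
# Coefficients quoted on the slide: intercept 5.32, slope 0.08.
b0 <- 5.32
b1 <- 0.08

# Predicted time to eat for an enjoyment score of 15.
yhat <- b0 + b1 * 15
print(yhat)  # 6.52

# Simulated data for a fitted model (not the course data).
set.seed(7)
pizza <- data.frame(Enjoy = runif(40, 0, 20))
pizza$Time <- 5.32 + 0.08 * pizza$Enjoy + rnorm(40, sd = 0.3)
model <- lm(Time ~ Enjoy, data = pizza)

# predict() evaluates the fitted line at the new x value.
by_predict <- predict(model, newdata = data.frame(Enjoy = 15))
by_hand <- coef(model)[["(Intercept)"]] + coef(model)[["Enjoy"]] * 15
stopifnot(isTRUE(all.equal(unname(by_predict), by_hand)))
```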

Regression considerations

  • Correlation does not imply causation
  • Choose variables that are reasonable to assess
  • Consider the decisions and actions the results will lead to
  • Use quality data

Individual looking at multiple charts regarding data analyses.

Let's practice!
