Assessing model fit with R-squared

Modeling with Data in the Tidyverse

Albert Y. Kim

Assistant Professor of Statistical and Data Sciences

R-squared

$R^2 = 1 - \frac{\text{Var}(\text{residuals})}{\text{Var}(y)}$

  • $R^2$ is between 0 and 1
  • Smaller $R^2$ ~ "poorer fit"
  • $R^2 = 1$ ~ "perfect fit" and $R^2 = 0$ ~ "no fit"
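
As a minimal sketch of the definition above, the following R snippet computes $R^2$ by hand. The built-in `mtcars` dataset and the model `mpg ~ wt` are illustrative assumptions only; neither appears in these slides.

```r
# Illustrative assumption: built-in mtcars data, not the course's house_prices
fit <- lm(mpg ~ wt, data = mtcars)

var_residuals <- var(residuals(fit))  # variance of the residuals
var_y <- var(mtcars$mpg)              # variance of the outcome variable

# R-squared = 1 - Var(residuals) / Var(y)
r_squared <- 1 - var_residuals / var_y
print(r_squared)

# For a least-squares fit with an intercept, the value lies between 0 and 1
stopifnot(r_squared >= 0, r_squared <= 1)
```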

High R-squared value example

$R^2 = 1 - \frac{\text{Var}(\text{residuals})}{\text{Var}(y)}$


High R-squared value: "Perfect" fit

$R^2 = 1 - \frac{\text{Var}(\text{residuals})}{\text{Var}(y)}$
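
A "perfect" fit can be sketched with made-up data in which the outcome is an exact linear function of the predictor. The vectors `x` and `y` below are hypothetical and chosen only so that every residual is zero (up to floating-point error), which drives the ratio in the formula to 0.

```r
# Hypothetical data: y is an exact linear function of x
x <- 1:20
y <- 2 + 3 * x
perfect_fit <- lm(y ~ x)

# Var(residuals) is essentially 0, so R-squared is essentially 1
r_squared_perfect <- 1 - var(residuals(perfect_fit)) / var(y)
print(r_squared_perfect)
```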


Low R-squared value example

$R^2 = 1 - \frac{\text{Var}(\text{residuals})}{\text{Var}(y)}$
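
At the other extreme, a low $R^2$ can be sketched by regressing an outcome on a predictor it has nothing to do with. The simulated vectors, the sample size, and the seed below are all arbitrary assumptions for illustration.

```r
set.seed(76)   # arbitrary seed, for reproducibility only
n <- 1000
x <- rnorm(n)
y <- rnorm(n)  # generated independently of x
noise_fit <- lm(y ~ x)

# The residuals are almost as variable as y itself, so R-squared is close to 0
r_squared_noise <- 1 - var(residuals(noise_fit)) / var(y)
print(r_squared_noise)
```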



Numerical interpretation

Since $\text{Var}(y) \geq \text{Var}(\text{residuals})$ and

$R^2 = 1 - \frac{\text{Var}(\text{residuals})}{\text{Var}(y)} = \frac{\text{Var}(y) - \text{Var}(\text{residuals})}{\text{Var}(y)}$

$R^2$'s interpretation is: the proportion of the total variation in the outcome variable $y$ that the model explains.
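
One way to see why the numerator counts as "explained" variation: for a least-squares fit with an intercept, the variance of $y$ splits exactly into the variance of the fitted values plus the variance of the residuals. The sketch below checks this numerically; the `mtcars` dataset and the model formula are illustrative assumptions, not part of the course material.

```r
# Illustrative assumption: built-in mtcars data and an arbitrary model
fit <- lm(mpg ~ wt + hp, data = mtcars)

var_y <- var(mtcars$mpg)
var_fitted <- var(fitted(fit))        # variation captured by the model
var_residuals <- var(residuals(fit))  # variation left unexplained

# Var(y) = Var(fitted) + Var(residuals), up to floating-point error
print(all.equal(var_y, var_fitted + var_residuals))

# Two equivalent ways to write R-squared
r_squared_a <- 1 - var_residuals / var_y
r_squared_b <- var_fitted / var_y
print(c(r_squared_a, r_squared_b))
```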


Computing R-squared

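# dplyr provides %>% and summarize(); moderndive provides
# house_prices and get_regression_points()
library(dplyr)
library(moderndive)

# Model 1: price as a function of size and year built
# (assumes log10_price and log10_size were already added to house_prices)
model_price_1 <- lm(log10_price ~ log10_size + yr_built,
                    data = house_prices)

get_regression_points(model_price_1) %>%
  summarize(r_squared = 1 - var(residual) / var(log10_price))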
# A tibble: 1 x 1
  r_squared
      <dbl>
1     0.483
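
The same calculation can be cross-checked in base R against the value reported by `summary()`. The sketch below is one way to do so under stated assumptions: `r_squared_by_hand()` is a hypothetical helper (not part of moderndive), and the `mtcars` model is a stand-in so the snippet runs without the course data.

```r
# Hypothetical helper (not part of moderndive): R-squared from any lm object
r_squared_by_hand <- function(model) {
  y <- model.response(model.frame(model))  # observed outcome values
  1 - var(residuals(model)) / var(y)
}

# Stand-in model on built-in data (illustrative assumption)
fit <- lm(mpg ~ wt + hp, data = mtcars)

by_hand <- r_squared_by_hand(fit)
built_in <- summary(fit)$r.squared  # R-squared as reported by base R
print(c(by_hand = by_hand, built_in = built_in))
```

With the course objects loaded, `r_squared_by_hand(model_price_1)` should apply the same formula as the `summarize()` call above.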

Computing R-squared

# Model 3: price as a function of size and condition
model_price_3 <- lm(log10_price ~ log10_size + condition,
                    data = house_prices)

get_regression_points(model_price_3) %>%
  summarize(r_squared = 1 - var(residual)/var(log10_price))
# A tibble: 1 x 1
  r_squared
      <dbl>
1     0.462
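
Comparing the two outputs, Model 1 (0.483) explains a slightly larger proportion of the variation in `log10_price` than Model 3 (0.462). A side-by-side comparison like this could be sketched as follows; the two `mtcars` models and their names are stand-ins chosen for illustration, not the course's models.

```r
# Stand-in models on built-in data (illustrative assumption)
models <- list(
  wt_and_hp  = lm(mpg ~ wt + hp, data = mtcars),
  wt_and_cyl = lm(mpg ~ wt + factor(cyl), data = mtcars)
)

# Apply R-squared = 1 - Var(residuals) / Var(y) to each model
r_squared_values <- sapply(models, function(model) {
  y <- model.response(model.frame(model))
  1 - var(residuals(model)) / var(y)
})
print(round(r_squared_values, 3))

# Name of the model with the larger R-squared
best_model <- names(which.max(r_squared_values))
print(best_model)
```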

Let's practice!
