Supervised Learning in R: Regression
Nina Zumel and John Mount
Win-Vector LLC
A measure of how well the model fits or explains the data
$R^2$ is the fraction of the variance in the data that is explained by the model.
$$ R^2 = 1 - \frac{RSS}{SS_{Tot}} $$
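where
$RSS = \sum (y - \hat{y})^2$ is the residual sum of squares (the variation the model leaves unexplained), with $y$ the actual outcome and $\hat{y}$ the prediction
$SS_{Tot} = \sum (y - \overline{y})^2$ is the total sum of squares (the total variation of the data around its mean $\overline{y}$)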
err <- houseprices$prediction - houseprices$price
rss <- sum(err^2)
price
: column of actual sale prices (in thousands)

prediction
: column of predicted sale prices (in thousands)

toterr <- houseprices$price - mean(houseprices$price)
sstot <- sum(toterr^2)
(r_squared <- 1 - (rss/sstot) )
0.8092278
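The same calculation can be wrapped in a small helper so that it applies to any vector of predictions. The sketch below is an illustration only: the function name `r_squared` is hypothetical, and R's built-in `cars` dataset stands in for `houseprices`, which is not reproduced here.

```r
# Hypothetical helper: R-squared from predictions and actual outcomes
r_squared <- function(prediction, actual) {
  rss   <- sum((prediction - actual)^2)      # residual sum of squares
  sstot <- sum((actual - mean(actual))^2)    # total sum of squares
  1 - rss / sstot
}

# Stand-in example (assumption): built-in cars data instead of houseprices
cmodel    <- lm(dist ~ speed, data = cars)
cars_pred <- predict(cmodel, newdata = cars)

r2_manual  <- r_squared(cars_pred, cars$dist)
r2_summary <- summary(cmodel)$r.squared

# For a least-squares linear model scored on its training data, the two agree
all.equal(r2_manual, r2_summary)
```

Keeping the calculation in a function makes it easy to score predictions on data the model was not trained on, where `summary()` offers no equivalent.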
# From summary()
summary(hmodel)
...
Residual standard error: 60.66 on 37 degrees of freedom
Multiple R-squared: 0.8092, Adjusted R-squared: 0.7989
F-statistic: 78.47 on 2 and 37 DF, p-value: 4.893e-14
summary(hmodel)$r.squared
0.8092278
# From glance()
glance(hmodel)$r.squared
0.8092278
rho <- cor(houseprices$prediction, houseprices$price)
0.8995709
rho^2
0.8092278
cor(prediction, price) = 0.8995709
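The match between $\rho^2$ and $R^2$ is a property of linear models fit by least squares with an intercept and evaluated on their training data. As a hedged sketch, the check below reproduces it on R's built-in `cars` dataset, which is an assumed stand-in for `houseprices`. On new data, or for other kinds of models, the two quantities are not guaranteed to agree.

```r
# Stand-in data (assumption): built-in cars dataset instead of houseprices
cmodel    <- lm(dist ~ speed, data = cars)
cars_pred <- predict(cmodel, newdata = cars)

# Squared correlation between predictions and actual outcomes
rho    <- cor(cars_pred, cars$dist)
rho_sq <- rho^2

# R-squared as reported by the model summary
r2 <- summary(cmodel)$r.squared

# The two values agree up to floating-point tolerance
all.equal(rho_sq, r2)
```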