Introduction to Statistics in R
Maggie Matsui
Content Developer, DataCamp
x doesn't tell us anything about y
x increases, y increases
x increases, y decreases
# Scatterplot of y against x (requires ggplot2)
library(ggplot2)
ggplot(df, aes(x, y)) +
  geom_point()
ggplot(df, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
cor(df$x, df$y)
-0.7472765
cor(df$y, df$x)
-0.7472765
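The two calls above return the same value because correlation does not depend on argument order. A minimal sketch of that property, assuming simulated data (`df_demo` is a hypothetical stand-in, not the `df` from the slides):

```r
# Hypothetical data: y tends to fall as x rises, plus noise
set.seed(42)
df_demo <- data.frame(x = rnorm(100))
df_demo$y <- -0.8 * df_demo$x + rnorm(100, sd = 0.5)

r_xy <- cor(df_demo$x, df_demo$y)
r_yx <- cor(df_demo$y, df_demo$x)

all.equal(r_xy, r_yx)  # TRUE: swapping x and y gives the same correlation
r_xy < 0               # TRUE: as x increases, y decreases
```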
df$x
-3.2508382 -9.1599807 3.4515013 4.1505899 NA 11.9806140 ...
cor(df$x, df$y)
NA
cor(df$x, df$y, use = "pairwise.complete.obs")
-0.7471757
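To see what `use = "pairwise.complete.obs"` is doing, here is a small sketch with made-up vectors (the values are illustrative only). For two vectors, it should match dropping every pair in which either value is missing:

```r
# Illustrative vectors with one missing x value
x <- c(1, 2, 3, NA, 5, 6)
y <- c(12, 9, 8, 7, 3, 1)

r_default <- cor(x, y)  # NA: by default a missing value propagates to the result

# Drop the incomplete pair by hand, then correlate what is left
keep <- complete.cases(x, y)
r_manual <- cor(x[keep], y[keep])

# Let cor() skip the incomplete pair itself
r_pairwise <- cor(x, y, use = "pairwise.complete.obs")

all.equal(r_manual, r_pairwise)  # TRUE
```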
$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$
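The formula can be translated term by term into R and checked against `cor()`. This is only a sketch: `pearson_r` is a hypothetical helper name, the data are made up, and it assumes no missing values.

```r
# Hypothetical helper: Pearson's r computed directly from the formula
pearson_r <- function(x, y) {
  stopifnot(length(x) == length(y))
  dx <- x - mean(x)  # deviations of x from its mean
  dy <- y - mean(y)  # deviations of y from its mean
  sum(dx * dy) / (sqrt(sum(dx^2)) * sqrt(sum(dy^2)))
}

# Illustrative data
x <- c(1, 2, 3, 4, 5)
y <- c(10, 8, 9, 4, 2)

all.equal(pearson_r(x, y), cor(x, y))  # TRUE
```

In practice `cor()` is the tool to use; the manual version is only useful for seeing where the number comes from.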