Introduction to Statistics in R
Maggie Matsui
Content Developer, DataCamp

$$r = 0.18$$
What we see:

What the correlation coefficient sees:

Correlation shouldn't be used blindly
cor(df$x, df$y)
0.1786163
Always visualize your data
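
To see why a single number can mislead, here is a minimal sketch using made-up data. The vectors x and y are hypothetical and are not the data behind the r = 0.18 example. y is fully determined by x, yet the Pearson correlation is close to zero because the relationship is not linear.

```r
# Hypothetical data: y depends on x exactly, but along a parabola
x <- seq(-3, 3, by = 0.1)
y <- x^2

# Pearson correlation only measures the strength of a linear relationship
r <- cor(x, y)
print(r)  # very close to 0

# A scatter plot reveals the pattern that the coefficient misses
plot(x, y)
```

The plot shows a clear U shape, which is the kind of structure you would only catch by looking at the data.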

msleep
   name                       vore  sleep_total awake  bodywt
 1 Cheetah                    carni        12.1  11.9  50    
 2 Owl monkey                 omni         17     7     0.48 
 3 Mountain beaver            herbi        14.4   9.6   1.35 
 4 Greater short-tailed shrew omni         14.9   9.1   0.019
 5 Cow                        herbi         4    20   600    
 6 Three-toed sloth           herbi        14.4   9.6   3.85 
 ... 

cor(msleep$bodywt, msleep$awake)
0.3119801

msleep %>%
  mutate(log_bodywt = log(bodywt)) %>%
  ggplot(aes(log_bodywt, awake)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
cor(log(msleep$bodywt), msleep$awake)
0.5687943
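
The same effect can be sketched on simulated data. Everything below is an assumption made for illustration (the seed, the sample size, and the way y is generated); it is not msleep. x is heavily right-skewed and y is linear in log(x), so the correlation should be noticeably stronger after the log transformation.

```r
set.seed(42)

# Hypothetical skewed variable, loosely resembling body weights
x <- exp(rnorm(200, mean = 0, sd = 2))

# y is linear in log(x) plus noise, so it is not linear in x itself
y <- 2 * log(x) + rnorm(200)

r_raw <- cor(x, y)       # correlation on the original scale
r_log <- cor(log(x), y)  # correlation after log-transforming x

print(c(raw = r_raw, log = r_log))
```

Comparing the two values shows how much of the relationship the raw scale hides when a few very large values dominate x.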

Log transformation (log(x))
Square root transformation (sqrt(x))
Reciprocal transformation (1 / x)
Combinations of these, e.g.:
log(x) and log(y)
sqrt(x) and 1 / y

x is correlated with y does not mean x causes y
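
One way a correlation can arise without causation is a third variable that drives both x and y. The sketch below simulates that situation; the variable z, the seed, and the sample size are all hypothetical. x and y never influence each other, but they are clearly correlated until the part explained by z is removed.

```r
set.seed(1)
n <- 1000

z <- rnorm(n)      # hypothetical third variable
x <- z + rnorm(n)  # x depends on z only
y <- z + rnorm(n)  # y depends on z only

# x and y are correlated even though neither causes the other
r_xy <- cor(x, y)

# Correlate what is left of x and y after regressing each on z
x_resid <- resid(lm(x ~ z))
y_resid <- resid(lm(y ~ z))
r_given_z <- cor(x_resid, y_resid)

print(c(raw = r_xy, given_z = r_given_z))
```

The raw correlation is moderate, while the correlation of the residuals is near zero. In real data the third variable is often unmeasured, so this check is not always available.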

 

 

 

 

 

 
