Modeling with Data in the Tidyverse
Albert Y. Kim
Assistant Professor of Statistical and Data Sciences
$$ y = f(\vec{x}) + \epsilon $$
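Where $y$ is the outcome variable, $\vec{x}$ is the set of explanatory (predictor) variables, $f()$ is the function relating them, and $\epsilon$ is the unmodeled error term.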
Consider $y = f(\vec{x}) + \epsilon$.
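To make the decomposition concrete, here is a minimal simulation sketch in base R. The function `f`, the sample size, the seed, and the noise level are all hypothetical choices made for illustration; none of them come from the `evals` data.

```r
# Simulate y = f(x) + epsilon with made-up ingredients
set.seed(76)                               # arbitrary seed, for reproducibility only
n <- 200                                   # hypothetical sample size
x <- runif(n, min = 30, max = 70)          # hypothetical explanatory variable
f <- function(x) 5 - 0.01 * x              # hypothetical "signal"
epsilon <- rnorm(n, mean = 0, sd = 0.5)    # hypothetical "noise"
y <- f(x) + epsilon                        # observed outcome

sim <- data.frame(x = x, y = y)

# Subtracting the signal from the outcome leaves only the noise
noise_recovered <- sim$y - f(sim$x)
all.equal(noise_recovered, epsilon)
```

In real data only `x` and `y` are observed; `f` and `epsilon` are unknown, which is why a model is needed to estimate the signal.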
library(ggplot2)
library(dplyr)
library(moderndive)
ggplot(evals, aes(x = age, y = score)) +
  geom_point() +
  labs(x = "age", y = "score",
       title = "Teaching score over age")
# Use geom_jitter() instead of geom_point()
ggplot(evals, aes(x = age, y = score)) +
  geom_jitter() +
  labs(x = "age", y = "score",
       title = "Teaching score over age (jittered)")
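Jittering matters when many observations share the same coordinates and plot on top of each other. The base R sketch below illustrates the idea with made-up ages: a small random offset is added for display only, so the underlying data are unchanged. The offset of 0.4 is an assumption chosen to resemble a jitter of roughly 40% of the spacing between integer values; it is not taken from the plot above.

```r
# Made-up ages with exact repeats, as happens with integer-valued data
age <- c(36, 36, 36, 59, 59, 51)

set.seed(1)                                          # arbitrary seed
noise <- runif(length(age), min = -0.4, max = 0.4)   # small display-only offset
age_jittered <- age + noise

# Original values contain duplicates; jittered values are spread apart
has_overlap_before <- anyDuplicated(age) > 0
has_overlap_after  <- anyDuplicated(age_jittered) > 0

# Each jittered value stays within 0.4 of its original value
max_shift <- max(abs(age_jittered - age))
```

Because the offsets are random, a jittered plot changes slightly on every redraw unless a seed is set, and summary statistics should always be computed on the original values.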
evals %>%
  summarize(correlation = cor(score, age))
# A tibble: 1 x 1
  correlation
        <dbl>
1      -0.107
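The correlation coefficient summarizes the strength of the linear relationship between two numerical variables and always lies between -1 and 1. As a rough sketch of what `cor()` computes, the base R code below reproduces the Pearson formula by hand on a small made-up data frame (`toy` and its values are hypothetical, not rows from `evals`).

```r
# Hypothetical toy data, not taken from evals
toy <- data.frame(
  age   = c(36, 59, 51, 40, 31, 62),
  score = c(4.7, 4.1, 3.9, 4.8, 4.6, 4.3)
)

# Deviations from the mean for each variable
x_dev <- toy$age - mean(toy$age)
y_dev <- toy$score - mean(toy$score)

# Pearson correlation: sum of cross-products over the product of spreads
r_manual  <- sum(x_dev * y_dev) / sqrt(sum(x_dev^2) * sum(y_dev^2))
r_builtin <- cor(toy$score, toy$age)

all.equal(r_manual, r_builtin)
```

The order of the arguments does not matter: `cor(score, age)` and `cor(age, score)` give the same value.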