Intermediate Data Visualization with ggplot2
Rick Scavetta
Founder, Scavetta Academy
| | Cause of Over-plotting | Solutions |
|---|---|---|
| 1. | Large datasets | Alpha-blending, hollow circles, point size |
| 2. | Aligned values on a single axis | As above, plus change position |
| 3. | Low-precision data | Position: jitter |
| 4. | Integer data | Position: jitter |
| | Cause of Over-plotting | Solutions | Here... |
|---|---|---|---|
| 1. | Large datasets | Alpha-blending, hollow circles, point size | |
| 2. | Aligned values on a single axis | As above, plus change position | |
| 3. | Low-precision data | Position: jitter | geom_count() |
| 4. | Integer data | Position: jitter | geom_count() |
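The solutions listed for large datasets (alpha-blending, hollow circles, point size) can be sketched as follows. This is a minimal illustration, not taken from the slides: the `diamonds` dataset that ships with ggplot2 is an assumed stand-in for a "large dataset", and the specific `alpha` and `size` values are arbitrary choices.

```r
library(ggplot2)

# Assumption: diamonds (bundled with ggplot2) stands in for a large dataset;
# the slides do not name one.
base <- ggplot(diamonds, aes(carat, price))

# Alpha-blending combined with a smaller point size
p_alpha <- base + geom_point(alpha = 0.1, size = 0.5)

# Hollow circles: shape 1 draws only the outline of each point
p_hollow <- base + geom_point(shape = 1, size = 0.5)

p_alpha
p_hollow
```

Lower `alpha` values make dense regions appear darker than sparse ones, so the best value depends on how many points overlap.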
library(ggplot2)

p <- ggplot(iris, aes(Sepal.Length,
                      Sepal.Width))
p + geom_point()

p + geom_jitter(alpha = 0.5,
                width = 0.1,
                height = 0.1)

p +
  geom_count()

| geom_ | stat_ |
|---|---|
| geom_count() | stat_sum() |
p +
  stat_sum()
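To see what `geom_count()` and `stat_sum()` actually compute, the layer data can be inspected with `layer_data()`. This is a minimal sketch; the object names `counts` and `counts_stat` are only illustrative.

```r
library(ggplot2)

p <- ggplot(iris, aes(Sepal.Length, Sepal.Width))

# layer_data() returns the data frame the layer draws from.
# stat_sum() adds n (observations at each x/y location) and prop.
counts      <- layer_data(p + geom_count())
counts_stat <- layer_data(p + stat_sum())

# One row per distinct (Sepal.Length, Sepal.Width) pair
head(counts[, c("x", "y", "n", "prop")])

# Both calls should describe the same layer
all.equal(counts, counts_stat)
```

Each row corresponds to one plotted point, and the `n` column is what gets mapped to point size.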

ggplot(iris, aes(Sepal.Length,
                 Sepal.Width,
                 color = Species)) +
  geom_count(alpha = 0.4)
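Once `color = Species` is mapped, the counts are tallied separately for each species, so two species sharing the same coordinates produce two overlapping points rather than one larger point. A small sketch of how to check this (the name `by_species` is illustrative):

```r
library(ggplot2)

p_species <- ggplot(iris, aes(Sepal.Length,
                              Sepal.Width,
                              color = Species)) +
  geom_count(alpha = 0.4)

# One row per distinct (x, y, species) combination
by_species <- layer_data(p_species)

# Locations that are drawn more than once because several species share them
shared <- aggregate(n ~ x + y, data = by_species, FUN = length)
shared[shared$n > 1, ]
```

This is one reason the alpha value matters here: overlapping points from different species remain visible.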

library(AER)
data(Journals)
p <- ggplot(Journals,
            aes(log(price/citations),
                log(subs))) +
  geom_point(alpha = 0.5) +
  labs(...)
p

p +
  geom_quantile(quantiles =
                  c(0.05, 0.50, 0.95))
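The lines drawn by `geom_quantile()` come from quantile regression, which ggplot2 fits through the quantreg package. As a rough sketch of the underlying model, the same three quantiles can be fitted directly with `quantreg::rq()`. This assumes the default linear formula (`y ~ x`) and that both the quantreg and AER packages are installed; the name `fit` is illustrative.

```r
library(quantreg)

# Journals ships with the AER package
data("Journals", package = "AER")

# Linear quantile regression at the 5th, 50th and 95th percentiles
fit <- rq(log(subs) ~ log(price / citations),
          tau = c(0.05, 0.50, 0.95),
          data = Journals)

# One column of intercept and slope per quantile
coef(fit)
```

Comparing the slopes across the three columns indicates whether the spread of `log(subs)` changes along the x axis.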

| geom_ | stat_ |
|---|---|
| geom_count() | stat_sum() |
| geom_quantile() | stat_quantile() |