Stats with geoms

Intermediate Data Visualization with ggplot2

Rick Scavetta

Founder, Scavetta Academy

ggplot2, course 2

  • Statistics
  • Coordinates
  • Facets
  • Data Visualization Best Practices

Statistics layer

  • Two categories of functions
    • Called from within a geom
    • Called independently
  • Named with the stat_ prefix

geom_ <-> stat_

p <- ggplot(iris, aes(x = Sepal.Width))
p + geom_histogram()

[Figure: geom_histogram() output]

geom_ <-> stat_

p <- ggplot(iris, aes(x = Sepal.Width))
p + geom_histogram()
p + geom_bar()

[Figure: geom_histogram() output]
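
The two layers above summarise the same column in different ways. A minimal sketch of how to inspect that, assuming ggplot2 is installed: layer_data() returns the data frame a layer draws after its stat has run. The names hist_data and bar_data are illustrative only.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Width))

# geom_histogram() uses stat_bin(): one row per bin.
hist_data <- layer_data(p + geom_histogram())

# geom_bar() uses stat_count(): one row per unique x value.
bar_data <- layer_data(p + geom_bar())

# Number of bars each layer draws.
nrow(hist_data)
nrow(bar_data)

# Both summaries account for every observation.
sum(hist_data$count)
sum(bar_data$count)
```

Binning suits a continuous variable such as Sepal.Width, whereas counting unique values suits a categorical one.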

geom_ <-> stat_

p <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(am)))
p + geom_bar()
p + stat_count()
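
One way to check that these two calls draw the same bars is to compare the data each layer computes. This is a sketch using layer_data() from ggplot2; the variable names are illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(am)))

# Data drawn by each layer, after the stat has been applied.
bar_data   <- layer_data(p + geom_bar())
count_data <- layer_data(p + stat_count())

# The computed counts (bar heights) should match row for row.
bar_data$count
count_data$count
all.equal(bar_data$count, count_data$count)
```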

The geom_/stat_ connection

stat_           geom_
stat_bin()      geom_histogram(), geom_freqpoly()
stat_count()    geom_bar()
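
The pairings in this table can be read off the layer objects themselves. The sketch below assumes that a layer stores its stat and geom in fields named stat and geom, which holds in current ggplot2 releases but is an implementation detail rather than a documented interface.

```r
library(ggplot2)

# Default stat behind each geom_ function.
hist_stat     <- geom_histogram()$stat
freqpoly_stat <- geom_freqpoly()$stat
bar_stat      <- geom_bar()$stat

class(hist_stat)[1]
class(freqpoly_stat)[1]
class(bar_stat)[1]

# The connection also runs the other way: each stat_ has a default geom.
bin_geom   <- stat_bin()$geom
count_geom <- stat_count()$geom

class(bin_geom)[1]
class(count_geom)[1]
```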

stat_smooth()

ggplot(iris, aes(x = Sepal.Length, 
                 y = Sepal.Width, 
                 color = Species)) + 
  geom_point() +
  geom_smooth()
geom_smooth() using method = 'loess' and 
formula 'y ~ x'
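
To see what stat_smooth() computes for this plot, the smooth layer's data can be pulled out with layer_data(). This is a sketch: the column selection is illustrative, and the count of 80 fitted points per group relies on the default n = 80 of stat_smooth().

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length,
                      y = Sepal.Width,
                      color = Species)) +
  geom_point() +
  geom_smooth()

# Layer 1 is the points, layer 2 is the smooth.
smooth_data <- layer_data(p, i = 2)

# Fitted value (y) and confidence band (ymin, ymax) along a grid of x values.
head(smooth_data[, c("x", "y", "ymin", "ymax", "group")])

# One fitted curve per Species.
table(smooth_data$group)
```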

geom_smooth(se = FALSE)

ggplot(iris, aes(x = Sepal.Length, 
                 y = Sepal.Width, 
                 color = Species)) + 
  geom_point() +
  geom_smooth(se = FALSE)
geom_smooth() using method = 'loess' and 
formula 'y ~ x'

geom_smooth(span = 0.4)

ggplot(iris, aes(x = Sepal.Length, 
                 y = Sepal.Width, 
                 color = Species)) + 
  geom_point() +
  geom_smooth(se = FALSE, span = 0.4)
geom_smooth() using method = 'loess' and 
formula 'y ~ x'
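
The span argument controls how much of the data each local fit uses, so a smaller span gives a wigglier curve. A rough sketch of that effect, calling stats::loess() directly on the whole iris data set (not per Species, as the plot does) and using the model's equivalent number of parameters as a stand-in for wiggliness:

```r
# Default span for loess() is 0.75.
fit_default <- loess(Sepal.Width ~ Sepal.Length, data = iris)

# Narrower window: each local fit uses about 40% of the points.
fit_narrow <- loess(Sepal.Width ~ Sepal.Length, data = iris, span = 0.4)

# Span actually used by each fit.
fit_default$pars$span
fit_narrow$pars$span

# Equivalent number of parameters: higher means a more flexible curve.
fit_default$enp
fit_narrow$enp
```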

geom_smooth(method = "lm")

ggplot(iris, aes(x = Sepal.Length, 
                 y = Sepal.Width, 
                 color = Species)) + 
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
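
With method = "lm", each line should match an ordinary lm() fit on that group's data. The sketch below checks this for one species, assuming group 1 in the layer data corresponds to the first factor level, setosa; the variable names are illustrative.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length,
                      y = Sepal.Width,
                      color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# Points along the drawn line for group 1 (assumed to be setosa).
smooth_data <- layer_data(p, i = 2)
setosa_line <- smooth_data[smooth_data$group == 1, ]
setosa_line <- setosa_line[order(setosa_line$x), ]

# Slope of the drawn line, from its two end points.
n <- nrow(setosa_line)
drawn_slope <- (setosa_line$y[n] - setosa_line$y[1]) /
  (setosa_line$x[n] - setosa_line$x[1])

# Slope from fitting the same model by hand.
fit <- lm(Sepal.Width ~ Sepal.Length,
          data = subset(iris, Species == "setosa"))
fitted_slope <- coef(fit)[["Sepal.Length"]]

all.equal(drawn_slope, fitted_slope)
```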

geom_smooth(fullrange = TRUE)

ggplot(iris, aes(x = Sepal.Length, 
                 y = Sepal.Width, 
                 color = Species)) + 
  geom_point() +
  geom_smooth(method = "lm", 
              fullrange = TRUE)
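
By default each line covers only the x range of its own group; fullrange = TRUE extends it beyond that. A sketch comparing the two for one species, again assuming group 1 is setosa:

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length,
                      y = Sepal.Width,
                      color = Species)) +
  geom_point()

# Smooth layer data without and with fullrange.
clipped  <- layer_data(p + geom_smooth(method = "lm"), i = 2)
extended <- layer_data(p + geom_smooth(method = "lm", fullrange = TRUE), i = 2)

# x range covered by the group 1 line in each case.
clipped_range  <- range(clipped$x[clipped$group == 1])
extended_range <- range(extended$x[extended$group == 1])

clipped_range
extended_range

# For comparison: the x range of the setosa observations themselves.
range(iris$Sepal.Length[iris$Species == "setosa"])
```

Predictions outside a group's own data are extrapolations, so the extended lines deserve more caution than the clipped ones.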

The geom_/stat_ connection

stat_           geom_
stat_bin()      geom_histogram(), geom_freqpoly()
stat_count()    geom_bar()
stat_smooth()   geom_smooth()

Other stat_ functions

stat_             geom_
stat_boxplot()    geom_boxplot()

Other stat_ functions

stat_             geom_
stat_boxplot()    geom_boxplot()
stat_bindot()     geom_dotplot()
stat_bin2d()      geom_bin2d()
stat_binhex()     geom_hex()

Other stat_ functions

stat_             geom_
stat_boxplot()    geom_boxplot()
stat_bindot()     geom_dotplot()
stat_bin2d()      geom_bin2d()
stat_binhex()     geom_hex()
stat_contour()    geom_contour()
stat_quantile()   geom_quantile()
stat_sum()        geom_count()
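
As one example from this table, geom_count() relies on stat_sum() to count how many observations share each x/y position. A sketch on the iris data, with illustrative variable names and the same assumption as before that a layer exposes its stat in a field named stat:

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width))

# stat_sum() adds n (observations at each position) and prop.
count_data <- layer_data(p + geom_count())
head(count_data[, c("x", "y", "n", "prop")])

# One row per distinct position; the counts add up to all observations.
nrow(count_data)
sum(count_data$n)

# The stat behind geom_count().
count_stat <- geom_count()$stat
class(count_stat)[1]
```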

Let's practice!
