The summarize verb

Introduction to the Tidyverse

David Robinson

Chief Data Scientist, DataCamp

Data transformation and visualization

Extracting data

gapminder %>%
  filter(country == "United States", year == 2007)
# A tibble: 1 x 6
        country continent  year lifeExp       pop gdpPercap
          <fct>     <fct> <int>   <dbl>     <dbl>     <dbl>
1 United States  Americas  2007  78.242 301139947  42951.65

The summarize verb

gapminder %>%
  summarize(meanLifeExp = mean(lifeExp))
# A tibble: 1 x 1
  meanLifeExp
        <dbl>
1    59.47444
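
To see what summarize does without loading gapminder, here is a minimal sketch on a small hand-made tibble. The name toy and its values are made up for illustration and are not gapminder figures; the point is only that summarize collapses all rows into a single row.

```r
library(dplyr)

# Hypothetical toy data (made-up values, not from gapminder)
toy <- tibble(
  country = c("A", "B", "C", "D"),
  lifeExp = c(50, 60, 70, 80)
)

# summarize() collapses every row into one row,
# with one column per summary requested
result <- toy %>%
  summarize(meanLifeExp = mean(lifeExp))

result
```

The result has one row and one column, meanLifeExp, no matter how many rows the input had.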

Summarizing one year

gapminder %>%
  filter(year == 2007) %>%
  summarize(meanLifeExp = mean(lifeExp))
# A tibble: 1 x 1
  meanLifeExp
        <dbl>
1    67.00742

Summarizing into multiple columns

gapminder %>%
  filter(year == 2007) %>%
  summarize(meanLifeExp = mean(lifeExp),
            totalPop = sum(pop))
# A tibble: 1 x 2
  meanLifeExp   totalPop
        <dbl>      <dbl>
1    67.00742 6251013179

Functions you can use for summarizing

  • mean
  • sum
  • median
  • min
  • max
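
The functions listed above can be combined in one summarize call, each producing its own column. The sketch below assumes a small made-up tibble named toy (hypothetical values, not gapminder data) and filters to one year before summarizing.

```r
library(dplyr)

# Hypothetical toy data (made-up values, not from gapminder)
toy <- tibble(
  year    = c(2002, 2002, 2007, 2007, 2007),
  lifeExp = c(55, 65, 60, 70, 83),
  pop     = c(10, 20, 12, 22, 30)
)

# Keep only the 2007 rows, then compute several summaries at once
stats <- toy %>%
  filter(year == 2007) %>%
  summarize(medianLifeExp = median(lifeExp),
            minLifeExp = min(lifeExp),
            maxLifeExp = max(lifeExp),
            totalPop = sum(pop))

stats
```

Each argument to summarize is a name = function(column) pair, so adding another statistic only takes one more line.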

Let's practice!
