Join together for fun

Programming with dplyr

Dr. Chester Ismay

Educator, Data Scientist, and R/Python Consultant

dplyr join diagrams

Left join

Left join diagram

Inner join

Inner join diagram

Anti join

Anti join diagram
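
The three joins pictured above can be sketched on two toy tables. The tables `left_tbl` and `right_tbl` and their columns are hypothetical and are not part of the course data; this is only a minimal illustration of which rows each join keeps.

```r
library(dplyr)

# Hypothetical toy tables, not from the course data
left_tbl  <- tibble(id = c(1, 2, 3), left_val  = c("a", "b", "c"))
right_tbl <- tibble(id = c(2, 3, 4), right_val = c("x", "y", "z"))

# Left join: every row of left_tbl, with NA where right_tbl has no match
left_result <- left_join(left_tbl, right_tbl, by = "id")

# Inner join: only rows whose id appears in both tables
inner_result <- inner_join(left_tbl, right_tbl, by = "id")

# Anti join: rows of left_tbl with no match in right_tbl; no columns are added
anti_result <- anti_join(left_tbl, right_tbl, by = "id")
```

Here `left_result` has three rows, `inner_result` has two (ids 2 and 3), and `anti_result` has one (id 1) with only the columns of `left_tbl`.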

Some IMF data for Uruguay

library(dplyr)  # provides %>%, select(), filter(), and the join verbs

uruguay_imf <- imf_data %>%
  select(iso, 
         country, 
         year, 
         consumer_price_index) %>%
  filter(country == "Uruguay", year > 2010)
uruguay_imf
# A tibble: 9 x 4
  iso   country  year consumer_price_index
  <chr> <chr>   <int>                <dbl>
1 URY   Uruguay  2011                 105.
2 URY   Uruguay  2012                 114.
3 URY   Uruguay  2013                 123.
4 URY   Uruguay  2014                 134.
5 URY   Uruguay  2015                 146.
6 URY   Uruguay  2016                 160.
7 URY   Uruguay  2017                 170.
8 URY   Uruguay  2018                 183.
9 URY   Uruguay  2019                 197.

Some World Bank data for Uruguay

uruguay_wb <- world_bank_data %>%
  select(iso, country, year, perc_rural_pop) %>%
  filter(country == "Uruguay")
uruguay_wb
# A tibble: 4 x 4
  iso   country  year perc_rural_pop
  <chr> <chr>   <dbl>          <dbl>
1 URY   Uruguay  2013           5.16
2 URY   Uruguay  2014           5.06
3 URY   Uruguay  2015           4.96
4 URY   Uruguay  2016           4.86

Left join on Uruguayan tibbles

uruguay_imf %>% 
    left_join(uruguay_wb)
Joining, by = c("iso", "country", "year")
# A tibble: 9 x 5
  iso   country  year consumer_price_index perc_rural_pop
  <chr> <chr>   <dbl>                <dbl>          <dbl>
1 URY   Uruguay  2011                 105.          NA   
2 URY   Uruguay  2012                 114.          NA   
3 URY   Uruguay  2013                 123.           5.16
4 URY   Uruguay  2014                 134.           5.06
5 URY   Uruguay  2015                 146.           4.96
6 URY   Uruguay  2016                 160.           4.86
7 URY   Uruguay  2017                 170.          NA   
8 URY   Uruguay  2018                 183.          NA   
9 URY   Uruguay  2019                 197.          NA
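
The "Joining, by = ..." message appears because no `by` argument was given, so dplyr joins on every column name the two tibbles share. A minimal sketch of the same join with the keys named explicitly follows. The two tibbles are rebuilt by hand from the printed output, so the CPI values are the rounded figures shown in the console, not the underlying IMF data.

```r
library(dplyr)

# Rebuilt by hand from the printed output; CPI values are the rounded
# figures shown in the console, not the underlying IMF data.
uruguay_imf <- tibble(
  iso = "URY",
  country = "Uruguay",
  year = 2011:2019,
  consumer_price_index = c(105, 114, 123, 134, 146, 160, 170, 183, 197)
)
uruguay_wb <- tibble(
  iso = "URY",
  country = "Uruguay",
  year = c(2013, 2014, 2015, 2016),
  perc_rural_pop = c(5.16, 5.06, 4.96, 4.86)
)

# Naming the keys explicitly suppresses the "Joining, by = ..." message
joined <- uruguay_imf %>%
  left_join(uruguay_wb, by = c("iso", "country", "year"))

# year is <int> on the left and <dbl> on the right; the joined column is <dbl>
class(joined$year)
```

Stating `by` also guards against an accidental join on a column that merely happens to share a name in both tables.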

Inner join on Uruguayan tibbles

uruguay_imf %>%
    inner_join(uruguay_wb,
               by = c("iso", "country", "year"))
# A tibble: 4 x 5
  iso   country  year consumer_price_index perc_rural_pop
  <chr> <chr>   <dbl>                <dbl>          <dbl>
1 URY   Uruguay  2013                 123.           5.16
2 URY   Uruguay  2014                 134.           5.06
3 URY   Uruguay  2015                 146.           4.96
4 URY   Uruguay  2016                 160.           4.86

Anti join on Uruguayan tibbles

uruguay_imf %>%
    anti_join(uruguay_wb,
              by = c("iso", "country", "year"))
# A tibble: 5 x 4
  iso   country  year consumer_price_index
  <chr> <chr>   <int>                <dbl>
1 URY   Uruguay  2011                 105.
2 URY   Uruguay  2012                 114.
3 URY   Uruguay  2017                 170.
4 URY   Uruguay  2018                 183.
5 URY   Uruguay  2019                 197.
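
The inner join kept 4 of the IMF rows and the anti join kept the other 5, which together account for all 9. A small sketch of that check, using only the key columns copied from the printed tibbles; the names `imf_keys`, `wb_keys`, `matched`, and `unmatched` are hypothetical. The split is this clean only because each key appears at most once in the World Bank table; duplicate keys there would repeat rows in the inner join.

```r
library(dplyr)

# Key columns only, copied from the printed tibbles
imf_keys <- tibble(iso = "URY", country = "Uruguay", year = 2011:2019)
wb_keys  <- tibble(iso = "URY", country = "Uruguay",
                   year = c(2013, 2014, 2015, 2016))

keys <- c("iso", "country", "year")

matched   <- inner_join(imf_keys, wb_keys, by = keys)  # rows with a match
unmatched <- anti_join(imf_keys, wb_keys, by = keys)   # rows without a match

# Every IMF row lands in exactly one of the two results
nrow(matched) + nrow(unmatched) == nrow(imf_keys)

# Years present in the IMF data but missing from the World Bank data
unmatched$year
```

Running an anti join first is a quick way to see which rows a left join would fill with NA.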

Let's practice!
