Lines that intersect are without parallel

Programming with dplyr

Dr. Chester Ismay

Educator, Data Scientist, and R/Python Consultant

Set theory clauses

  • Compare and combine data from two sources

  • dplyr has several functions to perform set theory clauses on tibbles

Venn diagrams for set theory

Intersect Venn

Union Venn

Union All Venn

Setdiff Venn
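
As a rough illustration of the four operations pictured in the Venn diagrams, the sketch below applies each one to two small tibbles. The tibbles `x` and `y` and their column `n` are hypothetical and not part of the course data.

```r
library(dplyr)

# Hypothetical toy tibbles with a single shared column
x <- tibble(n = c(1, 2, 3))
y <- tibble(n = c(2, 3, 4))

in_both   <- intersect(x, y)  # rows found in both x and y
in_either <- union(x, y)      # rows found in x or y, duplicates dropped
stacked   <- union_all(x, y)  # rows of x followed by rows of y, duplicates kept
only_in_x <- setdiff(x, y)    # rows found in x but not in y

print(in_both)
print(in_either)
print(stacked)
print(only_in_x)
```

Unlike the other three, setdiff() depends on argument order: setdiff(y, x) would return the rows found only in y.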

intersect diagram

Uruguay tibbles

uruguay_imf
# A tibble: 9 x 4
  iso   country  year consumer_price_index
  <chr> <chr>   <int>                <dbl>
1 URY   Uruguay  2011                 105.
2 URY   Uruguay  2012                 114.
3 URY   Uruguay  2013                 123.
4 URY   Uruguay  2014                 134.
5 URY   Uruguay  2015                 146.
6 URY   Uruguay  2016                 160.
7 URY   Uruguay  2017                 170.
8 URY   Uruguay  2018                 183.
9 URY   Uruguay  2019                 197.

uruguay_wb
# A tibble: 4 x 4
  iso   country  year perc_rural_pop
  <chr> <chr>   <dbl>          <dbl>
1 URY   Uruguay  2013           5.16
2 URY   Uruguay  2014           5.06
3 URY   Uruguay  2015           4.96
4 URY   Uruguay  2016           4.86

Trying out intersect()

intersect(uruguay_imf, uruguay_wb)
Error: not compatible: 
not compatible: 
- Cols in y but not x: `perc_rural_pop`.
- Cols in x but not y: `consumer_price_index`.

intersect(uruguay_imf$year, uruguay_wb$year)
[1] 2013 2014 2015 2016
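
The error message shows that the two tibbles do not share the same columns. One possible workaround, sketched here, is to keep only the columns both sources have in common before calling intersect(). The tibbles `imf_keys` and `wb_keys` are hypothetical reconstructions holding just the iso, country, and year columns shown earlier, with year stored as an integer in both.

```r
library(dplyr)

# Hypothetical reconstructions of the shared columns only
# (equivalent to select(uruguay_imf, iso, country, year), and likewise for uruguay_wb)
imf_keys <- tibble(iso = "URY", country = "Uruguay", year = 2011:2019)
wb_keys  <- tibble(iso = "URY", country = "Uruguay", year = 2013:2016)

# With identical column sets, intersect() can compare whole rows
common_rows <- intersect(imf_keys, wb_keys)
print(common_rows)
```

Under these assumptions the result keeps one row per year that appears in both sources, which matches the years returned by the vector version of intersect().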

Difference between intersect() and a join

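  • intersect() looks for entire rows that the two tibbles have in common
  • inner_join() looks for matching values in the key columns only

This distinction matters: intersect() needs both tibbles to have the same columns and compares every column of a row, while inner_join() matches on the keys and keeps the remaining columns from both tibbles.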
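
A minimal sketch of the difference, using two hypothetical tibbles `a` and `b` that agree on the key `id` but disagree on one value in column `v`:

```r
library(dplyr)

# Hypothetical toy tibbles: same keys, one differing value in v
a <- tibble(id = c(1, 2), v = c(10, 20))
b <- tibble(id = c(1, 2), v = c(10, 99))

# intersect() compares whole rows, so only the row (1, 10) is shared
shared_rows <- intersect(a, b)

# inner_join() matches on id alone, so both ids are kept and the
# non-key columns from each side appear as v.x and v.y
joined <- inner_join(a, b, by = "id")

print(shared_rows)
print(joined)
```

Here the row with id 2 survives the join but not the intersection, because its values of v differ between the two tibbles.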

Let's practice!
