Programming with dplyr
Dr. Chester Ismay
Educator, Data Scientist, and R/Python Consultant
Compare and combine data from two sources
dplyr has several functions to perform set operations on tibbles
uruguay_imf
# A tibble: 9 x 4
iso country year consumer_price_index
<chr> <chr> <int> <dbl>
1 URY Uruguay 2011 105.
2 URY Uruguay 2012 114.
3 URY Uruguay 2013 123.
4 URY Uruguay 2014 134.
5 URY Uruguay 2015 146.
6 URY Uruguay 2016 160.
7 URY Uruguay 2017 170.
8 URY Uruguay 2018 183.
9 URY Uruguay 2019 197.
uruguay_wb
# A tibble: 4 x 4
iso country year perc_rural_pop
<chr> <chr> <dbl> <dbl>
1 URY Uruguay 2013 5.16
2 URY Uruguay 2014 5.06
3 URY Uruguay 2015 4.96
4 URY Uruguay 2016 4.86
intersect(uruguay_imf, uruguay_wb)
Error: not compatible:
not compatible:
- Cols in y but not x: `perc_rural_pop`.
- Cols in x but not y: `consumer_price_index`.
intersect(uruguay_imf$year, uruguay_wb$year)
[1] 2013 2014 2015 2016
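One way to make the tibble version of the call succeed is to keep only the columns both sources share before intersecting. The sketch below is an illustration under assumptions, not the course's own solution: the tibbles are rebuilt by hand from the printed output above (so the values are the rounded ones shown), and shared_cols, imf_keys, wb_keys, and common_rows are hypothetical helper names. The year column is converted to integer in the World Bank data so that both sides have the same type.

```r
library(dplyr)

# Stand-ins rebuilt from the printed output above (values are rounded as displayed)
uruguay_imf <- tibble(
  iso = "URY", country = "Uruguay",
  year = 2011:2019,
  consumer_price_index = c(105, 114, 123, 134, 146, 160, 170, 183, 197)
)
uruguay_wb <- tibble(
  iso = "URY", country = "Uruguay",
  year = c(2013, 2014, 2015, 2016),
  perc_rural_pop = c(5.16, 5.06, 4.96, 4.86)
)

# Column names present in both tibbles: iso, country, year
shared_cols <- intersect(names(uruguay_imf), names(uruguay_wb))

# Keep only the shared columns; align the type of year (assumption: integer is wanted)
imf_keys <- uruguay_imf %>% select(all_of(shared_cols))
wb_keys <- uruguay_wb %>%
  select(all_of(shared_cols)) %>%
  mutate(year = as.integer(year))

# Both inputs now have identical columns, so intersect() can compare whole rows
common_rows <- intersect(imf_keys, wb_keys)
common_rows
```

With the columns aligned, the result should contain the four rows for 2013 to 2016, matching the vector result above.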
intersect() looks for rows in common
inner_join() looks for individual key entries matching
This is an important distinction.
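The difference between whole-row matching and key matching can be seen on a small case. The sketch below uses hypothetical toy tibbles (source_a and source_b are not part of the course data) that share the same columns and the same keys but disagree on one value.

```r
library(dplyr)

# Hypothetical toy data: same columns, same years, one differing value in 2014
source_a <- tibble(year = c(2013L, 2014L), value = c(5.16, 5.06))
source_b <- tibble(year = c(2013L, 2014L), value = c(5.16, 9.99))

# intersect() compares entire rows, so only the 2013 row is shared
rows_in_common <- intersect(source_a, source_b)

# inner_join() matches on the key alone, so both years are kept,
# with the two value columns placed side by side
keys_matching <- inner_join(source_a, source_b, by = "year", suffix = c("_a", "_b"))

nrow(rows_in_common)  # 1
nrow(keys_matching)   # 2
```

Here intersect() returns a single row while inner_join() returns both years, since the join only requires the year entries to match.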