Joining Data with dplyr
Chris Cardillo
Data Scientist
answers
# A tibble: 380,643 x 4
         id creation_date question_id score
      <int> <date>              <int> <int>
 1 39143713 2016-08-25       39143518     3
 2 39143869 2016-08-25       39143518     1
 3 39143935 2016-08-25       39142481     0
 4 39144014 2016-08-25       39024390     0
 5 39144252 2016-08-25       39096741     6
 6 39144375 2016-08-25       39143885     5
 7 39144430 2016-08-25       39144077     0
 8 39144625 2016-08-25       39142728     1
 9 39144794 2016-08-25       39043648     0
10 39145033 2016-08-25       39133170     1
# … with 380,633 more rows
questions %>%
  inner_join(answers, by = c("id" = "question_id"))
# A tibble: 380,643 x 6
         id creation_date.x score.x     id.y creation_date.y score.y
      <int> <date>            <int>    <int> <date>            <int>
 1 22557677 2014-03-21            1 22560670 2014-03-21            2
 2 22557707 2014-03-21            2 22558516 2014-03-21            1
 3 22557707 2014-03-21            2 22558726 2014-03-21            4
 4 22558084 2014-03-21            2 22558085 2014-03-21            0
 5 22558084 2014-03-21            2 22606545 2014-03-24            1
 6 22558084 2014-03-21            2 22610396 2014-03-24            5
 7 22558084 2014-03-21            2 34374729 2015-12-19            0
 8 22558395 2014-03-21            2 22559327 2014-03-21            1
 9 22558395 2014-03-21            2 22560102 2014-03-21            2
10 22558395 2014-03-21            2 22560288 2014-03-21            2
# … with 380,633 more rows
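The join above matches `id` in `questions` to `question_id` in `answers`, and dplyr appends `.x` and `.y` to the columns that share a name. A minimal sketch of the same pattern on made-up data may help show what happens to the row count and column names. The tables `questions_toy` and `answers_toy` and their values are hypothetical, and the `suffix` argument is used here only to illustrate more readable names; it is not part of the original slide.

```r
library(dplyr)

# Hypothetical stand-ins for the questions and answers tables (values are made up).
questions_toy <- tibble(
  id    = c(1L, 2L, 3L),
  score = c(5L, 0L, 2L)
)
answers_toy <- tibble(
  id          = c(10L, 11L, 12L),
  question_id = c(1L, 1L, 3L),
  score       = c(4L, 1L, 7L)
)

# Match questions_toy$id to answers_toy$question_id.
# suffix replaces the default ".x" / ".y" on clashing column names.
joined <- questions_toy %>%
  inner_join(answers_toy,
             by = c("id" = "question_id"),
             suffix = c("_question", "_answer"))

joined
```

Question 1 has two answers, so it appears twice; question 2 has no answers, so an inner join drops it. The key column keeps the name `id` from the left table, while the answers table's own `id` picks up the right-hand suffix.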