Joining Data with dplyr
Chris Cardillo
Data Scientist
sets %>%
inner_join(inventories, by = "set_num")
# A tibble: 5,056 x 6
set_num name year theme_id id version
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 700.3-1 Medium Gift Set (ABB) 1949 365 24197 1
2 700.3-1 Medium Gift Set (ABB) 1949 365 24214 2
3 700.3-1 Medium Gift Set (ABB) 1949 365 24215 3
4 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 11831 1
5 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24230 2
6 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24231 3
7 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24232 4
8 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24233 5
9 700.B.2-1 Single 1 x 2 x 3 Window without Glass (ABB) 1950 371 537 1
10 700.B.2-1 Single 1 x 2 x 3 Window without Glass (ABB) 1950 371 24240 2
# … with 5,046 more rows
themes
# A tibble: 665 x 3
id name parent_id
<dbl> <chr> <dbl>
1 1 Technic NA
2 2 Arctic Technic 1
3 3 Competition 1
4 4 Expert Builder 1
5 5 Model 1
6 6 Airport 5
7 7 Construction 5
8 8 Farm 5
9 9 Fire 5
10 10 Harbor 5
# … with 655 more rows
sets %>%
inner_join(inventories, by = "set_num") %>%
inner_join(themes, by = c("theme_id" = "id"))
# A tibble: 5,056 x 8
set_num name.x year theme_id id version name.y parent_id
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
1 700.3-1 Medium Gift Set (ABB) 1949 365 24197 1 System NA
2 700.3-1 Medium Gift Set (ABB) 1949 365 24214 2 System NA
3 700.3-1 Medium Gift Set (ABB) 1949 365 24215 3 System NA
4 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 11831 1 Supplemen… 365
5 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24230 2 Supplemen… 365
6 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24231 3 Supplemen… 365
7 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24232 4 Supplemen… 365
8 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24233 5 Supplemen… 365
9 700.B.2-1 Single 1 x 2 x 3 Window without … 1950 371 537 1 Supplemen… 365
10 700.B.2-1 Single 1 x 2 x 3 Window without … 1950 371 24240 2 Supplemen… 365
# … with 5,046 more rows
sets %>%
inner_join(inventories, by = "set_num") %>%
inner_join(themes, by = c("theme_id" = "id"), suffix = c("_set", "_theme"))
# A tibble: 5,056 x 8
set_num name_set year theme_id id version name_theme parent_id
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
1 700.3-1 Medium Gift Set (ABB) 1949 365 24197 1 System NA
2 700.3-1 Medium Gift Set (ABB) 1949 365 24214 2 System NA
3 700.3-1 Medium Gift Set (ABB) 1949 365 24215 3 System NA
4 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 11831 1 Supplement… 365
5 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24230 2 Supplement… 365
6 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24231 3 Supplement… 365
7 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24232 4 Supplement… 365
8 700.1.1-1 Single 2 x 4 Brick (ABB) 1950 371 24233 5 Supplement… 365
9 700.B.2-1 Single 1 x 2 x 3 Window without… 1950 371 537 1 Supplement… 365
10 700.B.2-1 Single 1 x 2 x 3 Window without… 1950 371 24240 2 Supplement… 365
# … with 5,046 more rows
Joining Data with dplyr