Joining three or more tables

Joining Data with dplyr

Chris Cardillo

Data Scientist

Joining sets and inventories

sets %>%
  inner_join(inventories, by = "set_num")
# A tibble: 5,056 x 6
   set_num   name                                         year theme_id    id version
   <chr>     <chr>                                       <dbl>    <dbl> <dbl>   <dbl>
 1 700.3-1   Medium Gift Set (ABB)                        1949      365 24197       1
 2 700.3-1   Medium Gift Set (ABB)                        1949      365 24214       2
 3 700.3-1   Medium Gift Set (ABB)                        1949      365 24215       3
 4 700.1.1-1 Single 2 x 4 Brick (ABB)                     1950      371 11831       1
 5 700.1.1-1 Single 2 x 4 Brick (ABB)                     1950      371 24230       2
 6 700.1.1-1 Single 2 x 4 Brick (ABB)                     1950      371 24231       3
 7 700.1.1-1 Single 2 x 4 Brick (ABB)                     1950      371 24232       4
 8 700.1.1-1 Single 2 x 4 Brick (ABB)                     1950      371 24233       5
 9 700.B.2-1 Single 1 x 2 x 3 Window without Glass (ABB)  1950      371   537       1
10 700.B.2-1 Single 1 x 2 x 3 Window without Glass (ABB)  1950      371 24240       2
# … with 5,046 more rows
Joining Data with dplyr

The themes table

themes
# A tibble: 665 x 3
      id name           parent_id
   <dbl> <chr>              <dbl>
 1     1 Technic               NA
 2     2 Arctic Technic         1
 3     3 Competition            1
 4     4 Expert Builder         1
 5     5 Model                  1
 6     6 Airport                5
 7     7 Construction           5
 8     8 Farm                   5
 9     9 Fire                   5
10    10 Harbor                 5
# … with 655 more rows
Joining Data with dplyr

Adding another join

sets %>%
  inner_join(inventories, by = "set_num") %>%
  inner_join(themes, by = c("theme_id" = "id"))
# A tibble: 5,056 x 8
   set_num   name.x                             year theme_id    id version name.y     parent_id
   <chr>     <chr>                             <dbl>    <dbl> <dbl>   <dbl> <chr>          <dbl>
 1 700.3-1   Medium Gift Set (ABB)              1949      365 24197       1 System            NA
 2 700.3-1   Medium Gift Set (ABB)              1949      365 24214       2 System            NA
 3 700.3-1   Medium Gift Set (ABB)              1949      365 24215       3 System            NA
 4 700.1.1-1 Single 2 x 4 Brick (ABB)           1950      371 11831       1 Supplemen…       365
 5 700.1.1-1 Single 2 x 4 Brick (ABB)           1950      371 24230       2 Supplemen…       365
 6 700.1.1-1 Single 2 x 4 Brick (ABB)           1950      371 24231       3 Supplemen…       365
 7 700.1.1-1 Single 2 x 4 Brick (ABB)           1950      371 24232       4 Supplemen…       365
 8 700.1.1-1 Single 2 x 4 Brick (ABB)           1950      371 24233       5 Supplemen…       365
 9 700.B.2-1 Single 1 x 2 x 3 Window without …  1950      371   537       1 Supplemen…       365
10 700.B.2-1 Single 1 x 2 x 3 Window without …  1950      371 24240       2 Supplemen…       365
# … with 5,046 more rows
Joining Data with dplyr

Recall: suffix

sets %>%
  inner_join(inventories, by = "set_num") %>% 
  inner_join(themes, by = c("theme_id" = "id"), suffix = c("_set", "_theme"))
# A tibble: 5,056 x 8
   set_num   name_set                          year theme_id    id version name_theme  parent_id
   <chr>     <chr>                            <dbl>    <dbl> <dbl>   <dbl> <chr>           <dbl>
 1 700.3-1   Medium Gift Set (ABB)             1949      365 24197       1 System             NA
 2 700.3-1   Medium Gift Set (ABB)             1949      365 24214       2 System             NA
 3 700.3-1   Medium Gift Set (ABB)             1949      365 24215       3 System             NA
 4 700.1.1-1 Single 2 x 4 Brick (ABB)          1950      371 11831       1 Supplement…       365
 5 700.1.1-1 Single 2 x 4 Brick (ABB)          1950      371 24230       2 Supplement…       365
 6 700.1.1-1 Single 2 x 4 Brick (ABB)          1950      371 24231       3 Supplement…       365
 7 700.1.1-1 Single 2 x 4 Brick (ABB)          1950      371 24232       4 Supplement…       365
 8 700.1.1-1 Single 2 x 4 Brick (ABB)          1950      371 24233       5 Supplement…       365
 9 700.B.2-1 Single 1 x 2 x 3 Window without…  1950      371   537       1 Supplement…       365
10 700.B.2-1 Single 1 x 2 x 3 Window without…  1950      371 24240       2 Supplement…       365
# … with 5,046 more rows
Joining Data with dplyr

LEGO-tables

Joining Data with dplyr

Let's practice!

Joining Data with dplyr

Preparing Video For Download...