Joining Data with data.table in R
Scott Ritchie
Postdoctoral Researcher in Systems Genomics
Chapter 1: Joining data with merge()
Chapter 2: Joins in the data.table workflow
Chapter 3: Troubleshooting joins
Chapter 4: Concatenating and reshaping data.tables
Columns that link information across two tables
library(data.table)
demographics <- data.table(name = c("Trey", "Matthew", "Angela"),
                           gender = c(NA, "M", "F"),
                           age = c(54, 43, 39))
shipping <- data.table(name = c("Matthew", "Trey", "Angela"),
                       address = c("7 Mill road", "12 High street",
                                   "33 Pacific boulevard"))
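As a minimal sketch of how one might spot the linking column, the snippet below looks for column names that appear in both tables and then checks that their values line up. It rebuilds the two example tables so it runs on its own. The variable name `common_cols` is only illustrative, and `intersect()` and `%in%` are base R helpers rather than anything specific to data.table.

```r
library(data.table)

# Same example tables as above, repeated so the sketch is self-contained
demographics <- data.table(name = c("Trey", "Matthew", "Angela"),
                           gender = c(NA, "M", "F"),
                           age = c(54, 43, 39))
shipping <- data.table(name = c("Matthew", "Trey", "Angela"),
                       address = c("7 Mill road", "12 High street",
                                   "33 Pacific boulevard"))

# Column names present in both tables are candidate linking columns
common_cols <- intersect(names(demographics), names(shipping))
common_cols

# Check whether every name in shipping also appears in demographics
all(shipping$name %in% demographics$name)
```

Here the only shared column is `name`, and every value of `shipping$name` also appears in `demographics$name`, even though the rows are in a different order.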
The tables() function shows all data.tables loaded in your R session:
tables()
           NAME NROW NCOL MB            COLS KEY
1: demographics    3    3  0 name,gender,age
2:     shipping    3    2  0    name,address
Total: 0MB
The str() function shows the type of each column in a single data.table:
str(demographics)
Classes ‘data.table’ and 'data.frame': 3 obs. of 3 variables:
$ name : chr "Trey" "Matthew" "Angela"
$ gender: chr NA "M" "F"
$ age : num 54 43 39
- attr(*, ".internal.selfref")=<externalptr>
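When more than one table is involved, it can help to collect the column types programmatically instead of reading each str() printout. The following is a rough sketch under that assumption: it uses base R's `sapply()` and `class()`, and the name `col_types` is purely illustrative. The tables are rebuilt so the snippet runs on its own.

```r
library(data.table)

demographics <- data.table(name = c("Trey", "Matthew", "Angela"),
                           gender = c(NA, "M", "F"),
                           age = c(54, 43, 39))
shipping <- data.table(name = c("Matthew", "Trey", "Angela"),
                       address = c("7 Mill road", "12 High street",
                                   "33 Pacific boulevard"))

# Named character vector giving the class of each column
col_types <- sapply(demographics, class)
col_types

# Compare the type of the shared column across the two tables
class(demographics$name) == class(shipping$name)
```

In this example `name` is a character column in both tables, matching the `chr` shown by str().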
demographics_all
         name sex age
  1:     Trey  NA  54
  2:  Matthew   M  43
  3:   Angela   F  39
  4: Michelle   F  63
  5:  Mohamed   M  26
 ---
102:  Patrick   M  27
103:      Wei   F  41
104:     Adam   M  33
105:  Somchai   M  53
106:     Alma   F  19