Joining Data with data.table in R
Scott Ritchie
Postdoctoral Researcher in Systems Genomics
Given two data.tables with the same columns:
fintersect(): what rows do these two data.tables share in common?
funion(): what is the unique set of rows across these two data.tables?
fsetdiff(): what rows are unique to this data.table?
Extract rows that are present in both data.tables
fintersect(dt1, dt2)
Duplicate rows are ignored by default:
fintersect(dt1, dt2)
all = TRUE: keep the number of copies present in both data.tables:
fintersect(dt1, dt2, all = TRUE)
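To make the effect of all = TRUE concrete, here is a minimal sketch. The contents of dt1 and dt2 (columns id and grp) are invented for illustration and are not taken from the course data.

```r
library(data.table)

# Hypothetical example tables with the same column names and types
dt1 <- data.table(id = c(1, 1, 1, 2, 3), grp = c("a", "a", "a", "b", "c"))
dt2 <- data.table(id = c(1, 1, 2, 4),    grp = c("a", "a", "b", "d"))

# Default: each shared row is returned once, however often it repeats
shared_unique <- fintersect(dt1, dt2)

# all = TRUE: a shared row is returned as many times as it occurs in
# BOTH tables, i.e. the smaller of its two counts
shared_all <- fintersect(dt1, dt2, all = TRUE)

print(shared_unique)  # rows (1, "a") and (2, "b")
print(shared_all)     # (1, "a") twice, (2, "b") once
```

The row (1, "a") occurs three times in dt1 but only twice in dt2, so only two copies count as present in both.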
Extract rows found exclusively in the first data.table
fsetdiff(dt1, dt2)
Duplicate rows are ignored by default:
fsetdiff(dt1, dt2)
all = TRUE: return all extra copies:
fsetdiff(dt1, dt2, all = TRUE)
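The same idea can be sketched for fsetdiff(). The tables below are hypothetical, chosen so that one row has more copies in dt1 than in dt2.

```r
library(data.table)

# Hypothetical example tables with the same column names and types
dt1 <- data.table(id = c(1, 1, 1, 2, 3), grp = c("a", "a", "a", "b", "c"))
dt2 <- data.table(id = c(1, 1, 2, 4),    grp = c("a", "a", "b", "d"))

# Default: unique rows of dt1 that do not appear in dt2 at all
diff_unique <- fsetdiff(dt1, dt2)

# all = TRUE: also keep the surplus copies of rows that dt1 holds
# more often than dt2
diff_all <- fsetdiff(dt1, dt2, all = TRUE)

# Argument order matters: this asks what is exclusive to dt2 instead
diff_reversed <- fsetdiff(dt2, dt1)

print(diff_unique)    # only (3, "c")
print(diff_all)       # one extra (1, "a") plus (3, "c")
print(diff_reversed)  # only (4, "d")
```

Unlike fintersect() and funion(), swapping the two arguments changes the result, because the function reports what is exclusive to the first table.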
Extract all rows found in either data.table:
funion(dt1, dt2)
Duplicate rows are ignored by default:
funion(dt1, dt2)
all = TRUE: return all rows:
funion(dt1, dt2, all = TRUE) # equivalent to rbind(dt1, dt2)
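A short sketch of both funion() modes, again using invented tables dt1 and dt2, including a check that all = TRUE matches plain concatenation.

```r
library(data.table)

# Hypothetical example tables with the same column names and types
dt1 <- data.table(id = c(1, 1, 1, 2, 3), grp = c("a", "a", "a", "b", "c"))
dt2 <- data.table(id = c(1, 1, 2, 4),    grp = c("a", "a", "b", "d"))

# Default: every distinct row from either table, returned once
union_unique <- funion(dt1, dt2)

# all = TRUE: every row from both tables, duplicates included
union_all <- funion(dt1, dt2, all = TRUE)

# Compare against simple concatenation with rbind()
same_as_rbind <- isTRUE(all.equal(union_all, rbind(dt1, dt2)))

print(union_unique)   # 4 distinct rows
print(nrow(union_all))  # 9, i.e. nrow(dt1) + nrow(dt2)
print(same_as_rbind)
```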
Two data.tables: funion() to concatenate unique rows
Three or more data.tables: concatenate using rbind() or rbindlist(), then handle duplicates with duplicated() and unique()
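For three or more tables, the approach could look like the following sketch. The third table dt3 and all table contents are hypothetical.

```r
library(data.table)

# Hypothetical example tables with the same column names and types
dt1 <- data.table(id = c(1, 1, 1, 2, 3), grp = c("a", "a", "a", "b", "c"))
dt2 <- data.table(id = c(1, 1, 2, 4),    grp = c("a", "a", "b", "d"))
dt3 <- data.table(id = c(2, 5),          grp = c("b", "e"))

# Step 1: concatenate all tables, keeping every row
combined <- rbindlist(list(dt1, dt2, dt3))

# Step 2: flag rows that repeat an earlier row (TRUE marks a duplicate)
dup_flags <- duplicated(combined)

# Step 3: keep only the first occurrence of each row
deduped <- unique(combined)

print(nrow(combined))  # 11 rows before removing duplicates
print(sum(dup_flags))  # 6 rows flagged as duplicates
print(deduped)         # 5 distinct rows
```

rbindlist() takes a list, so it scales to any number of tables, whereas funion() accepts exactly two.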