Joining Data with data.table in R
Scott Ritchie
Postdoctoral Researcher in Systems Genomics
The concept of joins comes from database query languages (e.g. SQL).
Four standard joins: inner, full, left, and right.
All four can be done using merge()
Inner join: only keep observations that have information in both data.tables
merge(x = demographics, y = shipping,
      by.x = "name", by.y = "name")
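To see what the inner join keeps, here is a minimal sketch on made-up data. The rows in demographics and shipping below are hypothetical (the slides do not show the actual tables); only the merge() call itself comes from the slide.

```r
library(data.table)

# Hypothetical sample data, invented for illustration only
demographics <- data.table(name = c("Alice", "Bob", "Carla"),
                           age = c(31, 45, 27))
shipping <- data.table(name = c("Bob", "Carla", "Dmitri"),
                       address = c("12 High St", "9 Elm Rd", "3 Oak Ave"))

# Inner join: only names that appear in both tables are kept,
# so Alice (no shipping row) and Dmitri (no demographics row) are dropped
inner <- merge(x = demographics, y = shipping,
               by.x = "name", by.y = "name")
print(inner)
```

With this sample data the result has one row each for Bob and Carla, carrying both the age and address columns.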
Use the by argument to avoid repeated typing of the same column name
merge(x = demographics, y = shipping,
      by = "name")
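As a quick check that the two spellings are interchangeable when the key column has the same name in both tables, the sketch below compares their results. The sample rows are hypothetical, chosen only to make the example runnable.

```r
library(data.table)

# Hypothetical sample data, invented for illustration only
demographics <- data.table(name = c("Alice", "Bob", "Carla"),
                           age = c(31, 45, 27))
shipping <- data.table(name = c("Bob", "Carla", "Dmitri"),
                       address = c("12 High St", "9 Elm Rd", "3 Oak Ave"))

# Long form: name the key column separately for each table
long_form <- merge(x = demographics, y = shipping,
                   by.x = "name", by.y = "name")

# Short form: one by argument covers both tables
short_form <- merge(x = demographics, y = shipping,
                    by = "name")

# all.equal() returns TRUE when the two results match
same_result <- isTRUE(all.equal(long_form, short_form))
print(same_result)
```

by.x and by.y remain useful when the key column is named differently in each table.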
Full join: keep all observations that are in either data.table
merge(x = demographics, y = shipping,
      by = "name", all = TRUE)
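A minimal sketch of the full join on made-up data follows. The sample rows are hypothetical. The last two calls go slightly beyond the slide text: they use merge()'s all.x and all.y arguments, which keep all rows of x or of y respectively, to illustrate how left and right joins can be expressed.

```r
library(data.table)

# Hypothetical sample data, invented for illustration only
demographics <- data.table(name = c("Alice", "Bob", "Carla"),
                           age = c(31, 45, 27))
shipping <- data.table(name = c("Bob", "Carla", "Dmitri"),
                       address = c("12 High St", "9 Elm Rd", "3 Oak Ave"))

# Full join: every name from either table is kept;
# columns with no matching row are filled with NA
full <- merge(x = demographics, y = shipping,
              by = "name", all = TRUE)
print(full)

# Left join: keep every row of x (demographics)
left <- merge(x = demographics, y = shipping,
              by = "name", all.x = TRUE)

# Right join: keep every row of y (shipping)
right <- merge(x = demographics, y = shipping,
               by = "name", all.y = TRUE)
```

In this sample, Alice has no shipping row, so her address is NA in the full join, and Dmitri has no demographics row, so his age is NA.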