Joining Data with data.table in R
Scott Ritchie
Postdoctoral Researcher in Systems Genomics
What happens when you don't use the correct columns for join keys?
data.table
Using join key columns with different types will error
customers[web_visits, on = .(age = name)]
Error in bmerge(i, x, leftcols, rightcols, io, xo, roll, rollends,
nomatch, :
typeof x.age (double) != typeof i.name (character)
customers[web_visits, on = .(id)]
Error in bmerge(i, x, leftcols, rightcols, io, xo, roll, rollends,
nomatch, :
typeof x.id (integer) != typeof i.id (character)
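One way to avoid this error is to compare the key column types before joining and convert one side so they agree. The sketch below uses made-up toy tables: the column names follow the slides, but the values are assumptions for illustration only.

```r
library(data.table)

# Hypothetical toy tables; the values are invented for illustration
customers  <- data.table(id = 1:3, name = c("Mark", "Matt", "Angela"))
web_visits <- data.table(id = c("1", "3"), duration = c(12.5, 3))

# Inspect the type of the intended key column in each table
class(customers$id)   # integer
class(web_visits$id)  # character

# Convert one side so the key types match, then join
web_visits[, id := as.integer(id)]
result <- customers[web_visits, on = .(id)]
```

Whether to convert the character column to integer or the other way round depends on what the column represents; converting to integer here assumes every id in web_visits is a valid whole number.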
merge(customers, web_visits, by.x = "address", by.y = "name", all = TRUE)
customers[web_visits, on = .(address = name)]
customers[web_visits, on = .(address = name), nomatch = 0]
customers[web_visits, on = .(age = duration), nomatch = 0]
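When the wrong key columns happen to share a type, no error is raised, so the mistake only shows up in the result. The sketch below illustrates this with small invented tables (the values are assumptions, chosen so that one age coincidentally equals one duration).

```r
library(data.table)

# Hypothetical toy tables; the values are invented for illustration
customers  <- data.table(name    = c("Mark", "Matt"),
                         address = c("1 High St", "2 Low Rd"),
                         age     = c(54, 43))
web_visits <- data.table(name     = c("Mark", "Matt"),
                         duration = c(43, 10))

# Both columns are character but unrelated: the join runs, nothing matches
right <- customers[web_visits, on = .(address = name)]
inner <- customers[web_visits, on = .(address = name), nomatch = 0]

# Both columns are double but unrelated: one value matches by coincidence,
# attaching Matt's customer record to Mark's web visit
bogus <- customers[web_visits, on = .(age = duration), nomatch = 0]
```

Here the right join keeps both web_visits rows with missing customer details, the inner join returns no rows, and the last join returns a single misleading row.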
Learning what each column represents before you join will help you avoid errors
merge(customers, web_visits, by.x = "name", by.y = "person")
customers[web_visits, on = .(name = person)]
customers[web_visits, on = c("name" = "person")]
key <- c("name" = "person")
customers[web_visits, on = key]
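The forms above can be compared side by side on a small example. The tables below are invented for illustration; only the column names name and person come from the slides.

```r
library(data.table)

# Hypothetical toy tables; the values are invented for illustration
customers  <- data.table(name = c("Mark", "Matt"), age = c(54, 43))
web_visits <- data.table(person   = c("Matt", "Mark", "Matt"),
                         duration = c(10, 5, 7))

# merge(): inner join by default; the key column keeps the name from by.x
via_merge <- merge(customers, web_visits, by.x = "name", by.y = "person")

# data.table syntax: list form and named character vector form
via_list <- customers[web_visits, on = .(name = person)]
via_char <- customers[web_visits, on = c("name" = "person")]

# The two on = forms should describe the same join
same <- isTRUE(all.equal(via_list, via_char))
```

In both on = forms the left-hand name refers to the column in customers and the right-hand name to the column in web_visits, and the key column in the result takes the name from customers.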
merge(purchases, web_visits, by = c("name", "date"))
merge(purchases, web_visits,
      by.x = c("name", "date"),
      by.y = c("person", "date"))
purchases[web_visits, on = .(name, date)]
purchases[web_visits, on = c("name", "date")]
purchases[web_visits, on = .(name = person, date)]
purchases[web_visits, on = c("name" = "person", "date")]
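A join on several keys matches rows only when every key agrees. The following sketch uses invented purchases and web_visits tables (all values are assumptions) where the person column has a different name in each table and date has the same name in both.

```r
library(data.table)

# Hypothetical toy tables; the values are invented for illustration
purchases  <- data.table(name  = c("Mark", "Mark", "Matt"),
                         date  = as.IDate(c("2018-01-01", "2018-01-02", "2018-01-01")),
                         sales = c(1, 3, 2))
web_visits <- data.table(person   = c("Mark", "Matt"),
                         date     = as.IDate(c("2018-01-02", "2018-01-01")),
                         duration = c(10, 5))

# Match on both person and date; date needs no renaming
joined <- purchases[web_visits, on = .(name = person, date)]

# Equivalent inner join with merge()
merged <- merge(purchases, web_visits,
                by.x = c("name", "date"),
                by.y = c("person", "date"))
```

Mark's purchase on 2018-01-01 has no web visit on the same date, so it does not appear in either result.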