Joining Data with data.table in R
Scott Ritchie
Postdoctoral Researcher in Systems Genomics
sales_wide <- dcast(sales_long, quarter ~ year, value.var = "amount")
The general form of dcast():
dcast(DT, ids ~ group, value.var = "values")
      |   |     |                   |
      |   |     |                   --> column to split
      |   |     ----------------------> group labels to split by
      |   ----------------------------> rows to keep behind as identifiers
      --------------------------------> data.table to reshape
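A minimal sketch of the general form in action. The contents of sales_long below are hypothetical (the slides do not show them); only the dcast() call itself comes from the slides.

```r
library(data.table)

# Hypothetical long-format data: one row per quarter per year
sales_long <- data.table(
  quarter = rep(c("Q1", "Q2", "Q3", "Q4"), times = 2),
  year    = rep(c(2015, 2016), each = 4),
  amount  = c(10, 20, 30, 40, 11, 21, 31, 41)
)

# quarter stays behind as the row identifier;
# each distinct year becomes its own column filled from amount
sales_wide <- dcast(sales_long, quarter ~ year, value.var = "amount")
print(sales_wide)
```

The result has one row per quarter and one column per year, with the new column names taken from the values of year.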
Split multiple value columns:
dcast(profit_long, quarter ~ year, value.var = c("revenue", "profit"))
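A small sketch of what passing several columns to value.var produces. The rows of profit_long are made up for illustration; the column names follow the call above.

```r
library(data.table)

# Hypothetical data with two value columns per quarter and year
profit_long <- data.table(
  quarter = rep(c("Q1", "Q2"), times = 2),
  year    = rep(c(2015, 2016), each = 2),
  revenue = c(100, 120, 110, 130),
  profit  = c(10, 12, 11, 13)
)

# Each value column is split by year, giving one column per
# value column / year combination, named <value>_<year>
profit_wide <- dcast(profit_long, quarter ~ year,
                     value.var = c("revenue", "profit"))
print(names(profit_wide))
```

Each output column is prefixed with the value column it came from, so revenue and profit for the same year stay distinguishable.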
Keep multiple columns as row identifiers:
dcast(sales_long, quarter + season ~ year, value.var = "amount")
Only columns included in the formula or value.var will be in the result:
sales_wide <- dcast(sales_long, quarter ~ year, value.var = "amount")
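To make the difference concrete, the sketch below runs both formulas on the same table. The data and the quarter-to-season mapping are hypothetical.

```r
library(data.table)

# Hypothetical data: season is fully determined by quarter here
sales_long <- data.table(
  quarter = rep(c("Q1", "Q2", "Q3", "Q4"), times = 2),
  season  = rep(c("Winter", "Spring", "Summer", "Autumn"), times = 2),
  year    = rep(c(2015, 2016), each = 4),
  amount  = c(10, 20, 30, 40, 11, 21, 31, 41)
)

# season appears in the formula, so it is kept as a row identifier
wide_ids <- dcast(sales_long, quarter + season ~ year, value.var = "amount")

# season is in neither the formula nor value.var, so it is dropped
wide_drop <- dcast(sales_long, quarter ~ year, value.var = "amount")

print(names(wide_ids))
print(names(wide_drop))
```

Dropping a column is only safe when the remaining identifiers still pick out one row per group; otherwise dcast() has to aggregate the duplicates.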
Split on multiple group columns:
dcast(sales_long, quarter ~ department + year, value.var = "amount")
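A sketch of splitting on two group columns at once. The department values and amounts are invented for illustration.

```r
library(data.table)

# Hypothetical data: every quarter / department / year combination once
sales_long <- data.table(
  quarter    = rep(c("Q1", "Q2"), each = 4),
  department = rep(rep(c("Online", "Retail"), each = 2), times = 2),
  year       = rep(c(2015, 2016), times = 4),
  amount     = 1:8
)

# One output column per department / year combination,
# named <department>_<year>
sales_wide <- dcast(sales_long, quarter ~ department + year,
                    value.var = "amount")
print(names(sales_wide))
```

The group labels are pasted together with an underscore to form the new column names.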
sales_wide <- dcast(sales_long, season ~ year, value.var = "amount")
sales_wide
   season    2015    2016
1: Autumn 3420000 3670000
2: Spring 2950000 3000300
3: Summer 2980700 3120200
4: Winter 3200100 3350000
as.matrix() can take one of the columns to use as the matrix rownames:
mat <- as.matrix(sales_wide, rownames = "season")
mat
          2015    2016
Autumn 3420000 3670000
Spring 2950000 3000300
Summer 2980700 3120200
Winter 3200100 3350000
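A self-contained sketch of the conversion, rebuilding sales_wide directly from the values printed above rather than from sales_long (which the slides do not show). The growth calculation at the end is an added illustration, not part of the slides.

```r
library(data.table)

# sales_wide rebuilt by hand from the printed output above
sales_wide <- data.table(
  season = c("Autumn", "Spring", "Summer", "Winter"),
  `2015` = c(3420000, 2950000, 2980700, 3200100),
  `2016` = c(3670000, 3000300, 3120200, 3350000)
)

# The season column becomes the rownames, so the matrix stays numeric
mat <- as.matrix(sales_wide, rownames = "season")

# Illustration: with a numeric matrix, column arithmetic works directly
growth <- mat[, "2016"] - mat[, "2015"]
print(growth)
```

Without the rownames argument, the character season column would be part of the matrix and force every cell to character.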