Scalable Data Processing in R
Michael Kane
Assistant Professor, Yale University
big.matrix objects
biganalytics
bigtabulate
bigalgebra
bigpca
bigFastlm
biglasso
bigrf
library(bigtabulate)
# How many samples do we have per year?
bigtable(mort, "year")
 2008  2009  2010  2011  2012  2013  2014  2015
 8468 11101  8836  7996 10935 10216  5714  6734
# Create nested tables
bigtable(mort, c("msa", "year"))
   2008 2009 2010 2011 2012 2013 2014 2015
0  1064 1343  998  851 1066 1005  504  564
1  7404 9758 7838 7145 9869 9211 5210 6170
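To see what these tables are counting, the same one-way and two-way tabulations can be sketched with base R's table() on a small in-memory matrix. The toy matrix below is hypothetical and its values are made up for illustration; it only stands in for mort, and this is not how bigtable is implemented.

```r
# Hypothetical toy data standing in for the mort big.matrix (made-up values).
toy <- cbind(
  msa  = c(0, 1, 1, 0, 1, 1),
  year = c(2008, 2008, 2009, 2009, 2009, 2008)
)

# One-way counts by year, analogous to bigtable(mort, "year")
by_year <- table(toy[, "year"])

# Two-way counts by msa and year, analogous to bigtable(mort, c("msa", "year"))
by_msa_year <- table(msa = toy[, "msa"], year = toy[, "year"])

print(by_year)
print(by_msa_year)
```

Rows of the two-way table are the msa levels and columns are the years, matching the layout of the nested bigtable output above.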