Scalable Data Processing in R
Michael J. Kane and Simon Urbanek
Instructors, DataCamp
All R objects are stored in RAM
"R is not well-suited for working with data larger than 10-20% of a computer's RAM." - The R Installation and Administration Manual
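To get a feel for how much RAM an object takes up, base R's object.size() gives an estimate. The sketch below assumes a numeric vector of one million values, a length picked only for illustration. Each double takes 8 bytes, so the vector should need roughly 8 million bytes plus a small header.

```r
# One million doubles; the length is an arbitrary choice for illustration.
x <- numeric(1e6)

# object.size() estimates the memory allocated to an R object, in bytes.
size_bytes <- as.numeric(object.size(x))

# Show the estimate in megabytes.
print(format(object.size(x), units = "Mb"))
```

Comparing such estimates against the machine's installed RAM is one way to judge whether a data set falls within the range the quote describes.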
Complexity of calculations
Carefully consider disk operations to write fast, scalable code
library(microbenchmark)
microbenchmark( rnorm(100), rnorm(10000) )
Unit: microseconds
         expr    min      lq     mean  median      uq     max neval
   rnorm(100)   7.84   8.440   9.5459   8.773   9.355   29.56   100
 rnorm(10000) 679.51 683.706 755.5693 690.876 712.416 2949.03   100
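The benchmark above times a purely in-memory calculation. The point about disk operations can be explored in a similar way. The following is a minimal sketch using only base R: it assumes a vector of 100,000 random values and a temporary .rds file, both hypothetical choices for illustration, and uses system.time() rather than microbenchmark to avoid an extra dependency. Timings depend on the machine and storage, so no output is shown.

```r
# Hypothetical example data: one hundred thousand random values.
x <- rnorm(1e5)

# Use a temporary file so nothing is left behind afterwards.
path <- tempfile(fileext = ".rds")

# Time a calculation that stays entirely in memory.
t_mem <- system.time(m <- mean(x))

# Time a round trip through disk: write the vector, then read it back.
t_disk <- system.time({
  saveRDS(x, path)
  y <- readRDS(path)
})

# Elapsed wall-clock time in seconds for each step.
print(t_mem["elapsed"])
print(t_disk["elapsed"])

# Clean up the temporary file.
unlink(path)
```

On many systems the disk round trip takes noticeably longer than the in-memory step, which is why the number of reads and writes deserves attention when code has to scale.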