Limitations of bigmemory

Scalable Data Processing in R

Michael Kane

Assistant Professor, Yale University

Where can you use bigmemory?

  • You can use bigmemory when your data are
    • matrices
    • dense
    • numeric
  • The underlying data structures are compatible with low-level linear algebra libraries for fast model fitting (a short example of creating one appears after this list)
  • If you have different column types, you could try the ff package
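A minimal sketch (not from the course itself) of creating a dense, numeric, file-backed big.matrix; the file names x.bin and x.desc are illustrative placeholders.

    # Minimal sketch: a dense, numeric, file-backed big.matrix.
    # File names are illustrative placeholders.
    library(bigmemory)

    x <- filebacked.big.matrix(nrow = 1e6, ncol = 3, type = "double",
                               init = 0,
                               backingfile = "x.bin",
                               descriptorfile = "x.desc")

    # Indexing works like a regular numeric matrix:
    x[1, ] <- c(1, 2, 3)
    x[1, ]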

Understanding disk access

A big.matrix is a data structure designed for random access: any element can be read or written directly, without scanning the backing file from the beginning (see the sketch below)
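As a hedged illustration, assuming the descriptor file x.desc from the earlier sketch, random access means scattered rows can be pulled directly from disk in any order without reading the whole file.

    # Attach the existing file-backed matrix by its descriptor file
    library(bigmemory)
    x <- attach.big.matrix("x.desc")

    # Random access: read scattered rows directly, in any order,
    # without loading the full matrix into RAM
    x[c(999999, 10, 250000), 2]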


Disadvantages of random access

  • You can't add rows or columns to an existing big.matrix object (a workaround is sketched below)
  • You need enough disk space to hold the entire matrix in a single contiguous block
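One possible workaround, offered here as an assumption rather than a bigmemory feature, is to allocate a new, larger file-backed matrix and copy the existing values into it; the new file names are placeholders.

    # Sketch: "grow" a big.matrix by creating a larger one and copying
    library(bigmemory)

    old <- attach.big.matrix("x.desc")

    bigger <- filebacked.big.matrix(nrow = nrow(old), ncol = ncol(old) + 1,
                                    type = "double", init = 0,
                                    backingfile = "x_bigger.bin",
                                    descriptorfile = "x_bigger.desc")

    # old[, ] materializes the data in RAM; for truly large matrices,
    # copy in row blocks instead of all at once
    bigger[, 1:ncol(old)] <- old[, ]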

Let's practice!

