Scalable Data Processing in R
Simon Urbanek
Member of R-Core, Lead Inventive Scientist, AT&T Labs Research
A file-backed big.matrix is stored on disk.
The assignment below creates a copy of a and assigns it to b.
a <- 42
b <- a
a
42
b
42
a <- 43
a
43
b
42
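The same copy behaviour applies to vectors, not only to single numbers. A minimal sketch, where v and w are illustrative names that do not appear in the slides:

```r
# v and w are assumed example names; any atomic vector behaves the same way.
v <- c(1, 2, 3)
w <- v          # w now holds the same values as v
w[2] <- 20      # modifying w does not affect v
v               # still 1 2 3
w               # 1 20 3
```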
a <- 42
foo <- function(a) {
  a <- 43
  paste("Inside the function a is", a)
}
foo(a)
"Inside the function a is 43"
paste("Outside the function a is still", a)
"Outside the function a is still 42"
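Because the function works on its own copy, the usual way to update the caller's value is to return the result and reassign it. A small sketch under that assumption; bump() and a_after_call are hypothetical names, not part of the slides:

```r
# bump() is a hypothetical helper used only for illustration.
bump <- function(a) {
  a <- a + 1   # changes the function's local a only
  a            # return the new value to the caller
}

a <- 42
bump(a)             # result is discarded, so the global a is unchanged
a_after_call <- a   # 42
a <- bump(a)        # explicit reassignment updates the global a
a                   # 43
```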
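This function does change a value outside itself: a is an environment, and environments are modified in place rather than copied, so the change to a$val is visible in the global environment.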
foo <- function(a) {
  a$val <- 43
  paste("Inside the function a is", a$val)
}
a <- environment()
a$val <- 42
foo(a)
"Inside the function a is 43"
paste("Outside the function a$val is", a$val)
"Outside the function a$val is 43"
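The same reference behaviour can be seen with a freshly created environment instead of the current one. This is only a sketch: counter, increment() and alias are hypothetical names, and new.env() is used here in place of the environment() call from the slide.

```r
# counter, increment() and alias are illustrative names, not from the slides.
counter <- new.env()
counter$val <- 0

increment <- function(env) {
  env$val <- env$val + 1   # modifies the caller's environment in place
  invisible(NULL)
}

increment(counter)
increment(counter)
counter$val                # 2

# Assigning an environment to another name does not copy it.
alias <- counter
alias$val <- 100
counter$val                # 100
```

Both names point at the same environment, which is the behaviour a big.matrix mimics.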
# x is a big matrix
x <- big.matrix(...)
# x_no_copy and x refer to the same object
x_no_copy <- x
# x_copy and x refer to different objects
x_copy <- deepcopy(x)
R won't make copies of a big.matrix implicitly; use deepcopy() when you need an independent copy.
library(bigmemory)
x <- big.matrix(nrow = 1, ncol = 3, type = "double",
                init = 0,
                backingfile = "hello-bigmemory.bin",
                descriptorfile = "hello-bigmemory.desc")
x_no_copy <- x
x[,]
0 0 0
x_no_copy[,]
0 0 0
x[,] <- 1
x[,]
1 1 1
x_no_copy[,]
1 1 1
x_copy <- deepcopy(x)
x[,]
1 1 1
x_copy[,]
1 1 1
x[,] <- 2
x[,]
2 2 2
x_copy[,]
1 1 1