Writing Efficient R Code
Colin Gillespie
Jumping Rivers & Newcastle University
$$ 1, 2, 3, \ldots, n $$
1:n
seq(1, n)
seq(1, n, by = 1)
colon <- function(n) 1:n
colon(5)
# [1] 1 2 3 4 5
seq_default <- function(n) seq(1, n)
seq_by <- function(n) seq(1, n, by = 1)
system.time(colon(1e8))
# user system elapsed
# 0.032 0.028 0.060
system.time(seq_default(1e8))
# user system elapsed
# 0.060 0.028 0.086
system.time(seq_by(1e8))
# user system elapsed
# 1.088 0.520 1.600
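Before comparing timings, it is worth confirming that the three functions really return the same sequence. The following is a minimal sketch using a small n; it reuses the function definitions from above and only calls base R.

```r
colon <- function(n) 1:n
seq_default <- function(n) seq(1, n)
seq_by <- function(n) seq(1, n, by = 1)

# colon() and seq_default() return exactly the same object
same_default <- identical(colon(5), seq_default(5))

# seq_by() returns the same values, compared here with a numeric tolerance
same_by <- isTRUE(all.equal(colon(5), seq_by(5)))

# The storage types differ: integer for 1:n, double when by = 1 is supplied
types <- c(colon = typeof(colon(5)), seq_by = typeof(seq_by(5)))

print(same_default)
print(same_by)
print(types)
```

The values agree in every case, so the timings above compare like with like, even though the by = 1 form stores its result as double rather than integer.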
The trouble with
system.time(colon(1e8))
is that we haven't stored the result. We need to rerun the code to store the result:
res <- colon(1e8)
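The <- operator performs both argument passing and object assignment:
system.time(res <- colon(1e8))
The = operator performs only one of the two at a time. Inside a function call it is read as argument passing, so res is treated as an argument name: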
# Raises an error
system.time(res = colon(1e8))
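The difference between the two operators can be checked directly. This is a small sketch rather than part of the original slides: it uses 1e6 instead of 1e8 so it runs quickly, and the names timing, msg, timing2 and res2 are illustrative.

```r
colon <- function(n) 1:n

# `<-` inside the call assigns `res` and passes the expression to system.time()
timing <- system.time(res <- colon(1e6))
print(length(res))

# `=` inside a call names an argument; system.time() has no argument called `res`,
# so the call fails and we capture the error message instead
msg <- tryCatch(
  system.time(res = colon(1e6)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Wrapping the expression in braces makes `=` an ordinary assignment again
timing2 <- system.time({res2 = colon(1e6)})
print(length(res2))
```

Using <- avoids the ambiguity altogether, which is why the timed assignment above is written that way.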
Method | Absolute time (secs) | Relative time |
---|---|---|
colon(n) | 0.060 | $0.060/0.060 = 1.00$ |
seq_default(n) | 0.086 | $0.086/0.060 = 1.43$ |
seq_by(n) | 1.600 | $1.600/0.060 = 26.7$ |
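The relative times are each elapsed time divided by the fastest one. As a sketch, the same arithmetic can be done in R on the elapsed values reported by system.time() above; the vector name elapsed is illustrative.

```r
# Elapsed times (seconds) taken from the system.time() output above
elapsed <- c(colon = 0.060, seq_default = 0.086, seq_by = 1.600)

# Divide every time by the fastest one, so the fastest method scores 1
relative <- elapsed / min(elapsed)

print(round(relative, 2))
```

Dividing by min(elapsed) rather than a hard-coded 0.060 keeps the calculation correct if the timings are rerun and a different method turns out to be fastest.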
library("microbenchmark")
n <- 1e8
microbenchmark(colon(n), seq_default(n), seq_by(n), times = 10) # Run each function 10 times
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# colon(n) 59 130 220 202 341 391 10 a
# seq_default(n) 94 204 290 337 348 383 10 a
# seq_by(n) 1945 2044 2260 2275 2359 2787 10 b
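The benchmark result can also be stored and summarised programmatically. The sketch below assumes the microbenchmark package is installed (it is skipped otherwise), uses a much smaller n so it runs quickly, and computes relative median times; the names s and rel_median are illustrative.

```r
colon <- function(n) 1:n
seq_by <- function(n) seq(1, n, by = 1)

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  n <- 1e5  # far smaller than the 1e8 used above, purely to keep the sketch fast
  res <- microbenchmark::microbenchmark(colon(n), seq_by(n), times = 10)

  # summary() returns a data frame with one row per benchmarked expression
  s <- summary(res)

  # Median time of each expression relative to the fastest median
  rel_median <- s$median / min(s$median)
  names(rel_median) <- as.character(s$expr)
  print(rel_median)
}
```

Medians are used here because, as the spread between min and max in the output above suggests, individual runs can vary considerably.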