Welcome to the course

Optimizing R Code with Rcpp

Romain François

Consulting Datactive, ThinkR

R vs C++

R

Great, flexible
Slow
Interpreted

C++

Compiled
More difficult
Fast

Motivation

Use Rcpp to make your code faster
No need to know all of C++
Focus on writing simple C++ functions

Course structure

Introduction - basic C++ syntax
C++ functions and control flow
Vector classes
Case studies

Measure performance

Loopy version of max

# Scan x once, keeping the largest value seen so far
slowmax <- function(x) {
  res <- x[1]
  for (i in 2:length(x)) {
    if (x[i] > res) res <- x[i]
  }
  res
}
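For comparison, the same loop can be sketched in plain C++. This is only an illustration: the name slow_max_cpp is hypothetical, and the function takes a std::vector<double> from the standard library rather than one of the Rcpp vector classes covered later in the course.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C++ counterpart of slowmax(): scan the vector once,
// keeping the largest value seen so far.
// Assumes x is non-empty.
double slow_max_cpp(const std::vector<double>& x) {
    double res = x[0];  // C++ indexes from 0, R from 1
    for (std::size_t i = 1; i < x.size(); ++i) {
        if (x[i] > res) res = x[i];
    }
    return res;
}
```

The structure matches the R function line for line; the visible differences are the declared types and the zero-based index.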

Comparing performance with microbenchmark

library(microbenchmark)

x <- rnorm(1e6)
microbenchmark( slowmax(x), max(x) )

Unit: milliseconds
       expr       min       lq      mean    median        uq       max neval
 slowmax(x) 31.649452 34.29454 36.344912 35.435299 37.188249 90.363038   100
     max(x)  1.563559  1.74036  1.939367  1.847045  2.014684  3.340052   100

Writing Efficient R Code

Evaluating simple C++ expressions with evalCpp

library(Rcpp)
evalCpp( "40 + 2" )

42

evalCpp( "exp(1.0)" )

2.718282

evalCpp( "sqrt(4.0)" )

2

Using std::numeric_limits<int>::max() to get the largest representable 32-bit signed integer (int)

evalCpp(
   "std::numeric_limits<int>::max()"
   )

2147483647

($2147483647 = 2^{31} - 1$)
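The link between this limit and the bit width can be checked directly in C++. A minimal sketch, assuming a platform where int is 32 bits wide (the usual case, but not something the C++ standard guarantees); the function names largest_int and two_pow_31_minus_1 are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Largest value an int can hold on this platform.
int largest_int() {
    return std::numeric_limits<int>::max();
}

// 2^31 - 1, computed in 64-bit arithmetic so the shift cannot overflow.
std::int64_t two_pow_31_minus_1() {
    return (std::int64_t{1} << 31) - 1;
}
```

With a 32-bit int, both functions return the same number: one bit holds the sign, which leaves 31 bits for the magnitude.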

Basic number types

C++ has a rich set of number types

Integer numbers: int
Floating point numbers: double

# Literal numbers are double
x <- 42
storage.mode(x)

"double"

# Integers need the L suffix
y <- 42L
storage.mode(y)

"integer"

z <- as.integer(42)
storage.mode(z)

"integer"

C++

library(Rcpp)
# Literal integers are int
x <- evalCpp( "42" )
storage.mode(x)

"integer"

# Suffix .0 forces a double
y <- evalCpp( "42.0" )
storage.mode(y)

"double"
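The same distinction can be verified at compile time, without going through R. The sketch below uses decltype and std::is_same from the standard library; the helper names int_literal and double_literal are hypothetical and only mirror the two evalCpp() calls above.

```cpp
#include <type_traits>

// decltype reports the static type of an expression.
// A literal without a decimal point is an int ...
static_assert(std::is_same<decltype(42), int>::value, "42 is an int");
// ... and a literal with a decimal point is a double.
static_assert(std::is_same<decltype(42.0), double>::value, "42.0 is a double");

// Hypothetical helpers returning each kind of literal.
int    int_literal()    { return 42; }
double double_literal() { return 42.0; }
```

If either static_assert were false, the file would fail to compile, so the types are checked before any code runs.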

Casting

Explicit casting with (double)

# Explicit cast
y <- evalCpp("(double)(40 + 2)")
storage.mode(y)

"double"

Beware of integer division

# Integer division
evalCpp( "13 / 4" )

3

# Explicit cast, and hence use 
# of double division
evalCpp( "(double)13 / 4" )

3.25

# Automatic conversion in R
13L / 4L

3.25
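The division rules can also be written out as small C++ functions. This is a sketch rather than part of the course material: the function names are hypothetical, and static_cast is shown only as the more explicit C++ spelling of the (double) cast used above.

```cpp
// Both operands are int, so the result is truncated toward zero.
int int_division() {
    return 13 / 4;
}

// C-style cast, as in the slide: 13 becomes 13.0 before dividing.
double c_style_cast_division() {
    return (double)13 / 4;
}

// static_cast performs the same conversion with a more explicit syntax.
double static_cast_division() {
    return static_cast<double>(13) / 4;
}

// Casting after the division is too late: 13 / 4 has already been truncated.
double cast_too_late() {
    return (double)(13 / 4);
}
```

The last function shows why the placement of the cast matters: the cast has to apply to an operand, not to the result of the integer division.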

Let's practice!
