Pseudo-random number generation

Sampling in R

Richie Cotton

Data Evangelist at DataCamp

What does random mean?

{adjective} made, done, happening, or chosen without method or conscious decision.

  • Oxford Languages

True random numbers

  • Generated from physical processes, like flipping coins.
  • Hotbits uses radioactive decay.
  • RANDOM.ORG uses atmospheric noise.
    • Available in R via the random package.
  • True randomness is expensive.
1 https://www.fourmilab.ch/hotbits
2 https://www.random.org

Pseudo-random number generation

  • Next "random" number calculated from previous "random" number.
  • The first "random" number calculated from a seed.
  • If you start from the same seed value, all future random numbers will be the same.
seed <- 1
calc_next_random(seed)
3
calc_next_random(3)
2
calc_next_random(2)
6
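
The slide does not define calc_next_random(). As a minimal sketch, assume it multiplies the previous value by 3 and keeps the remainder modulo 7. This hypothetical rule happens to reproduce the values shown above (1 gives 3, 3 gives 2, 2 gives 6); it is not how R generates numbers.

```r
# Hypothetical definition (an assumption, not taken from the slides):
# multiply the previous value by 3 and keep the remainder modulo 7.
calc_next_random <- function(previous) {
  (3 * previous) %% 7
}

seed <- 1
first <- calc_next_random(seed)     # 3
second <- calc_next_random(first)   # 2
third <- calc_next_random(second)   # 6

# Hypothetical helper: feed each value back in to get the next one.
generate_sequence <- function(seed, n) {
  values <- numeric(n)
  current <- seed
  for (i in seq_len(n)) {
    current <- calc_next_random(current)
    values[i] <- current
  }
  values
}

# Starting from the same seed always replays the same sequence.
generate_sequence(1, 6)  # 3 2 6 4 5 1
```

This toy generator repeats after only six values. R's default generator, the Mersenne Twister, works on the same principle of deriving each value from internal state, but with a vastly longer period.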

Random number generating functions

function   distribution        function  distribution        function   distribution
rbeta      Beta                rgeom     Geometric           rsignrank  Wilcoxon signed rank
rbinom     Binomial            rhyper    Hypergeometric      rt         t
rcauchy    Cauchy              rlnorm    Lognormal           runif      Uniform
rchisq     Chi-squared         rlogis    Logistic            rweibull   Weibull
rexp       Exponential         rnbinom   Negative binomial   rwilcox    Wilcoxon rank sum
rf         F                   rnorm     Normal
rgamma     Gamma               rpois     Poisson
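
Each of these functions takes the number of values to draw as its first argument, followed by the parameters of the distribution. A short sketch with four of them; the seed and the parameter values are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed, chosen only so the example is repeatable

n_draws <- 4

# First argument: how many values to draw. The rest: distribution parameters.
uniform_draws  <- runif(n_draws, min = 0, max = 1)
normal_draws   <- rnorm(n_draws, mean = 10, sd = 2)
binomial_draws <- rbinom(n_draws, size = 20, prob = 0.5)
poisson_draws  <- rpois(n_draws, lambda = 3)

uniform_draws
normal_draws
binomial_draws
poisson_draws
```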

Visualizing random numbers

rbeta(5000, shape1 = 2, shape2 = 2)
[1] 0.2788 0.7495 0.6485 0.6665 0.6546 0.1575
...

[4996] 0.84719 0.35177 0.92796 0.67603 0.53960
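library(ggplot2)

# Store 5000 draws from a Beta(2, 2) distribution in a data frame
randoms <- data.frame(
  beta = rbeta(5000, shape1 = 2, shape2 = 2)
)
# Histogram of the draws, in bins of width 0.1
ggplot(randoms, aes(beta)) +
  geom_histogram(binwidth = 0.1)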

[Figure: histogram of the 5000 beta-distributed values, binwidth 0.1]

Random number seeds

set.seed(20000229)
rnorm(5)
-1.6538 -0.4028 -0.1654 -0.0734  0.5171
rnorm(5)
1.908  0.379 -1.499  1.625  0.693
set.seed(20000229)
rnorm(5)
-1.6538 -0.4028 -0.1654 -0.0734  0.5171
rnorm(5)
1.908  0.379 -1.499  1.625  0.693
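
Rather than comparing the printed values by eye, the two runs can be compared in code. A minimal sketch; the variable names are illustrative.

```r
# Draw ten values, reset the seed, and draw ten again.
set.seed(20000229)
first_run <- rnorm(10)

set.seed(20000229)
second_run <- rnorm(10)

# Resetting the seed restarts the sequence, so the runs match exactly.
identical(first_run, second_run)  # TRUE
```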

Using a different seed

set.seed(20000229)
rnorm(5)
-1.6538 -0.4028 -0.1654 -0.0734  0.5171
rnorm(5)
1.908  0.379 -1.499  1.625  0.693
set.seed(20041004)
rnorm(5)
-0.6547 -0.7854 -0.0152  0.1514  0.5285
rnorm(5)
0.748  0.974  0.174 -0.781 -0.930
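
To see the effect of the seed side by side, the draws for each seed can be collected into the columns of a matrix. A small sketch using the two seeds from the slide; the names seeds and draws are illustrative.

```r
seeds <- c(20000229, 20041004)

# One column of five normal draws per seed.
draws <- sapply(seeds, function(seed) {
  set.seed(seed)
  rnorm(5)
})
colnames(draws) <- as.character(seeds)

draws

# The two columns hold different values because the seeds differ.
identical(draws[, 1], draws[, 2])
```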

Let's practice!
