Fraud Detection in R
Bart Baesens
Professor of Data Science at KU Leuven
A dataset satisfies Benford's Law for the first digit if the probability that the first digit $D_1$ equals $d_1$ is approximately: $$P(D_1=d_1)=\log_{10}(d_1+1)-\log_{10}(d_1)=\log_{10}\left(1+\frac{1}{d_1}\right) \qquad d_1=1,\ldots,9$$
Examples
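Pinkham showed that Benford's law is scale invariant: if a dataset follows the law, it still does after every value is multiplied by the same positive constant (for example, a change of units or currency).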
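Scale invariance can be illustrated with a small simulation. The sketch below is not from the original slides: it assumes simulated data drawn log-uniformly over six decades (which follows Benford's law by construction), a hypothetical scaling constant of 0.87, and a helper `first_digit()` introduced here only for illustration.

```r
set.seed(1)

# Hypothetical helper: leading significant digit of positive numbers,
# read from the scientific-notation string (e.g. "3.14...e+02" -> 3)
first_digit <- function(x) as.integer(substr(sprintf("%.15e", x), 1, 1))

# Benford probabilities for digits 1..9
benford_p <- log10(1 + 1 / (1:9))

# Simulated amounts, log-uniform on [1, 10^6) (an assumption for this demo)
x <- 10^runif(1e5, min = 0, max = 6)

# Observed first-digit proportions before and after rescaling by 0.87
prop_orig   <- as.numeric(table(factor(first_digit(x), levels = 1:9))) / length(x)
prop_scaled <- as.numeric(table(factor(first_digit(0.87 * x), levels = 1:9))) / length(x)

round(cbind(digit = 1:9, benford = benford_p,
            original = prop_orig, scaled = prop_scaled), 3)
```

Both observed columns should sit close to the Benford column, which is what scale invariance predicts.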
benlaw <- function(d) log10(1 + 1 / d)
benlaw(1)
[1] 0.30103
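Because `benlaw()` is vectorised, the whole first-digit distribution follows from one call. The short sketch below (an addition, not from the slides) evaluates it for all nine digits and checks that the probabilities sum to one, which holds because the sum of the logarithms telescopes to the base-10 logarithm of 10.

```r
benlaw <- function(d) log10(1 + 1 / d)

# Expected first-digit probabilities for d = 1, ..., 9
probs <- benlaw(1:9)
names(probs) <- 1:9
round(probs, 3)

# The nine probabilities form a proper distribution
all.equal(sum(probs), 1)
```

The probabilities decrease steadily from about 0.301 for digit 1 to about 0.046 for digit 9.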
We generate the first 1000 Fibonacci numbers.
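fibnum <- numeric(1000)        # preallocate the result vector
fibnum[1] <- fibnum[2] <- 1    # the first two Fibonacci numbers
for (i in 3:1000) {
  fibnum[i] <- fibnum[i - 1] + fibnum[i - 2]   # each term is the sum of the previous two
}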
head(fibnum)
[1] 1 1 2 3 5 8
We also generate the first 1000 powers of 2.
pow2 <- 2^(1:1000)
head(pow2)
[1]  2  4  8 16 32 64
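Both sequences grow far beyond 2^53, the limit up to which R's double-precision numbers represent every integer exactly. Powers of 2 remain exact in this format, but the later Fibonacci terms are rounded approximations. The check below, added here as a cautious aside, confirms that all 1000 terms are still finite. That the leading digits stay reliable at this precision is an assumption, not something the slides state.

```r
fibnum <- numeric(1000)
fibnum[1] <- fibnum[2] <- 1
for (i in 3:1000) fibnum[i] <- fibnum[i - 1] + fibnum[i - 2]
pow2 <- 2^(1:1000)

# Up to this value every integer is exactly representable as a double
2^53

# All terms stay below the largest finite double (about 1.8e308)
all(is.finite(fibnum))
all(is.finite(pow2))
.Machine$double.xmax

# Going slightly further would overflow: 2^1024 evaluates to Inf
is.finite(2^1024)
```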
library(benford.analysis)
# Benford analysis of the Fibonacci numbers, first digit only
bfd.fib <- benford(fibnum, number.of.digits = 1)
plot(bfd.fib)
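As a cross-check on the plot, the first-digit frequencies can also be tabulated in base R, without `benford.analysis`. This is a minimal sketch: the helper `first_digit()` is hypothetical (not part of any package) and reads the leading digit from the scientific-notation string of each number.

```r
fibnum <- numeric(1000)
fibnum[1] <- fibnum[2] <- 1
for (i in 3:1000) fibnum[i] <- fibnum[i - 1] + fibnum[i - 2]

# Hypothetical helper: leading significant digit of positive numbers
first_digit <- function(x) as.integer(substr(sprintf("%.15e", x), 1, 1))

# Observed counts per first digit versus the counts Benford's law predicts
observed <- as.numeric(table(factor(first_digit(fibnum), levels = 1:9)))
expected <- length(fibnum) * log10(1 + 1 / (1:9))

data.frame(digit = 1:9, observed = observed, expected = round(expected, 1))
```

The observed and expected columns should differ by only a few counts per digit.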
# Benford analysis of the powers of 2, first digit only
bfd.pow2 <- benford(pow2, number.of.digits = 1)
plot(bfd.pow2)
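The visual agreement can be complemented by a numerical summary. The sketch below uses base R's `chisq.test()` as a goodness-of-fit measure against the Benford probabilities. It is an illustrative addition rather than the course's method, and `first_digit()` is again a hypothetical helper. Since the powers of 2 are a deterministic sequence and not a random sample, the p-value is best read as a descriptive indicator of closeness.

```r
pow2 <- 2^(1:1000)

# Hypothetical helper: leading significant digit of positive numbers
first_digit <- function(x) as.integer(substr(sprintf("%.15e", x), 1, 1))

# Observed first-digit counts (digits with no occurrences are kept as 0)
counts <- as.numeric(table(factor(first_digit(pow2), levels = 1:9)))

# Chi-square goodness-of-fit test against the Benford probabilities
gof <- chisq.test(x = counts, p = log10(1 + 1 / (1:9)))
gof$statistic
gof$p.value
```

A small test statistic and a large p-value indicate that the observed counts do not deviate noticeably from Benford's law.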