Fraud Detection in R
Tim Verdonck
Professor of Data Science at KU Leuven
$\text{recency} = \exp(-\gamma \cdot t) = e^{-\gamma t}$
(2) Calculate $\gamma = -\log(\text{recency})/t$
Example: set $\gamma$ such that $\text{recency} = 0.01$ after $t = 180$ days
gamma <- -log(0.01)/180
print(gamma)
0.02558428
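Taking logarithms of $\text{recency} = e^{-\gamma t}$ gives $\log(\text{recency}) = -\gamma t$, which is where the formula for $\gamma$ comes from. The following is a minimal sketch that checks this choice of gamma numerically; the helper name recency() and the chosen time points are illustrative assumptions, not part of the original slides.

```r
# Hypothetical helper: recency as a function of elapsed time t (in days)
# for a given decay rate gamma.
recency <- function(t, gamma) exp(-gamma * t)

# Decay rate chosen so that recency drops to 0.01 after 180 days.
gamma <- -log(0.01) / 180

# Evaluate the decay at a few illustrative time points.
t_days <- c(0, 30, 90, 180)
rec <- recency(t_days, gamma)
print(data.frame(t_days = t_days, recency = round(rec, 3)))
```

Recency equals 1 at t = 0, falls to 0.1 after 90 days (the square root of 0.01) and reaches the target value 0.01 after 180 days.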
recency_fun <- function(t, gamma, auth_cd, freq_auth) {
  n_t <- length(t)
  if (freq_auth[n_t] == 0) {
    recency <- 0 # recency = 0 when frequency = 0
  } else {
    # time-interval = current time - time of previous transfer with same auth_cd
    time_diff <- t[1] - max(t[2:n_t][auth_cd[(n_t-1):1] == auth_cd[n_t]])
    recency <- exp(-gamma * time_diff)
  }
  return(recency)
}
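To see what recency_fun() expects, it can be called by hand on a single window. The sketch below assumes that t holds the timestamps newest first (t[1] is the current transfer), that auth_cd and freq_auth are the full per-account vectors in chronological order, and that freq_auth counts earlier transfers with the same authentication code. The input values are Bob's first three transfers from the output table; the freq_auth values are assumed for illustration.

```r
recency_fun <- function(t, gamma, auth_cd, freq_auth) {
  n_t <- length(t)
  if (freq_auth[n_t] == 0) {
    recency <- 0 # no earlier transfer with this authentication code
  } else {
    # current time minus the latest earlier time with the same auth_cd
    time_diff <- t[1] - max(t[2:n_t][auth_cd[(n_t - 1):1] == auth_cd[n_t]])
    recency <- exp(-gamma * time_diff)
  }
  return(recency)
}

gamma <- -log(0.01) / 180

# Bob's first three transfers, in chronological order.
auth_cd   <- c("AU02", "AU04", "AU02")
freq_auth <- c(0, 0, 1) # assumed: number of earlier transfers with the same code

# First transfer: window of length 1, no history, so recency is 0.
rec_first <- recency_fun(c(44.25), gamma, auth_cd, freq_auth)

# Third transfer: window is newest first; the previous AU02 transfer was at 44.25.
rec_third <- recency_fun(c(64.29, 57.45, 44.25), gamma, auth_cd, freq_auth)

print(round(c(rec_first, rec_third), 3))
```

The third transfer comes 20.04 days after the previous AU02 transfer, so its recency is exp(-gamma * 20.04), roughly 0.599.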
(1) Choose value for $\gamma$
gamma <- -log(0.01)/180 # = 0.0256
(2) Use rollapply(), group_by() and mutate()
library(dplyr) # needed for group_by() and mutate()
library(zoo) # needed for rollapply()
trans <- trans %>%
  group_by(account_name) %>%
  mutate(rec_auth = rollapply(timestamp,
                              width = list(0:-length(transfer_id)),
                              partial = TRUE,
                              FUN = recency_fun,
                              gamma, authentication_cd, freq_auth))
    account_name timestamp authentication_cd rec_auth fraud_flag
1            Bob     44.25              AU02    0.000          0
2          Alice     54.12              AU03    0.000          0
3            Bob     57.45              AU04    0.000          0
4            Bob     64.29              AU02    0.599          0
5          Alice     64.29              AU03    0.771          0
6            Bob     64.29              AU02    1.000          0
7          Alice     70.25              AU03    0.859          0
8            Bob     70.25              AU02    0.859          0
9          Alice     74.08              AU01    0.000          0
...          ...       ...               ...      ...        ...
37           Bob    407.17              AU02    0.002          0
38         Alice    420.17              AU03    0.717          0
39           Bob    441.34              AU03    0.000          1
40         Alice    443.24              AU04    0.000          1
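As a cross-check on the rec_auth column, the same feature can be sketched in base R without rollapply(). This is a swapped-in, simplified approach rather than the method used above: the helper rec_auth_at() is hypothetical, the data are the first nine rows of the table, and the rows are assumed to be sorted by timestamp.

```r
# Toy data: the first nine rows of the output table, sorted by timestamp.
trans <- data.frame(
  account_name      = c("Bob", "Alice", "Bob", "Bob", "Alice",
                        "Bob", "Alice", "Bob", "Alice"),
  timestamp         = c(44.25, 54.12, 57.45, 64.29, 64.29,
                        64.29, 70.25, 70.25, 74.08),
  authentication_cd = c("AU02", "AU03", "AU04", "AU02", "AU03",
                        "AU02", "AU03", "AU02", "AU01"),
  stringsAsFactors  = FALSE
)
gamma <- -log(0.01) / 180

# Hypothetical helper: recency of row i relative to the most recent earlier
# row with the same account and the same authentication code.
rec_auth_at <- function(i, df, gamma) {
  earlier <- seq_len(i - 1)
  same <- earlier[df$account_name[earlier] == df$account_name[i] &
                  df$authentication_cd[earlier] == df$authentication_cd[i]]
  if (length(same) == 0) return(0) # no history: recency is 0
  exp(-gamma * (df$timestamp[i] - max(df$timestamp[same])))
}

trans$rec_auth <- vapply(seq_len(nrow(trans)), rec_auth_at, numeric(1),
                         df = trans, gamma = gamma)
print(round(trans$rec_auth, 3))
```

On these nine rows the rounded values match the rec_auth column in the table: 0 for first-time authentication codes, 1 when the previous transfer with the same code has the same timestamp, and values in between otherwise.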