Random under-sampling

Fraud Detection in R

Bart Baesens

Professor of Data Science at KU Leuven

Random under-sampling (RUS)

Class bar chart: under-sampling

Original data: train vs test

Random under-sampling v0

Random under-sampling v1

V2 vs V1 on imbalanced data

table(creditcard$Class)
    0     1 
24108   492
n_fraud <- 492
new_frac_fraud <- 0.50
new_n_total <- n_fraud / new_frac_fraud ## = 492 / 0.50 = 984

library(ROSE)
undersampling_result <- ovun.sample(formula = Class ~ ., data = creditcard,
                                    method = "under", N = new_n_total, seed = 2018)
undersampled_credit <- undersampling_result$data
prop.table(table(undersampled_credit$Class))
  0   1 
0.5 0.5
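
Conceptually, random under-sampling keeps every fraud case and draws a random subset of the legitimate cases, without replacement, until the target size is reached. The base R sketch below illustrates that idea on a small synthetic data frame. Both `toy` (with its made-up `Amount` column) and the helper `random_undersample()` are hypothetical names introduced here for illustration; they are not part of ROSE or of the course data, and this is not a claim about how `ovun.sample()` is implemented internally.

```r
# Synthetic stand-in for the credit card data (assumption: not the course dataset)
set.seed(2018)
toy <- data.frame(
  Amount = round(runif(1000, min = 1, max = 500), 2),
  Class  = factor(c(rep(0, 980), rep(1, 20)))
)

# Hypothetical helper: keep all minority cases and sample
# majority cases without replacement
random_undersample <- function(data, target, minority = "1", frac_minority = 0.5) {
  is_minority <- data[[target]] == minority
  n_minority  <- sum(is_minority)
  n_total     <- round(n_minority / frac_minority)  # same arithmetic as new_n_total
  n_majority  <- n_total - n_minority
  stopifnot(n_majority <= sum(!is_minority))        # cannot keep more majority rows than exist
  majority_rows <- which(!is_minority)
  keep_majority <- majority_rows[sample.int(length(majority_rows), n_majority)]
  data[c(which(is_minority), keep_majority), , drop = FALSE]
}

toy_under <- random_undersample(toy, target = "Class", frac_minority = 0.5)
table(toy_under$Class)
```

With 20 fraud cases and a target fraud fraction of 0.50, the result has 40 rows, 20 per class. The price is that 960 of the 980 legitimate cases are discarded.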

V2 vs V1 on under-sampled data

Do both!

Class bar chart: both

n_new <- nrow(creditcard) ## = 24600
fraction_fraud_new <- 0.50

sampling_result <- ovun.sample(formula = Class ~ ., data = creditcard,
                               method = "both", N = n_new, p = fraction_fraud_new,
                               seed = 2018)
sampled_credit <- sampling_result$data
prop.table(table(sampled_credit$Class))
        0         1 
0.5039837 0.4960163
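
The proportions above are close to, but not exactly, 0.50. One plausible reading, based on the ROSE documentation for `method = "both"`, is that the number of minority cases is drawn at random (a binomial draw with size `N` and probability `p`), after which the minority class is over-sampled with replacement and the majority class is under-sampled without replacement to fill the remaining rows. The base R sketch below mimics that scheme on synthetic data. The data frame `toy` and the helper `sample_both()` are hypothetical names introduced for illustration, and the code is an approximation of the idea, not the ROSE implementation.

```r
# Synthetic stand-in for the credit card data (assumption: not the course dataset)
set.seed(2018)
toy <- data.frame(
  Amount = round(runif(1000, min = 1, max = 500), 2),
  Class  = factor(c(rep(0, 980), rep(1, 20)))
)

# Hypothetical helper: over-sample the minority class with replacement and
# under-sample the majority class without replacement, keeping n_new rows in total
sample_both <- function(data, target, n_new, p, minority = "1") {
  minority_rows <- which(data[[target]] == minority)
  majority_rows <- which(data[[target]] != minority)
  n_minority <- rbinom(1, size = n_new, prob = p)  # random count, roughly n_new * p
  n_majority <- n_new - n_minority
  stopifnot(n_majority <= length(majority_rows))   # majority rows are not duplicated
  pick_minority <- minority_rows[sample.int(length(minority_rows), n_minority,
                                            replace = TRUE)]
  pick_majority <- majority_rows[sample.int(length(majority_rows), n_majority)]
  data[c(pick_majority, pick_minority), , drop = FALSE]
}

toy_both <- sample_both(toy, target = "Class", n_new = nrow(toy), p = 0.5)
prop.table(table(toy_both$Class))
```

The data set keeps its original size, so fewer legitimate cases are thrown away than with pure under-sampling, while each fraud case is repeated many times.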

V2 vs V1: combined method

Let's practice!
