Principal components analysis (PCA)

Machine Learning with caret in R

Zach Mayer

Data Scientist at DataRobot and co-author of caret

Principal components analysis

  • Combines low-variance, correlated variables
  • Produces a single set of high-variance, orthogonal predictors
  • Prevents collinearity (correlation among predictors)
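
The points above can be checked on a small simulated example. This is a minimal sketch using base R's `prcomp()` on toy data (the variables `x1` and `x2` are made up for illustration and are not part of the course data): two correlated predictors go in, and the resulting components come out uncorrelated with orthogonal loading vectors.

```r
# Toy data (hypothetical): two strongly correlated predictors
set.seed(42)
x1 <- rnorm(200)
x2 <- x1 + rnorm(200, sd = 0.3)
toy <- data.frame(x1 = x1, x2 = x2)

# The original predictors are highly correlated
cor(toy$x1, toy$x2)

# PCA on centered and scaled data
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# The component scores are uncorrelated with each other
round(cor(pca$x), 10)

# The loading vectors are orthogonal:
# crossprod(R) = t(R) %*% R is the identity matrix
round(crossprod(pca$rotation), 10)
```

Because the components are uncorrelated by construction, a linear model fit on them does not suffer from collinearity among its inputs.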

PCA: a visual representation

  • The first component has the highest variance
  • The second component has the second-highest variance
  • And so on ...
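
The ordering of components by variance can be inspected directly. The sketch below uses base R's `prcomp()` on the built-in `mtcars` data, chosen here only as a convenient stand-in; the variance of each component is the square of its `sdev` entry.

```r
# PCA on a built-in dataset (a stand-in, not the course data)
pca <- prcomp(mtcars, center = TRUE, scale. = TRUE)

# Variance carried by each component, in the order PC1, PC2, ...
variances <- pca$sdev^2

# Share of the total variance per component, and the running total
prop <- variances / sum(variances)
round(prop, 3)
round(cumsum(prop), 3)
```

The running total is what a variance threshold works against: components are kept, in order, until the chosen share of total variance is reached.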


Example: blood-brain data

  • Lots of predictors
  • Many of them have low variance

# Load the blood-brain dataset and list its near-zero-variance predictors
library(caret)
data(BloodBrain)
names(bbbDescr)[nearZeroVar(bbbDescr)]
[1] "negative"     "peoe_vsa.2.1" "peoe_vsa.3.1"
[4] "a_acid"       "vsa_acid"     "frac.anion7."
[7] "alert"
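
To see why a column gets flagged, `nearZeroVar()` can return the statistics it uses via `saveMetrics = TRUE`. The sketch below runs it on a hypothetical toy data frame (the column names `constant`, `rare`, and `varied` are invented for illustration) and assumes caret's default cutoffs of `freqCut = 95/5` and `uniqueCut = 10`.

```r
library(caret)

# Hypothetical toy data: 100 rows, three predictors
toy <- data.frame(
  constant = rep(1, 100),          # a single value: zero variance
  rare     = c(rep(0, 98), 1, 1),  # 98 zeros and 2 ones: near-zero variance
  varied   = seq_len(100)          # every value distinct
)

# saveMetrics = TRUE returns freqRatio, percentUnique, zeroVar and nzv
# for each column instead of just the flagged indices
metrics <- nearZeroVar(toy, saveMetrics = TRUE)
metrics

# The default call returns the indices of the flagged columns
flagged <- names(toy)[nearZeroVar(toy)]
flagged
```

Under the default cutoffs, `constant` and `rare` should be flagged while `varied` is kept. In `preProcess`, `"zv"` removes only columns like `constant`, whereas `"nzv"` also removes columns like `rare`.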

Example: blood-brain data

# Baseline model: drop zero-variance predictors, then center and scale
library(caret)
set.seed(42)
data(BloodBrain)
model <- train(
  bbbDescr,
  logBBB,
  method = "glm",
  trControl = trainControl(
    method = "cv", number = 10, verboseIter = TRUE
  ),
  preProcess = c("zv", "center", "scale")
)
min(model$results$RMSE)
1.107702

Example: blood-brain data

# Remove low-variance predictors ("nzv" drops near-zero-variance columns)
library(caret)
set.seed(42)
data(BloodBrain)
model <- train(
  bbbDescr,
  logBBB,
  method = "glm",
  trControl = trainControl(
    method = "cv", number = 10, verboseIter = TRUE
  ),
  preProcess = c("nzv", "center", "scale")
)
min(model$results$RMSE)
0.9796199

Example: blood-brain data

# Add PCA: keep low-variance predictors and let PCA combine them
library(caret)
set.seed(42)
data(BloodBrain)
model <- train(
  bbbDescr,
  logBBB,
  method = "glm",
  trControl = trainControl(
    method = "cv", number = 10, verboseIter = TRUE
  ),
  preProcess = c("zv", "center", "scale", "pca")
)
min(model$results$RMSE)
0.9796199
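
To look at what the `"pca"` step hands to the model, the same preprocessing can be run on its own with caret's `preProcess()`. This is a sketch rather than part of the original slides; it assumes the default variance threshold of `thresh = 0.95`, written out explicitly here, and it fits the preprocessing on the full predictor set instead of inside each cross-validation fold as `train()` would.

```r
library(caret)
data(BloodBrain)

# Fit the same preprocessing steps outside of train()
pp <- preProcess(
  bbbDescr,
  method = c("zv", "center", "scale", "pca"),
  thresh = 0.95  # keep enough components for 95% of the variance
)

# Number of principal components retained
pp$numComp

# Apply the transformation: the predictors become PC1, PC2, ...
pcs <- predict(pp, bbbDescr)
dim(pcs)

# Largest absolute correlation between two different components
pc_cor <- cor(pcs)
max(abs(pc_cor[upper.tri(pc_cor)]))
```

The transformed data should have fewer columns than `bbbDescr`, with pairwise correlations that are zero up to rounding error. Inside `train()`, the threshold can be changed through `trainControl(preProcOptions = list(thresh = ...))`.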

Let's practice!
