Handling Missing Data with Imputations in R
Michal Oleszak
Machine Learning Engineer

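Most statistical models estimate the conditional distribution of the response variable:
$p(y|X)$
To produce a single prediction, this conditional distribution is summarized by one value, for example its mean in linear regression or its most probable class in logistic regression.
Instead, we can draw from these distributions to increase the variability of the imputed values.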
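The difference between summarizing and drawing can be sketched on simulated data. The probabilities below are hypothetical (generated with `runif`, not taken from nhanes), and the names `probs`, `thresholded`, and `drawn` are illustrative only.

```r
set.seed(42)

# Hypothetical predicted probabilities of the positive class (an assumption,
# not output from any fitted model)
probs <- runif(1000, min = 0.55, max = 0.95)

# Summarize each conditional distribution by its most probable class
thresholded <- ifelse(probs >= 0.5, 1, 0)

# Draw one value from each Bernoulli distribution instead
drawn <- rbinom(length(probs), size = 1, prob = probs)

table(thresholded)  # only the value 1 appears
table(drawn)        # both 0 and 1 appear
```

Because every probability here exceeds 0.5, thresholding can only return 1, while drawing still produces some 0s in proportion to the predicted probabilities.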


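Task: impute the PhysActive variable in the nhanes data using logistic regression.
library(VIM)  # provides hotdeck()
# Initialize missing values with hot-deck imputation so the model can use every row
nhanes_imp <- hotdeck(nhanes)
# Remember which PhysActive values were originally missing
missing_physactive <- is.na(nhanes$PhysActive)
logreg_model <- glm(PhysActive ~ Age + Weight + Pulse,
                    data = nhanes_imp, family = binomial)
# Predicted probability of PhysActive = 1 for every row
preds <- predict(logreg_model, type = "response")
# Summarize each probability by its most probable class
preds <- ifelse(preds >= 0.5, 1, 0)
# Overwrite only the originally missing entries
nhanes_imp[missing_physactive, "PhysActive"] <- preds[missing_physactive]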
Variability of the imputed data:
table(preds[missing_physactive])
 1
26
Variability of the observed PhysActive data:
table(nhanes$PhysActive)
  0   1
181 610
All 26 imputed values are 1, although 181 of the 791 observed values (about 23%) are 0.
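Drawing from the conditional distribution instead of thresholding:
nhanes_imp <- hotdeck(nhanes)
missing_physactive <- is.na(nhanes$PhysActive)
logreg_model <- glm(PhysActive ~ Age + Weight + Pulse,
                    data = nhanes_imp, family = binomial)
preds <- predict(logreg_model, type = "response")
# Draw each value from a binomial distribution with a single trial
# and success probability equal to the predicted probability
preds <- rbinom(length(preds), size = 1, prob = preds)
nhanes_imp[missing_physactive, "PhysActive"] <- preds[missing_physactive]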
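For readers without the nhanes data at hand, the same workflow can be sketched end to end in base R. Everything here is an assumption made for illustration: `toy` is simulated data, the missing values are initialized with the most frequent class as a simple stand-in for `hotdeck()`, and only `Age` and `Weight` are used as predictors.

```r
set.seed(123)
n <- 500

# Simulated stand-in for nhanes (hypothetical data, not the real survey)
toy <- data.frame(Age = round(runif(n, 18, 80)),
                  Weight = rnorm(n, mean = 75, sd = 15))
toy$PhysActive <- rbinom(n, size = 1, prob = plogis(2.5 - 0.03 * toy$Age))
toy$PhysActive[sample(n, 50)] <- NA  # make 50 values missing at random

missing_pa <- is.na(toy$PhysActive)

# Initialize with the most frequent class (stand-in for hotdeck())
toy_imp <- toy
toy_imp$PhysActive[missing_pa] <-
  as.numeric(mean(toy$PhysActive, na.rm = TRUE) >= 0.5)

# Fit the logistic regression and get a probability for every row
model <- glm(PhysActive ~ Age + Weight, data = toy_imp, family = binomial)
probs <- predict(model, type = "response")

# Draw from each conditional distribution rather than thresholding
draws <- rbinom(length(probs), size = 1, prob = probs)
toy_imp[missing_pa, "PhysActive"] <- draws[missing_pa]

table(toy_imp$PhysActive[missing_pa])
```

Since `rbinom()` is random, calling `set.seed()` beforehand keeps the imputed values reproducible between runs.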
Variability of the imputed data:
table(preds[missing_physactive])
 0  1
 5 21
Variability of the observed PhysActive data:
table(nhanes$PhysActive)
  0   1
181 610
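The two tables are easier to compare as proportions. The short sketch below retypes the counts shown above by hand (the vector names `imputed_counts` and `observed_counts` are illustrative) and converts them to shares.

```r
# Counts copied from the tables above
imputed_counts  <- c(`0` = 5,   `1` = 21)
observed_counts <- c(`0` = 181, `1` = 610)

# Share of each class among imputed and observed values
imputed_share  <- round(imputed_counts / sum(imputed_counts), 3)
observed_share <- round(observed_counts / sum(observed_counts), 3)

imputed_share   # 0: 0.192, 1: 0.808
observed_share  # 0: 0.229, 1: 0.771
```

With draws from the conditional distribution, roughly 19% of the imputed values are 0, which is close to the 23% seen in the observed data.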