Dimensionality Reduction in R
Matt Pickard
Owner, Pickard Predictives, LLC
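library(dplyr)
# PCA on every column except the target; scale. = TRUE standardizes each variable first
pca_res <- prcomp(attrition_df %>% select(-Attrition), scale. = TRUE)
summary(pca_res)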
Importance of components:
                          PC1    PC2    PC3     PC4     PC5
Standard deviation     1.4259 1.3295 0.8618 0.48401 0.47138
Proportion of Variance 0.4067 0.3535 0.1485 0.04685 0.04444
Cumulative Proportion  0.4067 0.7602 0.9087 0.95556 1.00000
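The proportion-of-variance rows follow directly from the standard deviations: each component's variance is its standard deviation squared, and its share is that variance divided by the total. A minimal sketch that re-derives both rows from the rounded values printed in the summary (so the last digit may differ slightly from the printed output):

```r
# Standard deviations copied from the summary(pca_res) output
sdev <- c(PC1 = 1.4259, PC2 = 1.3295, PC3 = 0.8618, PC4 = 0.48401, PC5 = 0.47138)

variance <- sdev^2                      # variance captured by each component
prop_var <- variance / sum(variance)    # share of the total variance
cum_prop <- cumsum(prop_var)            # running total across components

round(rbind(prop_var, cum_prop), 4)
```

With a fitted object in hand, the same calculation can start from pca_res$sdev instead of the hand-copied vector.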
pca_res
Standard deviations (1, .., p=5):
[1] 1.43 1.33 0.86 0.48 0.47

Rotation (n x k) = (5 x 5):
                            PC1    PC2     PC3    PC4    PC5
MonthlyIncome            0.6244 -0.024  0.3665 -0.280 -0.630
TotalWorkingYears        0.6390 -0.011  0.2674  0.293  0.659
YearsSinceLastPromotion  0.4488  0.018 -0.8902 -0.061 -0.047
PercentSalaryHike       -0.0018  0.707  0.0426 -0.647  0.284
PerformanceRating        0.0210  0.707 -0.0033  0.643 -0.294
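Each column of the rotation matrix holds the loadings that turn the standardized variables into one component score. In the output above, PC1 loads mainly on MonthlyIncome, TotalWorkingYears and YearsSinceLastPromotion, while PC2 loads almost entirely on PercentSalaryHike and PerformanceRating. The sketch below shows how scores are built from loadings. It uses the built-in USArrests data as a stand-in, which is an assumption made only so the example runs on its own; pca_demo, scaled and manual_scores are illustrative names.

```r
# USArrests is a stand-in dataset, not the attrition data from the slides
pca_demo <- prcomp(USArrests, scale. = TRUE)

# Standardize the variables the same way prcomp() did, then apply the loadings
scaled <- scale(USArrests, center = pca_demo$center, scale = pca_demo$scale)
manual_scores <- scaled %*% pca_demo$rotation

# Compare the manual projection with the scores stored in pca_demo$x
isTRUE(all.equal(manual_scores, pca_demo$x, check.attributes = FALSE))

# Each loading vector has unit length
colSums(pca_demo$rotation^2)
```

The same matrix product is what predict() applies to new rows for a prcomp object, which is why the centering and scaling values are stored alongside the rotation.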
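# Standardize the numeric predictors, then keep the first two principal components
pca_recipe <- recipe(Attrition ~ ., data = train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_pca(all_numeric_predictors(), num_comp = 2)

# Fit a logistic regression on the PCA-transformed training data
attrition_fit <- workflow(preprocessor = pca_recipe, spec = logistic_reg()) %>%
  fit(train)

# Predict on the test set and attach the true labels for scoring
attrition_pred_df <- predict(attrition_fit, test) %>%
  bind_cols(test %>% select(Attrition))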
f_meas(attrition_pred_df, Attrition, .pred_class)
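f_meas() reports the F-measure, which by default is the harmonic mean of precision and recall. The base R sketch below works through that arithmetic on made-up labels (hypothetical data, not the attrition predictions). It treats the first factor level as the event, which mirrors yardstick's default, so it is worth checking the level order of Attrition before reading the score.

```r
# Made-up labels for illustration only
truth <- factor(c("Yes", "Yes", "Yes", "No", "No", "No", "No", "No"),
                levels = c("Yes", "No"))
pred  <- factor(c("Yes", "Yes", "No", "Yes", "No", "No", "No", "No"),
                levels = c("Yes", "No"))

positive <- levels(truth)[1]  # first level plays the role of the event

tp <- sum(pred == positive & truth == positive)  # true positives
fp <- sum(pred == positive & truth != positive)  # false positives
fn <- sum(pred != positive & truth == positive)  # false negatives

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)
f1
```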
attrition_fit
Call:  stats::glm(formula = ..y ~ ., family = stats::binomial, data = data)

Coefficients:
(Intercept)          PC1          PC2
   -2.42067      0.80493     -0.03429

Degrees of Freedom: 1339 Total (i.e. Null);  1337 Residual
Null Deviance:     951.8
Residual Deviance: 870.6    AIC: 876.6
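The coefficients are on the log-odds scale, so exponentiating them gives odds ratios. A rough sketch using the printed values: it assumes the model estimates the log-odds of the second level of Attrition (glm's convention for a factor response), which the slides do not state. The names coefs, odds_ratios, baseline_prob and aic_check are illustrative.

```r
# Coefficients copied from the printed fit
coefs <- c(intercept = -2.42067, PC1 = 0.80493, PC2 = -0.03429)

# Odds ratios: multiplicative change in the odds per one-unit increase in a component
odds_ratios <- exp(coefs[c("PC1", "PC2")])

# Predicted probability for an observation at the origin of the component space
baseline_prob <- plogis(coefs[["intercept"]])

# AIC = residual deviance + 2 * number of estimated coefficients
aic_check <- 870.6 + 2 * length(coefs)

odds_ratios
baseline_prob
aic_check
```

On these numbers PC1 roughly doubles the odds per unit, while the PC2 odds ratio sits close to 1, which suggests the second component adds little to this model. The AIC check reproduces the printed 876.6.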