Principal Component Analysis (PCA)

Dimensionality Reduction in R

Matt Pickard

Owner, Pickard Predictives, LLC

Performing a PCA

library(dplyr)

# Drop the target column and standardize the features before PCA
pca_res <- prcomp(attrition_df %>% select(-Attrition), scale. = TRUE)

summary(pca_res)
Importance of components:
                          PC1    PC2    PC3     PC4     PC5
Standard deviation     1.4259 1.3295 0.8618 0.48401 0.47138
Proportion of Variance 0.4067 0.3535 0.1485 0.04685 0.04444
Cumulative Proportion  0.4067 0.7602 0.9087 0.95556 1.00000
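
The Proportion of Variance row follows directly from the standard deviations: square each one to get a variance, then divide by the total. A minimal sketch, re-typing the standard deviations from the summary output (the variable names are illustrative and not part of the original slides):

```r
# Standard deviations copied from the summary(pca_res) output
sdev <- c(PC1 = 1.4259, PC2 = 1.3295, PC3 = 0.8618, PC4 = 0.48401, PC5 = 0.47138)

# Variance captured by each component
pc_var <- sdev^2

# Share of the total variance, and its running total
prop_var <- pc_var / sum(pc_var)
cum_prop <- cumsum(prop_var)

round(rbind(prop_var, cum_prop), 4)
```

With a fitted object, the same quantities can be computed from pca_res$sdev. Because the five features were scaled to unit variance, the variances sum to roughly 5.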

PC loadings

pca_res
Standard deviations (1, .., p=5):
[1] 1.43 1.33 0.86 0.48 0.47

Rotation (n x k) = (5 x 5):
                            PC1    PC2     PC3    PC4    PC5
MonthlyIncome            0.6244 -0.024  0.3665 -0.280 -0.630
TotalWorkingYears        0.6390 -0.011  0.2674  0.293  0.659
YearsSinceLastPromotion  0.4488  0.018 -0.8902 -0.061 -0.047
PercentSalaryHike       -0.0018  0.707  0.0426 -0.647  0.284
PerformanceRating        0.0210  0.707 -0.0033  0.643 -0.294
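
Each column of the rotation matrix is a unit-length direction, and an observation's PC scores are its standardized feature values projected onto those directions. A minimal sketch of that relationship, re-typing the printed loadings; the employee vector z is hypothetical and assumed to be already standardized:

```r
# Loadings copied from the Rotation matrix printed by pca_res
features <- c("MonthlyIncome", "TotalWorkingYears", "YearsSinceLastPromotion",
              "PercentSalaryHike", "PerformanceRating")
rotation <- matrix(
  c( 0.6244, -0.024,  0.3665, -0.280, -0.630,
     0.6390, -0.011,  0.2674,  0.293,  0.659,
     0.4488,  0.018, -0.8902, -0.061, -0.047,
    -0.0018,  0.707,  0.0426, -0.647,  0.284,
     0.0210,  0.707, -0.0033,  0.643, -0.294),
  nrow = 5, byrow = TRUE,
  dimnames = list(features, paste0("PC", 1:5))
)

# Each column of loadings has (approximately) unit length
col_norms <- sqrt(colSums(rotation^2))

# Hypothetical employee, already standardized: one SD above average on
# MonthlyIncome and TotalWorkingYears, average on everything else
z <- c(1, 1, 0, 0, 0)

# Scores are the standardized features projected onto the loadings
scores <- drop(z %*% rotation)
round(scores, 3)
```

The norms deviate slightly from 1 only because the printed loadings are rounded. For a fitted object, pca_res$x holds the scores of the training rows.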

PC loadings

Figure: loadings plot, zoomed in on the first two principal components

PC loadings

Figure: feature loadings of PC1

PC loadings

Figure: feature loadings of PC2
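
One way to read loadings without a plot is to rank the features by absolute loading on each component. The sketch below re-types the PC1 and PC2 columns of the rotation matrix; top_features() is a hypothetical helper written for this illustration, not a function from any package:

```r
# PC1 and PC2 loadings copied from the Rotation matrix printed by pca_res
loadings <- data.frame(
  feature = c("MonthlyIncome", "TotalWorkingYears", "YearsSinceLastPromotion",
              "PercentSalaryHike", "PerformanceRating"),
  PC1 = c(0.6244, 0.6390, 0.4488, -0.0018, 0.0210),
  PC2 = c(-0.024, -0.011, 0.018, 0.707, 0.707),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n features with the largest absolute loading on a PC
top_features <- function(pc, n = 2) {
  ranked <- loadings[order(abs(loadings[[pc]]), decreasing = TRUE), ]
  head(ranked$feature, n)
}

top_features("PC1", n = 3)
top_features("PC2")
```

On these numbers, PC1 is dominated by income, total working years, and years since last promotion, while PC2 is driven almost entirely by salary hike and performance rating.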

PCA with tidymodels

library(tidymodels)

# Normalize the numeric predictors, then keep the first two PCs
pca_recipe <- recipe(Attrition ~ ., data = train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_pca(all_numeric_predictors(), num_comp = 2)

# Fit a logistic regression on the PCs, then score the test set
attrition_fit <- workflow(preprocessor = pca_recipe, spec = logistic_reg()) %>%
  fit(train)
attrition_pred_df <- predict(attrition_fit, test) %>%
  bind_cols(test %>% select(Attrition))
f_meas(attrition_pred_df, Attrition, .pred_class)
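
To make the recipe's behavior concrete, here is a rough base-R sketch of the same idea: learn the centering, scaling, and rotation on the training rows only, apply them unchanged to the test rows, and fit a logistic regression on the first two PCs. It uses the built-in iris data as a stand-in because the attrition data is not reproduced here. The split, the 0.5 threshold, and all variable names are assumptions for illustration, and this is not a description of how tidymodels works internally.

```r
# Stand-in data: two iris species as a binary outcome (not the attrition data)
flowers <- subset(iris, Species != "setosa")
flowers$is_virginica <- as.integer(flowers$Species == "virginica")
feature_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Hypothetical 75/25 train/test split
set.seed(123)
train_rows <- sample(nrow(flowers), size = 75)
train <- flowers[train_rows, ]
test <- flowers[-train_rows, ]

# Learn centering, scaling, and rotation from the training rows only
pca <- prcomp(train[, feature_cols], center = TRUE, scale. = TRUE)

# Keep the first two PCs; predict() applies the training transformation to test
train_pcs <- data.frame(pca$x[, 1:2], is_virginica = train$is_virginica)
test_pcs <- data.frame(predict(pca, newdata = test[, feature_cols])[, 1:2])

# Logistic regression on the two components
model <- glm(is_virginica ~ PC1 + PC2, family = binomial, data = train_pcs)
pred <- as.integer(predict(model, newdata = test_pcs, type = "response") > 0.5)

# F-measure by hand: harmonic mean of precision and recall
true_pos <- sum(pred == 1 & test$is_virginica == 1)
precision <- true_pos / sum(pred == 1)
recall <- true_pos / sum(test$is_virginica == 1)
f1 <- 2 * precision * recall / (precision + recall)
```

Reusing the training transformation on the test rows is the important design point: refitting the PCA on the test set would leak test information into the features and produce components that no longer line up with the model's coefficients.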

See the PCs in the model details

attrition_fit
Call:  stats::glm(formula = ..y ~ ., family = stats::binomial, data = data)

Coefficients:
(Intercept)          PC1          PC2  
   -2.42067      0.80493     -0.03429  

Degrees of Freedom: 1339 Total (i.e. Null);  1337 Residual
Null Deviance:        951.8 
Residual Deviance: 870.6     AIC: 876.6
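
The printed coefficients are on the log-odds scale, so a little arithmetic makes them easier to read. The sketch below re-types the numbers from the printout; the variable names are illustrative. It assumes the usual glm convention for a binary outcome, so the probabilities refer to whichever Attrition level the model treats as the event.

```r
# Coefficients and deviance copied from the model printout
coefs <- c(intercept = -2.42067, PC1 = 0.80493, PC2 = -0.03429)
residual_deviance <- 870.6

# Odds ratios: multiplicative change in the odds per one-unit increase in a PC
odds_ratios <- exp(coefs[c("PC1", "PC2")])

# Predicted probability when both PC scores are 0, i.e. an employee at the
# training mean on every normalized feature
baseline_prob <- plogis(coefs[["intercept"]])

# For a binary outcome, AIC = residual deviance + 2 * number of coefficients
aic <- residual_deviance + 2 * length(coefs)

round(odds_ratios, 3)
round(baseline_prob, 3)
aic
```

A one-unit increase in PC1 roughly doubles the odds, while PC2's odds ratio is close to 1, which suggests PC2 adds little to this model. The recomputed AIC matches the 876.6 in the printout.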

Let's practice!
