Factor Analysis in R
Jennifer Brussow
Psychometrician
Response scale: 1 = Very Inaccurate ... 6 = Very Accurate
head(bfi)
A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 ...
61617 2 4 3 4 4 2 3 3 4 4 3 3 3 4 4 3 4 2 2 3 3 ...
61618 2 4 5 2 5 5 4 4 3 4 1 1 6 4 3 3 3 3 5 5 4 ...
61620 5 4 5 4 4 4 5 4 2 5 2 4 4 4 5 4 5 4 2 3 4 ...
61621 4 4 6 5 5 4 4 3 5 5 5 3 4 4 4 2 5 2 4 1 3 ...
61622 2 3 3 4 5 4 4 5 3 2 2 2 5 4 5 2 3 4 4 3 3 ...
61623 6 6 5 6 5 6 6 6 1 3 2 1 6 5 6 3 5 2 2 3 4 ...
names(bfi)
"A1" "A2" "A3" "A4" "A5" "C1" "C2" "C3" "C4" "C5" "E1" "E2"
"E3" "E4" "E5" "N1" "N2" "N3" "N4" "N5" "O1" "O2" "O3" "O4" "O5"
# Establish two sets of indices to split the dataset
N <- nrow(bfi)
indices <- seq_len(N)
indices_EFA <- sample(indices, floor(0.5 * N))
indices_CFA <- indices[!(indices %in% indices_EFA)]
# Use those indices to split the dataset into halves for your EFA and CFA
bfi_EFA <- bfi[indices_EFA, ]
bfi_CFA <- bfi[indices_CFA, ]
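The split above depends on sample(), so it changes on every run unless a seed is set. A minimal sketch of the same idea follows, assuming a hypothetical helper named split_half() and a small made-up data frame standing in for bfi (neither appears in the original slides). It fixes the seed and checks that the two halves do not overlap.

```r
# Hypothetical helper (not from the slides): split a data frame
# into two disjoint random halves, reproducibly.
split_half <- function(data, seed = 42) {
  set.seed(seed)                                   # fix the seed so the split can be repeated
  n <- nrow(data)
  idx_first <- sample(seq_len(n), floor(0.5 * n))  # random half of the row positions
  idx_second <- setdiff(seq_len(n), idx_first)     # every remaining row position
  list(first  = data[idx_first, , drop = FALSE],
       second = data[idx_second, , drop = FALSE])
}

# Made-up stand-in for bfi: 11 respondents, 3 items scored 1 to 6
toy <- data.frame(matrix(sample(1:6, 33, replace = TRUE), nrow = 11))

halves <- split_half(toy)
nrow(halves$first)    # 5
nrow(halves$second)   # 6

# No respondent appears in both halves
length(intersect(rownames(halves$first), rownames(halves$second)))  # 0
```

With an odd number of rows, floor() sends the extra row to the second half, which matches how indices_EFA and indices_CFA are built above.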
head(bfi_EFA, 2)
A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 ...
65237 3 4 4 4 4 4 4 5 2 3 3 4 NA 4 4 4 3 1 3 2 4 ...
61825 3 1 2 2 2 2 1 2 6 6 6 6 1 1 1 3 5 4 4 4 5 ...
head(bfi_CFA, 2)
A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 ...
61617 2 4 3 4 4 2 3 3 4 4 3 3 3 4 4 3 4 2 2 3 3 ...
61621 4 4 6 5 5 4 4 3 5 5 5 3 4 4 4 2 5 2 4 1 3 ...
...
Imagine we have no theory...
Without theory, use an empirical approach: Eigenvalues
# Calculate the correlation matrix first
bfi_EFA_cor <- cor(bfi_EFA, use = "pairwise.complete.obs")
A1 A2 A3 A4 A5 C1 ...
A1 1.00000000 -0.31920397 -0.25651343 -0.12441523 -0.20083692 0.058252
A2 -0.31920397 1.00000000 0.46698961 0.30599175 0.36599749 0.075002
A3 -0.25651343 0.46698961 1.00000000 0.32762347 0.47616038 0.089720
A4 -0.12441523 0.30599175 0.32762347 1.00000000 0.27182236 0.083987
A5 -0.20083692 0.36599749 0.47616038 0.27182236 1.00000000 0.116890
C1 0.05825219 0.07500228 0.08972097 0.08398741 0.11689059 1.000000
C2 0.04236764 0.12843266 0.10471200 0.22697628 0.09639765 0.421518
C3 -0.02289831 0.18618382 0.14009601 0.09975850 0.13797236 0.301556
C4 0.09865372 -0.11178917 -0.11576273 -0.15035049 -0.10248897 -0.354081
C5 0.04925038 -0.10820392 -0.15392300 -0.24998065 -0.15667123 -0.269701
...
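The bfi_EFA rows shown earlier contain NA values, which is why cor() is called with use = "pairwise.complete.obs". The sketch below illustrates the difference on a small made-up data frame (the name toy and its values are assumptions for illustration only): the default setting propagates NA into the matrix, while the pairwise option computes each correlation from the rows where both variables are observed.

```r
# Made-up data with one missing value in y and one in z
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2, 1, 4, 3, 6, NA),
  z = c(6, 5, NA, 3, 2, 1)
)

# Default (use = "everything"): any pair involving a column with NA yields NA
cor_default <- cor(toy)

# Pairwise: each correlation uses only the rows complete for that pair
cor_pairwise <- cor(toy, use = "pairwise.complete.obs")

anyNA(cor_default)    # TRUE
anyNA(cor_pairwise)   # FALSE
```

One caveat worth keeping in mind: because each entry may be based on a different subset of rows, a pairwise matrix is not guaranteed to be positive semi-definite.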
# Calculate the correlation matrix first
bfi_EFA_cor <- cor(bfi_EFA, use = "pairwise.complete.obs")
# Then use that correlation matrix to create the scree plot
# (scree() comes from the psych package)
library(psych)
scree(bfi_EFA_cor, factors = FALSE)
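With factors = FALSE, the scree plot shows the eigenvalues of the correlation matrix itself, and these can also be inspected as numbers using base R's eigen(). The sketch below uses simulated items rather than bfi; the variable names and the two-factor structure are assumptions made for illustration. Counting eigenvalues greater than 1 is one common rule of thumb for a starting number of factors, not a definitive answer.

```r
set.seed(1)

# Simulated stand-in for questionnaire data: six items,
# three driven by one latent factor and three by another.
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- data.frame(
  a1 = f1 + rnorm(n, sd = 0.5),
  a2 = f1 + rnorm(n, sd = 0.5),
  a3 = f1 + rnorm(n, sd = 0.5),
  b1 = f2 + rnorm(n, sd = 0.5),
  b2 = f2 + rnorm(n, sd = 0.5),
  b3 = f2 + rnorm(n, sd = 0.5)
)

toy_cor <- cor(items)

# Eigenvalues of the correlation matrix, returned in decreasing order
eigenvalues <- eigen(toy_cor, symmetric = TRUE, only.values = TRUE)$values
eigenvalues

# Rule of thumb: how many eigenvalues exceed 1?
sum(eigenvalues > 1)

# The eigenvalues of a correlation matrix sum to the number of variables
sum(eigenvalues)
```

The same call applied to bfi_EFA_cor would give the values that the scree plot draws.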