Overview of the measure development process

Factor Analysis in R

Jennifer Brussow

Psychometrician

Development process

Develop items for your measure
Collect pilot data from a representative sample
Check out what that dataset looks like
Consider whether you want to use an exploratory factor analysis (EFA), a confirmatory factor analysis (CFA), or both
If both, split your sample into random halves
Compare the two samples to make sure they are similar

Inspecting your dataset

# psych provides describe() for per-item summary statistics
library(psych)
describe(gcbs)

    vars    n mean   sd median trimmed  mad min max range  skew ...
Q1     1 2495 3.47 1.46      4    3.59 1.48   0   5     5 -0.55 ...
Q2     2 2495 2.96 1.49      3    2.96 1.48   0   5     5 -0.01 ...
Q3     3 2495 2.05 1.39      1    1.82 0.00   0   5     5  0.98 ...
Q4     4 2495 2.64 1.45      2    2.55 1.48   0   5     5  0.26 ...
Q5     5 2495 3.25 1.47      4    3.32 1.48   0   5     5 -0.35 ...
...
Q11   11 2495 3.27 1.40      4    3.34 1.48   0   5     5 -0.35 ...
Q12   12 2495 2.64 1.50      2    2.56 1.48   0   5     5  0.29 ...
Q13   13 2495 2.10 1.38      1    1.89 0.00   0   5     5  0.89 ...
Q14   14 2495 2.96 1.49      3    2.95 1.48   0   5     5 -0.02 ...
Q15   15 2495 4.23 1.10      5    4.47 0.00   0   5     5 -1.56 ...
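
As a complement to the summary table above, it can help to count missing responses and to see how often each response option was chosen. The following is a minimal base R sketch, not part of the original slides; the data frame `pilot` and its simulated values are hypothetical stand-ins for the pilot data, assuming items scored 0 to 5 as in the output above.

```r
# Simulated stand-in for the pilot data: 200 respondents, 3 items scored 0-5.
# (The name `pilot` and the simulated values are illustrative only.)
set.seed(1)
pilot <- data.frame(
  Q1 = sample(0:5, 200, replace = TRUE),
  Q2 = sample(0:5, 200, replace = TRUE),
  Q3 = sample(0:5, 200, replace = TRUE)
)

# Count missing responses per item.
missing_per_item <- colSums(is.na(pilot))

# Tabulate how often each response option was chosen, one column per item.
# factor(..., levels = 0:5) keeps options that nobody chose in the table.
response_counts <- sapply(pilot, function(item) table(factor(item, levels = 0:5)))

missing_per_item
response_counts
```

Options that are rarely or never chosen stand out in such a table, which the means and standard deviations alone may hide.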

Splitting the dataset

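# Draw a random half of the row indices for the EFA sample
N <- nrow(gcbs)
indices <- seq_len(N)
indices_EFA <- sample(indices, floor(0.5 * N))
# The remaining rows form the CFA sample
indices_CFA <- setdiff(indices, indices_EFA)

# Subset the dataset with each set of indices
gcbs_EFA <- gcbs[indices_EFA, ]
gcbs_CFA <- gcbs[indices_CFA, ]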
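
Because `sample()` is random, the split changes on every run. The sketch below, which is an addition and not from the slides, fixes the seed so the split is repeatable and checks that the two halves do not overlap. The data frame `pilot` is a hypothetical simulated stand-in, and the seed value is arbitrary.

```r
# Simulated stand-in for the pilot data (hypothetical; 101 rows to show an odd N).
set.seed(42)  # fixing the seed makes the random split repeatable
pilot <- data.frame(Q1 = sample(0:5, 101, replace = TRUE),
                    Q2 = sample(0:5, 101, replace = TRUE))

N <- nrow(pilot)
indices <- seq_len(N)
indices_EFA <- sample(indices, floor(0.5 * N))
indices_CFA <- setdiff(indices, indices_EFA)

pilot_EFA <- pilot[indices_EFA, ]
pilot_CFA <- pilot[indices_CFA, ]

# Sanity checks: the halves share no rows and together cover every row.
stopifnot(length(intersect(indices_EFA, indices_CFA)) == 0)
stopifnot(nrow(pilot_EFA) + nrow(pilot_CFA) == N)

# With an odd N, floor() puts the extra row in the CFA half: EFA = 50, CFA = 51.
c(EFA = nrow(pilot_EFA), CFA = nrow(pilot_CFA))
```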

Inspecting the halves

# Label each row by the half it belongs to: 1 = EFA, 2 = CFA
group_var <- vector("numeric", nrow(gcbs))
group_var[indices_EFA] <- 1
group_var[indices_CFA] <- 2
group_var

   [1] 2 1 2 2 1 2 1 1 2 2 2 1 2 2 1 1 2 1 1 1 1 2 1 1 2 1 1 1 2 2
  [31] 2 2 2 1 2 2 2 1 2 2 2 1 1 1 2 2 2 2 1 2 2 1 1 2 2 2 2 2 2 2
  [61] 2 1 2 1 2 2 1 2 1 2 2 2 1 2 1 2 1 1 2 2 1 2 1 2 1 1 1 2 2 2
  [91] 2 2 2 1 2 2 2 2 2 2 2 2 1 2 2 2 1 2 2 2 2 1 1 1 2 2 1 1 2 2
 [121] 2 1 2 2 1 2 2 1 2 2 2 2 1 2 1 1 1 2 2 1 1 1 2 1 1 1 1 2 2 2
 [151] 1 1 1 1 2 2 2 2 2 1 2 1 1 2 1 1 2 1 2 1 2 1 1 1 2 1 1 1 1 2
 [181] 2 1 1 2 2 2 1 1 1 1 2 2 2 2 2 1 1 1 1 2 2 1 1 1 2 1 2 1 2 2

Inspecting the halves

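# Attach the grouping variable as an extra column
gcbs_grouped <- cbind(gcbs, group_var)

# Descriptive statistics for each half
describeBy(gcbs_grouped, group = group_var)
# Statistics by group, including within- and between-group correlations
statsBy(gcbs_grouped, group = "group_var")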
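
For a quick numeric check of how similar the halves are, the item means in each half can be placed side by side. This is a minimal base R sketch rather than the approach shown in the slides; `pilot` is a hypothetical simulated dataset, and what counts as a small enough difference is left to the analyst.

```r
# Simulated stand-in for the pilot data (hypothetical values).
set.seed(7)
pilot <- data.frame(Q1 = sample(0:5, 300, replace = TRUE),
                    Q2 = sample(0:5, 300, replace = TRUE))

# Random half split, labelled 1 (EFA) and 2 (CFA) as in the slides.
indices_EFA <- sample(seq_len(nrow(pilot)), floor(0.5 * nrow(pilot)))
group_var <- rep(2, nrow(pilot))
group_var[indices_EFA] <- 1

# Item means within each half: one row per group, one column per item.
half_means <- aggregate(pilot, by = list(group = group_var), FUN = mean)

# Absolute difference in means between the halves, per item.
mean_diff <- abs(unlist(half_means[1, -1]) - unlist(half_means[2, -1]))

half_means
mean_diff
```

Large differences on many items would suggest drawing a new random split before running the EFA and CFA.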

Let's practice!
