Running topic models

Introduction to Text Analysis in R

Maham Faisal Khan

Senior Data Science Content Developer

Using LDA()

library(topicmodels)

# Fit a two-topic model with Gibbs sampling; the seed makes the run reproducible
lda_out <- LDA(dtm_review, k = 2, method = "Gibbs", control = list(seed = 42))

LDA() output

lda_out
A LDA_Gibbs topic model with 2 topics.

Using glimpse()

glimpse(lda_out)
Formal class 'LDA_Gibbs' [package "topicmodels"] with 16 slots
  ..@ seedwords      : NULL
  ..@ z              : int [1:75670] 1 2 2 1 1 2 1 1 2 2 ...
  ..@ alpha          : num 25
  ..@ call           : language LDA(x = dtm_review, k = 2, method = "Gibbs", ...
  ..@ Dim            : int [1:2] 1791 9668
  ..@ control        :Formal class 'LDA_Gibbscontrol' [package "topicmodels"] ...
  ..@ beta           : num [1:2, 1:9668] -8.81 -10.14 -9.09 -8.43 -12.53 ...
  ...
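
The beta slot shown above holds log-probabilities, which is why its values are negative. The following is a minimal sketch of reading the slots directly. It assumes the AssociatedPress document-term matrix that ships with topicmodels as a stand-in for dtm_review; the 20-document subset and the names dtm_small, lda_small and topic_word_probs are illustrative only.

```r
library(topicmodels)

# Assumption: AssociatedPress (bundled with topicmodels) stands in for dtm_review.
data("AssociatedPress", package = "topicmodels")
dtm_small <- AssociatedPress[1:20, ]

lda_small <- LDA(dtm_small, k = 2, method = "Gibbs", control = list(seed = 42))

# S4 slots are read with @ rather than $.
lda_small@k               # number of topics
dim(lda_small@beta)       # topics x terms
length(lda_small@terms)   # vocabulary size, equal to ncol(beta)

# beta is stored on the log scale; exponentiating gives per-topic word
# probabilities, so each row should sum to 1.
topic_word_probs <- exp(lda_small@beta)
rowSums(topic_word_probs)
```

Exponentiating the slot by hand should give the same probabilities that tidy(matrix = "beta") reports by default.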

Using tidy()

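library(tidytext)  # provides the tidy() method for LDA models
library(dplyr)     # provides %>% and arrange()

# One row per topic-term pair; beta is the probability of the term within the topic
lda_topics <- lda_out %>%
  tidy(matrix = "beta")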

lda_topics %>% arrange(desc(beta))
# A tibble: 19,336 x 3
   topic term       beta
   <int> <chr>     <dbl>
 1     1 hair     0.0241
 2     2 clean    0.0231
 3     2 cleaning 0.0201
# … with 19,333 more rows
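
Sorting the whole table mixes the topics together, so it can help to look at the highest-probability terms within each topic instead. A minimal sketch follows, again assuming the AssociatedPress data bundled with topicmodels as a stand-in for dtm_review; the names lda_small and top_terms and the choice of five terms per topic are illustrative.

```r
library(topicmodels)
library(tidytext)
library(dplyr)

# Assumption: AssociatedPress (bundled with topicmodels) stands in for dtm_review.
data("AssociatedPress", package = "topicmodels")
lda_small <- LDA(AssociatedPress[1:20, ], k = 2, method = "Gibbs",
                 control = list(seed = 42))

# Keep the five most probable terms within each topic.
top_terms <- lda_small %>%
  tidy(matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 5, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(topic, desc(beta))

top_terms
```

Setting with_ties = FALSE keeps exactly five rows per topic even when several terms share the same beta.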

Let's practice!
