Visualize popular terms

Analyzing Social Media Data in R

Vivek Vijayaraghavan

Data Science Coach

Lesson Overview

  • Extract most frequent terms from the text corpus
  • Remove custom stop words and refine corpus
  • Visualize popular terms using bar plot and word cloud

Term frequency

  • Extract term frequency, which is the number of occurrences of each word
# Extract the 60 most frequent terms and their counts
library(qdap)
term_count <- freq_terms(twt_corpus_final, 60)
term_count
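
To make the idea of term frequency concrete, here is a minimal base R sketch that counts words by hand. The `toy_tweets` vector is hypothetical data, and the `WORD` and `FREQ` column names simply mirror the ones used with the `freq_terms()` output later in this lesson; this is an illustration of the concept, not how `qdap` computes it internally.

```r
# Toy tweets standing in for the cleaned corpus (hypothetical data)
toy_tweets <- c("obesity diet exercise",
                "diet plan and exercise",
                "exercise daily")

# Split each tweet on whitespace and flatten into one vector of words
words <- unlist(strsplit(toy_tweets, "\\s+"))

# Count occurrences of each word and sort from most to least frequent
counts <- sort(table(words), decreasing = TRUE)

# Arrange the counts as a data frame with WORD and FREQ columns
toy_term_count <- data.frame(WORD = names(counts),
                             FREQ = as.integer(counts),
                             stringsAsFactors = FALSE)
head(toy_term_count, 3)
```

Here "exercise" appears three times and "diet" twice, so they lead the table.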

[Figure: Term frequency]

Removing custom stop words

# Create a vector of custom stop words
custom_stop <- c("obesity", "can", "amp", "one", "like", "will", "just", 
                "many", "new", "know", "also", "need", "may", "now", 
                "get", "s", "t", "m", "re")
# Remove custom stop words (removeWords comes from the tm package)
library(tm)
twt_corpus_refined <- tm_map(twt_corpus_final, removeWords, custom_stop)
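
The same step can be tried on a tiny corpus to see its effect. This is a sketch under assumptions: `toy_tweets` and the shortened stop word list are hypothetical, and the extra `stripWhitespace` pass is an optional addition (not part of the slide) that collapses the gaps left behind when words are deleted.

```r
library(tm)

# Toy tweets standing in for twt_corpus_final (hypothetical data)
toy_tweets <- c("obesity can be managed with exercise",
                "new diet plan may help with obesity")
toy_corpus <- VCorpus(VectorSource(toy_tweets))

# A shortened, hypothetical stop word list
toy_stop <- c("obesity", "can", "new", "may")

# Delete the stop words from every document
toy_refined <- tm_map(toy_corpus, removeWords, toy_stop)

# removeWords leaves gaps where words were deleted; collapse them
toy_refined <- tm_map(toy_refined, stripWhitespace)

# Pull the text back out of the corpus to inspect it
refined_text <- unname(sapply(toy_refined, content))
refined_text
```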

Term count after refining corpus

# Term count after refining corpus
term_count_clean <- freq_terms(twt_corpus_refined, 20)
term_count_clean

Term frequency after refining corpus

[Figure: Term frequency after refining corpus]

  • A brand promoting an obesity management program can analyze these terms

Bar plot of popular terms

  • Create a bar plot of terms that occur more than 50 times
  • Bar plots summarize popular terms in an easily interpretable form
# Keep only the terms that occur more than 50 times
term50 <- subset(term_count_clean, FREQ > 50)
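
As a small illustration of the filter, the sketch below applies the same `subset()` call to a hypothetical data frame, `toy_counts`, with made-up frequencies. Note that the comparison is strict, so a term with exactly 50 occurrences is dropped.

```r
# Hypothetical term counts with WORD and FREQ columns
toy_counts <- data.frame(WORD = c("diet", "exercise", "health", "sugar"),
                         FREQ = c(120, 85, 50, 12),
                         stringsAsFactors = FALSE)

# Keep rows whose frequency is strictly greater than 50
toy50 <- subset(toy_counts, FREQ > 50)
toy50

# The same filter written with base R indexing
toy50_alt <- toy_counts[toy_counts$FREQ > 50, ]
```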

Bar plot of most popular terms

library(ggplot2)
# Create a bar plot of frequent terms, ordered from most to least frequent
ggplot(term50, aes(x = reorder(WORD, -FREQ), y = FREQ)) +
  geom_bar(stat = "identity", fill = "blue") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
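
The role of `reorder()` in this plot is easy to miss, so here is a self-contained sketch on hypothetical data (`toy50`). It uses `geom_col()`, which is shorthand for `geom_bar(stat = "identity")`, and adds axis labels with `labs()`; both are stylistic choices for this sketch rather than part of the slide's code.

```r
library(ggplot2)

# Hypothetical frequent terms, deliberately not sorted
toy50 <- data.frame(WORD = c("health", "diet", "exercise"),
                    FREQ = c(60, 120, 85),
                    stringsAsFactors = FALSE)

# reorder() turns WORD into a factor whose levels are sorted by -FREQ,
# so the most frequent term becomes the first (leftmost) bar
bar_order <- levels(reorder(toy50$WORD, -toy50$FREQ))
bar_order

# Build the bar plot with readable axis labels
p <- ggplot(toy50, aes(x = reorder(WORD, -FREQ), y = FREQ)) +
  geom_col(fill = "blue") +
  labs(x = "Term", y = "Frequency") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p
```

Without `reorder()`, the bars would follow alphabetical order of the terms rather than their frequency.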

[Figure: Bar plot of popular terms]

Word cloud

  • Visualize the frequent terms using word clouds
  • A word cloud is an image made up of words
  • The size of each word indicates its frequency
  • Effective promotional image for campaigns
  • Communicates the brand messaging and highlights popular terms

Word cloud based on minimum frequency

  • The wordcloud() function helps create word clouds
# Create a word cloud based on min frequency
library(wordcloud)
wordcloud(twt_corpus_refined, min.freq = 20, colors = "red",
          scale = c(3, 0.5), random.order = FALSE)
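
`wordcloud()` can also be given words and their frequencies directly, which makes the effect of `min.freq` easier to see. The sketch below assumes a hypothetical data frame, `toy_counts`, with made-up counts; `set.seed()` is added only so the layout is repeatable.

```r
library(wordcloud)

# Hypothetical term counts with WORD and FREQ columns
toy_counts <- data.frame(WORD = c("diet", "exercise", "health", "sugar"),
                         FREQ = c(120, 85, 50, 12),
                         stringsAsFactors = FALSE)

# Terms that meet the minimum frequency and will be drawn
kept <- toy_counts$WORD[toy_counts$FREQ >= 20]
kept

set.seed(123)  # fix the random placement of words
wordcloud(words = toy_counts$WORD, freq = toy_counts$FREQ,
          min.freq = 20, colors = "red",
          scale = c(3, 0.5), random.order = FALSE)
```

With `random.order = FALSE`, words are plotted in decreasing order of frequency, so the most frequent terms sit near the center. The `scale` argument gives the largest and smallest text sizes.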

[Figure: Word cloud based on minimum frequency]

Colorful word cloud

# Create a colorful word cloud
library(RColorBrewer)
wordcloud(twt_corpus_refined, max.words = 100,
          colors = brewer.pal(6, "Dark2"), scale = c(2.5, 0.5),
          random.order = FALSE)
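
To see what `brewer.pal()` contributes to the call above, the palette can be inspected on its own. This is a small sketch; the variable name `pal` is just for illustration.

```r
library(RColorBrewer)

# Six colors from the "Dark2" qualitative palette, as hex strings
pal <- brewer.pal(6, "Dark2")
pal

# Each palette has a maximum size; look it up before asking for more colors
max_dark2 <- brewer.pal.info["Dark2", "maxcolors"]
max_dark2
```

When a vector of colors is passed to `wordcloud()`, the colors are assigned from the least to the most frequent words, so the last color in the palette marks the most popular terms.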

[Figure: Word cloud with different colors]

Let's practice!
