Step 3: Organize (& clean) the text

Sentiment Analysis in R

Ted Kwartler

Data Dude

Get to it!

Initial goal: Use the scores from the polarity() function to split the reviews into positive and negative subsections for examination.

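# Split the comments by the sign of their polarity score
# (comments scoring exactly 0 land in neither subset)
pos_comments <- subset(bos_reviews$comments, 
                       bos_reviews$polarity > 0)
neg_comments <- subset(bos_reviews$comments, 
                       bos_reviews$polarity < 0)

# Collapse each subset into one long string per class
pos_terms <- paste(pos_comments, collapse = " ")
neg_terms <- paste(neg_comments, collapse = " ")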
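
To see what the split and collapse produce, here is a minimal sketch on a toy data frame. The name toy_reviews, its comments, and its polarity scores are all invented for illustration; they are not taken from bos_reviews.

```r
# Toy stand-in for bos_reviews; comments and scores are made up
toy_reviews <- data.frame(
  comments = c("Great location and a lovely host",
               "Dirty room, terrible smell",
               "Check-in was at 3 pm",
               "Wonderful stay, very clean"),
  polarity = c(0.8, -0.9, 0, 0.7),
  stringsAsFactors = FALSE
)

# Strictly positive and strictly negative comments;
# the neutral comment (polarity == 0) is in neither subset
toy_pos <- subset(toy_reviews$comments, toy_reviews$polarity > 0)
toy_neg <- subset(toy_reviews$comments, toy_reviews$polarity < 0)

# One collapsed string per class
toy_pos_terms <- paste(toy_pos, collapse = " ")
toy_neg_terms <- paste(toy_neg, collapse = " ")

length(toy_pos)   # 2
length(toy_neg)   # 1
toy_neg_terms     # "Dirty room, terrible smell"
```

Each collapsed string can then be treated as a single document, for example when comparing the terms used in positive versus negative reviews.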

More organization

Goal: Convert the rental reviews to tidy format (one word per row) to prepare for tidy polarity scoring.

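library(tidytext)
library(dplyr)

# Tokenize: one row per word, taken from the comments column
tidy_reviews <- bos_reviews %>% 
    unnest_tokens(word, comments)

# Number the words within each review to preserve their original order
tidy_reviews <- tidy_reviews %>% 
    group_by(id) %>%
    mutate(original_word_order = seq_along(word))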
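
As a rough illustration of the result, the same two steps can be run on a toy data frame. The data below are invented, and the final ungroup() is added only to make the toy result easier to inspect; the slide code above keeps the grouping by id.

```r
library(tidytext)
library(dplyr)

# Toy stand-in for bos_reviews; ids and comments are made up
toy_reviews <- data.frame(
  id = c(1, 2),
  comments = c("Great location, lovely host", "Dirty room"),
  stringsAsFactors = FALSE
)

toy_tidy <- toy_reviews %>%
  unnest_tokens(word, comments) %>%   # lowercases and strips punctuation by default
  group_by(id) %>%
  mutate(original_word_order = seq_along(word)) %>%
  ungroup()

toy_tidy$word
# "great" "location" "lovely" "host" "dirty" "room"
toy_tidy$original_word_order
# 1 2 3 4 1 2
```

The word counter restarts at 1 for each review because seq_along() is evaluated within each id group.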

Tidy text polarity scoring

Recall that the "bing" lexicon in tidytext categorizes each word as either positive or negative.

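library(tidytext)
library(tidyr)
library(dplyr)

# Load the bing lexicon; get_sentiments() also works in recent tidytext
# releases, where sentiments no longer has a lexicon column to filter on
bing <- get_sentiments("bing")

# tidy_reviews is still grouped by id, so count() tallies per review;
# the counts are then spread into columns and subtracted
pos_neg <- tidy_reviews %>% 
    inner_join(bing, by = "word") %>%
    count(sentiment) %>%
    pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
    mutate(polarity = positive - negative)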
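
The arithmetic is easier to follow on a small case. This sketch uses a hand-made lexicon (toy_lexicon) in place of bing, plus invented review words, so every name and value in it is an assumption for illustration. Because the toy data are not grouped, id is passed to count() explicitly.

```r
library(dplyr)
library(tidyr)

# Toy tidy reviews: one row per word (made-up data)
toy_tidy <- data.frame(
  id = c(1, 1, 1, 2, 2, 3),
  word = c("great", "lovely", "noisy", "dirty", "terrible", "room"),
  stringsAsFactors = FALSE
)

# Hypothetical mini lexicon standing in for bing
toy_lexicon <- data.frame(
  word = c("great", "lovely", "noisy", "dirty", "terrible"),
  sentiment = c("positive", "positive", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

toy_polarity <- toy_tidy %>%
  inner_join(toy_lexicon, by = "word") %>%   # drops words not in the lexicon
  count(id, sentiment) %>%                   # tally per review and sentiment
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(polarity = positive - negative)

toy_polarity$polarity
# 1 -2
```

Review 1 has two positive words and one negative word, so its polarity is 1. Review 2 has no positive words, so values_fill = 0 supplies the missing count and its polarity is -2. Review 3 contains no lexicon words and drops out at the inner join.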

Let's practice!
