Counting categorical data

Introduction to Text Analysis in R

Maham Faisal Khan

Senior Data Science Content Developer

Column types

review_data
# A tibble: 1,833 x 4
   date     product               stars review  
   <chr>    <chr>                 <dbl> <chr>
 1 2/28/15  iRobot Roomba 650 fo…     5 You would not believe how well...
 2 1/12/15  iRobot Roomba 650 fo…     4 You just walk away and it does...
 3 12/26/13 iRobot Roomba 650 fo…     5 You have to Roomba proof your...
 4 8/4/13   iRobot Roomba 650 fo…     3 Yes, its a fascinating, albeit...
 5 12/22/15 iRobot Roomba 650 fo…     5 Years ago I bought one of the...
# … with 1,828 more rows

Summarizing with n()

review_data %>% 
  summarize(number_rows = n())
# A tibble: 1 x 1
  number_rows
       <int>
1       1833
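
As a quick sanity check on what n() returns inside summarize(), the sketch below compares it with base R's nrow(). The toy_reviews tibble is a hypothetical stand-in for review_data, not the actual review data.

```r
library(dplyr)

# Hypothetical toy data standing in for review_data (not the real reviews)
toy_reviews <- tibble(
  product = c("A", "A", "B", "A", "B"),
  stars   = c(5, 4, 3, 5, 2)
)

# n() counts the rows in the current group; with no grouping,
# that is every row in the tibble
row_summary <- toy_reviews %>%
  summarize(number_rows = n())

# The single value should match base R's nrow()
row_summary$number_rows == nrow(toy_reviews)
```

Without group_by(), the whole tibble is treated as one group, which is why the result has a single row.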

Summarizing with n()

review_data %>% 
  group_by(product) %>% 
  summarize(number_rows = n())
# A tibble: 2 x 2
  product                                  number_rows
  <chr>                                         <int>
1 iRobot Roomba 650 for Pets                      633
2 iRobot Roomba 880 for Pets and Allergies       1200

Summarizing with count()

review_data %>% 
  count(product)
# A tibble: 2 x 2
  product                                      n
  <chr>                                    <int>
1 iRobot Roomba 650 for Pets                 633
2 iRobot Roomba 880 for Pets and Allergies  1200
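
The count() output above matches the earlier group_by() plus summarize() result, apart from the column name. A minimal sketch of that equivalence, assuming a small hypothetical tibble named toy_reviews in place of review_data:

```r
library(dplyr)

# Hypothetical toy data standing in for review_data (not the real reviews)
toy_reviews <- tibble(
  product = c("A", "A", "B", "A", "B"),
  stars   = c(5, 4, 3, 5, 2)
)

# Long form: group by product, then count rows per group with n()
by_summarize <- toy_reviews %>%
  group_by(product) %>%
  summarize(n = n())

# Short form: count() groups and counts in one step,
# naming the count column n by default
by_count <- toy_reviews %>%
  count(product)

# Both approaches should give the same counts per product
all.equal(by_summarize$n, by_count$n)
```

count() is mostly a convenience here: it saves a step and gives the count column a predictable name.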

Summarizing with count()

review_data %>% 
  count(product) %>% 
  arrange(desc(n))
# A tibble: 2 x 2
  product                                      n
  <chr>                                    <int>
1 iRobot Roomba 880 for Pets and Allergies  1200
2 iRobot Roomba 650 for Pets                 633
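
The count() then arrange(desc(n)) pattern can also be written with the sort argument of count(). The sketch below compares the two on a hypothetical toy_reviews tibble (an assumption for illustration, not the course data).

```r
library(dplyr)

# Hypothetical toy data standing in for review_data (not the real reviews)
toy_reviews <- tibble(product = c("A", "B", "B", "A", "B"))

# Two-step version, as on the slide: count, then sort descending by n
sorted_long <- toy_reviews %>%
  count(product) %>%
  arrange(desc(n))

# One-step version: sort = TRUE puts the largest counts first
sorted_short <- toy_reviews %>%
  count(product, sort = TRUE)

# Both versions should list the products in the same order
identical(sorted_long$product, sorted_short$product)
```

Either form works; the explicit arrange() is easier to adapt if a different sort order is needed later.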

Let's practice!
