Components of Twitter data

Analyzing Social Media Data in R

Sowmya Vivek

Data Science Coach

Lesson Overview

  • Introduction to Twitter JSON
  • Extract components of metadata from the JSON
  • Use components to derive insights

Twitter JSON

  • A tweet can have over 150 metadata components
  • Tweets and their components are returned as JavaScript Object Notation (JSON)

JSON attributes and values

  • Attributes and values describe tweets and their components
  • Example: screen_name stores the Twitter handle of a user

Twitter JSON attributes and values

Converting JSON to a dataframe

  • The rtweet package converts the Twitter JSON to a data frame
  • Attribute names become column names; attribute values become column values

JSON attributes converted to dataframe columns
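
The attribute-to-column mapping can be tried without calling the Twitter API. The following is a minimal sketch, assuming the jsonlite package is installed; the JSON string and its values are invented for illustration, and it makes no claim about how rtweet performs the conversion internally.

```r
# Assumption: jsonlite is installed; it is used here only for illustration
library(jsonlite)

# Hypothetical JSON for two tweets; attribute names mirror those on the slides,
# values are made up
mock_json <- '[
  {"screen_name": "user_a", "followers_count": 120, "retweet_count": 3},
  {"screen_name": "user_b", "followers_count": 45,  "retweet_count": 0}
]'

# A JSON array of objects is simplified to a data frame:
# each attribute becomes a column and each object becomes a row
mock_df <- fromJSON(mock_json)

# Column names match the JSON attribute names
names(mock_df)
mock_df$screen_name
```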

Viewing components of tweets

# Load the rtweet package
library(rtweet)
# Extract tweets on "#brexit" using search_tweets()
tweets_df <- search_tweets("#brexit")
# View the column names
names(tweets_df)

Viewing components of tweets

Components of a tweet

Exploring components

  • screen_name to understand user interest
  • followers_count to compare social media influence
  • retweet_count and text to identify popular tweets

User interest and tweet counts

  • screen_name refers to the Twitter handle
  • The number of tweets a user posts on a topic indicates interest in that topic
  • Promote products to interested users

User interest and tweet counts

# Extract tweets on "#Arsenal" using search_tweets()
twts_arsnl <- search_tweets("#Arsenal", n = 18000)
# Create a table of users and tweet counts for the topic
sc_name <- table(twts_arsnl$screen_name)
head(sc_name)
_____today_____   ___JJ23   ___SAbI__   __ambell   __Amzo__   __bobbysingh
              1         2           3          1          1              1

User interest and tweet counts

# Sort the table in descending order of tweet counts
sc_name_sort <- sort(sc_name, decreasing = TRUE)
# View top 6 users and tweet frequencies
head(sc_name_sort)
_whatthesport   footy90com   Official_ATG1   TheShortFuse   RubellM   ArsenalZone_Ind
          176           90              88             53        48                43
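
To shortlist interested users for promotion, the sorted table can be turned into a data frame and filtered. This is a small sketch on made-up screen names standing in for twts_arsnl$screen_name; the threshold of 2 tweets and the names demo_names, top_users, and active_users are illustrative assumptions, not part of the course code.

```r
# Made-up screen names standing in for twts_arsnl$screen_name
demo_names <- c("fan_a", "fan_b", "fan_a", "fan_c", "fan_a", "fan_b")

# Count tweets per user and sort in descending order
demo_counts <- sort(table(demo_names), decreasing = TRUE)

# Convert the table to a data frame for easier filtering
top_users <- as.data.frame(demo_counts, stringsAsFactors = FALSE)
names(top_users) <- c("screen_name", "n_tweets")

# Keep users with at least 2 tweets on the topic (arbitrary threshold)
active_users <- top_users[top_users$n_tweets >= 2, ]
active_users
```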

Follower count

  • Number of followers subscribed to a Twitter account
  • Indicates popularity of the account
  • A measure of influence in social media
  • Position ads on popular accounts for increased visibility

Compare follower count

# Extract user data using lookup_users(); the handles are passed as one character vector
tvseries <- lookup_users(c("GameOfThrones", "fleabag", "BreakingBad"))
# Create a data frame with the columns screen_name and followers_count
user_df <- tvseries[, c("screen_name", "followers_count")]

Compare follower count

# View the followers count for comparison
user_df
screen_name        followers_count
<chr>                   <int>
GameOfThrones          8597188            
fleabag                  58727            
BreakingBad            1240349
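
The comparison is easier to read when the accounts are ranked. The sketch below rebuilds the three rows shown above by hand, so it runs without API access, and adds each count as a fraction of the largest account; the names demo_users, ranked, and share_of_top are illustrative assumptions.

```r
# Follower counts copied from the output above
demo_users <- data.frame(
  screen_name = c("GameOfThrones", "fleabag", "BreakingBad"),
  followers_count = c(8597188, 58727, 1240349),
  stringsAsFactors = FALSE
)

# Rank accounts from most to fewest followers
ranked <- demo_users[order(demo_users$followers_count, decreasing = TRUE), ]

# Express each count as a fraction of the largest account
ranked$share_of_top <- round(ranked$followers_count / max(ranked$followers_count), 3)
ranked
```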

Retweet counts and popular tweets

  • A retweet is a tweet re-shared by another user
  • retweet_count stores number of retweets
  • Number of retweets helps identify trends
  • Popular retweets can be used to promote a brand

Retweet counts and popular tweets

# Create a data frame of tweet text and retweet counts
rtwt <- twts_arsnl[, c("text", "retweet_count")]
# Sort the data frame in descending order of retweet counts
library(dplyr)
rtwt_sort <- arrange(rtwt, desc(retweet_count))

Retweet counts and popular tweets

# Exclude rows with duplicate tweet text, keeping the first (most retweeted) row for each
rtwt_unique <- distinct(rtwt_sort, text, .keep_all = TRUE)

Retweet counts and popular tweets

# Print the 6 unique posts retweeted the most times
head(rtwt_unique)
retweet_count   text
<int>           <chr>
5606            Once a Gunner, Always a Gunner. We are proud of you @alexanderiwob
3764            Emirates on Fire Never give up Gunners #Arsenal #CO
2798            That mood tonight 3 POINTS #Arsenal #Gunners #COYG h
2741            #Arsenal fan: "I reckon we'll win the League this season." @Robbie
1687            Auba This is what I call happiness #aubameyang #arsenal
1166            When sky sports introduced the new Monday night football! The Sha
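
The sort-then-deduplicate steps can also be checked on a few rows in base R. This is a sketch on invented tweets, where the repeated text mimics retweets that share the original tweet's text; duplicated() is swapped in here for the deduplication step, and the demo_ names are illustrative assumptions.

```r
# Made-up tweets; repeated text mimics retweets of the same post
demo_rtwt <- data.frame(
  text = c("Great win", "Great win", "Match preview", "Great win"),
  retweet_count = c(10, 250, 40, 250),
  stringsAsFactors = FALSE
)

# Sort by retweet count, highest first
demo_sort <- demo_rtwt[order(demo_rtwt$retweet_count, decreasing = TRUE), ]

# duplicated() flags every repeat after the first occurrence,
# so the highest-count row for each text is the one kept
demo_unique <- demo_sort[!duplicated(demo_sort$text), ]
demo_unique
```

Sorting before deduplicating matters: without it, the row kept for each text would simply be the first one encountered, not the most retweeted.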

Let's practice!
