Extracting Twitter data

Analyzing Social Media Data in R

Sowmya Vivek

Data Science Coach

Lesson Overview

  • API fundamentals
  • Twitter API types
  • Set up the R environment
  • Extract data from Twitter

API explained

  • Application Programming Interface
  • Software intermediary that allows two applications to talk to each other
  • Twitter APIs interact with Twitter and provide access to tweets

API-based subscriptions

Standard API


API-based subscriptions

Premium and Enterprise APIs


Prerequisites to set up R

  • Prerequisites to set up R on your computer
    • A Twitter account
    • Pop-up blocker disabled in the browser
    • Interactive R session
    • rtweet and httpuv packages installed in R
  • All prerequisites have already been set up within the DataCamp interface
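
A minimal sketch of checking the package prerequisite on your own computer. The helper name missing_packages() is hypothetical and not part of rtweet or the course; installation is attempted only in an interactive session.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("rtweet", "httpuv"))

# Install only what is missing, and only when working interactively
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```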

The rtweet and httpuv packages

rtweet package

httpuv package


Setting up the R environment

  • Steps to set up the R environment on your computer
    • Load the rtweet and httpuv libraries
    • Call search_tweets() with a search query to connect to Twitter
    • Authorize access via the browser pop-up
    • "Authentication complete" confirms that Twitter access is authorized
  • The R environment has already been set up within the DataCamp interface
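
The steps above can be sketched as follows. This assumes an rtweet release from the period of these slides, where the first API call opens the browser authorization page; newer rtweet releases handle authentication differently, so treat it as an illustration rather than a fixed recipe. The query "#rstats" and the names can_run and test_tweets are placeholders.

```r
# Run the setup only in an interactive session with both packages installed
can_run <- interactive() &&
  requireNamespace("rtweet", quietly = TRUE) &&
  requireNamespace("httpuv", quietly = TRUE)

if (can_run) {
  # Step 1: load the libraries
  library(rtweet)
  library(httpuv)

  # Steps 2-3: the first search_tweets() call is expected to open the
  # browser so that access can be authorized (assumption: older rtweet)
  test_tweets <- search_tweets("#rstats", n = 10)

  # Step 4: after authorization, the result can be inspected
  head(test_tweets)
}
```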

Extract Twitter data: search_tweets()

  • search_tweets() returns Twitter data matching a search query
  • Only tweets from the past 7 days are returned
  • A maximum of 18,000 tweets is returned per request
# Load the rtweet library
library(rtweet)
# Extract tweets on "#gameofthrones" using search_tweets()
tweets_got <- search_tweets("#gameofthrones", n = 1000, include_rts = TRUE, lang = "en")

Extract Twitter data: search_tweets()

head(tweets_got, 4)
user_id               status_id              created_at             screen_name       text
<chr>                 <chr>                  <S3: POSIXct>          <chr>             <chr>
727816588171350017    1176103860554915841    2019-09-23 11:59:45    LeonardoUzcat1    Today.\n\n#GameofThrones has won Outstanding Drama Series at this year's #Emmys. https://t.co/YHlqvmKxLF    
363838927             1176103859464396806    2019-09-23 11:59:45    mariaaa_carmen    We break the wheel together.\n\n#GameofThrones has been awarded 12 #Emmys, the most of any program this year. https://t.co/gTYq8JWCtD    
881880538461618176    1176103856163434497    2019-09-23 11:59:44    _valkyriez        The #Emmys had their chance with Lena for Seasons 5 &amp; 6 and they blew it. It happens. However, the work Lena Heady brought to the role of Cersei Lannister will never go unnoticed in our hearts &amp; memories. One of the great villains &amp; characters in TV history. #GameOfThrones https://t.co/vpznpzAMsP    
521127287             1176103856075431936    2019-09-23 11:59:44    Nudeus            Congrats to #GameofThrones (60%), the most nominated show in a single season. It wins this year’s #Emmy for Outstanding Drama Series. https://t.co/lSoQE6PjDY https://t.co/IUR6kkI5FL
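
A small sketch of checking the 7-day search window against the created_at column. It rebuilds two rows from the output above by hand so it runs without a Twitter connection. The extraction time and the UTC time zone are assumptions made for illustration, and tweets_demo is a hypothetical name.

```r
# Two rows copied from the output above (time zone assumed to be UTC)
tweets_demo <- data.frame(
  screen_name = c("LeonardoUzcat1", "_valkyriez"),
  created_at = as.POSIXct(
    c("2019-09-23 11:59:45", "2019-09-23 11:59:44"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical extraction time, chosen shortly after the newest tweet
extracted_at <- as.POSIXct("2019-09-23 12:00:00", tz = "UTC")

# Age of each tweet in days at extraction time
age_days <- as.numeric(
  difftime(extracted_at, tweets_demo$created_at, units = "days")
)

# TRUE for tweets that fall inside the 7-day search window
in_window <- age_days >= 0 & age_days <= 7
```

With a real result such as tweets_got, the same check can be applied to tweets_got$created_at using the actual extraction time.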

Extract Twitter data: get_timeline()

  • get_timeline() extracts tweets posted by a specific Twitter user
  • Returns up to 3,200 tweets
# Extract tweets of Katy Perry using get_timeline()
gt_katy <- get_timeline("@katyperry", n = 3200)
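
A minimal sketch of respecting the 3,200-tweet limit when the requested count comes from elsewhere. The helper timeline_request_size() is hypothetical and not part of rtweet. The network call is guarded so it runs only in an interactive session where rtweet is installed and authorized.

```r
# Hypothetical helper: cap a requested tweet count at the 3,200 timeline limit
timeline_request_size <- function(n, limit = 3200) {
  min(n, limit)
}

n_tweets <- timeline_request_size(5000)

# Network call: run only interactively, with rtweet installed and authorized
if (interactive() && requireNamespace("rtweet", quietly = TRUE)) {
  gt_katy <- rtweet::get_timeline("@katyperry", n = n_tweets)
  nrow(gt_katy)
}
```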

Extract Twitter data: get_timeline()

# View the output
head(gt_katy)
user_id     status_id              created_at              screen_name  text
<chr>       <chr>                  <S3: POSIXct>           <chr>        <chr>
21447363    1175132444103565312    2019-09-20 19:39:42     katyperry    My baby angel @cynthialovely is a MOOD. #MoodSwing EP is out now! ????Angel’s???? my favorite - which one is yours? https://t.co/VDgri3RYQv https://t.co/XhgCHTJG2o        
21447363    1175033932355649536    2019-09-20 13:08:15     katyperry    CHICAGO! I’m going to make it a Cozy Little Christmas with you (and @Camila_Cabello, @marshmellomusic, @Normani, @OfficialMonstaX, @liltecca, @ajmitchell, and @NCTsmtown_127) when I see you at the @B96Chicago #JingleBash on December 7! Get your ???? here: https://t.co/gBEaNYiZyR https://t.co/Qc5FjR28ti    
21447363    1174461907656273920    2019-09-18 23:15:13     katyperry    I still dress like a child to offset adulting ???????????? https://t.co/dvzJBTL5G6        
21447363    1174428616735756288    2019-09-18 21:02:56     katyperry    watch me perform ????Small Talk???? on theellenshow today for clear skin ? @ The Ellen Show https://t.co/WpLUl33YiA        
21447363    1174381476227338240    2019-09-18 17:55:37     katyperry    ???? #SmallTalk ???? with my friend @TheEllenShow TODAY. Ch    
21447363    1174061536580497409    2019-09-17 20:44:17     katyperry    Make a ???? connection with @katyperrycollections this #Shoe

Let's practice!
