How many words do YOU know? Zipf's law & subjectivity lexicon

Sentiment Analysis in R

Ted Kwartler

Data Dude

Subjectivity lexicon

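library(qdap)      # provides polarity()
library(magrittr)  # provides the %$% exposition pipe

# %$% exposes the columns of text_df, so polarity() can refer to text directly
text_df %$% polarity(text)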

Returns a "polarity" object with positive and negative scores.

A subjectivity lexicon is a predefined list of words associated with emotional context such as positive/negative, or specific emotions like "frustration" or "joy."
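
To make the idea of a lexicon concrete, here is a minimal base R sketch of a lexicon lookup. The words, scores, and the names toy_lexicon and score_text are hypothetical and chosen only for illustration; this is not how qdap's polarity() computes its scores, just the simplest form of the underlying idea.

```r
# Toy subjectivity lexicon: a named numeric vector (hypothetical words and scores)
toy_lexicon <- c(good = 1, great = 1, joy = 1,
                 bad = -1, awful = -1, frustration = -1)

# Score one piece of text by summing the lexicon values of the words it contains
score_text <- function(text, lexicon) {
  # Lowercase, then split on anything that is not a letter or an apostrophe
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  # Keep only the words that appear in the lexicon
  matched <- words[words %in% names(lexicon)]
  sum(lexicon[matched])
}

score_text("What a great day, full of joy!", toy_lexicon)          # 2
score_text("Awful service and endless frustration.", toy_lexicon)  # -2
score_text("The meeting is at noon.", toy_lexicon)                 # 0
```

Words that are missing from the lexicon contribute nothing to the score, which is why the size of the lexicon matters.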

Where to get subjectivity lexicons?

  • qdap's polarity() function uses a lexicon from hash_sentiment_huliu

  • tidytext has a sentiments tibble with

    • NRC - Words according to 8 emotions like "anger" or "joy" and Pos/Neg
    • Bing - Words labeled positive or negative
    • AFINN - Words scored from -5 to 5

library(lexicon)

Name                    Description
dodds_sentiment         Mechanical Turk Sentiment Words
hash_emoticons          Translations of basic punctuation emoticons :)
hash_sentiment_huliu    U of IL @CHI Polarity (+/-) word research
hash_sentiment_jockers  A lexicon inherited from library(syuzhet)
hash_sentiment_nrc      5468 words crowdsourced scoring between -1 & 1

No way! Too few words.

[Image: thinking]

  • Zipf's Law
  • Principle of Least Effort

Zipf's Law in action

Rank  City          2010 Census Population  Actual %  Zipf's Expected %
1     New York      8,175,133               100%      ...
2     LA            3,792,621               46%       50%
3     Chicago       2,695,598               33%       33%
4     Houston       2,100,263               26%       25%
5     Philadelphia  1,526,006               19%       20%
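
The percentages in this table can be reproduced with a few lines of base R. This sketch assumes the common statement of Zipf's law, in which the item at rank r is expected to be about 1/r the size of the top-ranked item; the data frame and column names are made up for illustration, and the populations are the ones listed on the slide.

```r
# Populations as given in the table (2010 Census figures from the slide)
cities <- data.frame(
  rank       = 1:5,
  city       = c("New York", "LA", "Chicago", "Houston", "Philadelphia"),
  population = c(8175133, 3792621, 2695598, 2100263, 1526006)
)

# Actual share: each population relative to the top-ranked city
cities$actual_pct <- round(100 * cities$population / cities$population[1])

# Zipf's expectation (assumed form): rank r is about 1/r the size of rank 1
cities$zipf_pct <- round(100 / cities$rank)

print(cities)
```

For ranks 2 to 5 the two columns stay within a few percentage points of each other, which is the pattern the slide points out.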

Principle of Least Effort

If there are several ways of achieving the same goal, people will choose the least demanding course of action.

[Image: lazy cat]

Up next...

[Images: Twitter logo, football]

Let's practice!
