Named Entity Recognition

Introduction to Natural Language Processing in Python

Katharine Jarmul

Founder, kjamistan

What is Named Entity Recognition?

  • NLP task to identify important named entities in the text
    • People, places, organizations
    • Dates, states, works of art
    • ... and other categories!
  • Can be used alongside topic identification
    • ... or on its own!
  • Who? What? When? Where?

Example of NER

Figure: NER on a Wikipedia article

(Source: Europeana Newspapers (http://www.europeana-newspapers.eu))

nltk and the Stanford CoreNLP Library

  • The Stanford CoreNLP library:
    • Accessible from Python via nltk
    • Java based
    • Supports NER as well as coreference resolution and dependency trees

Using nltk for Named Entity Recognition

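import nltk

# Requires the NLTK data packages for tokenization, POS tagging, and
# NE chunking (fetched with nltk.download(); names vary by NLTK version).
sentence = '''In New York, I like to ride the Metro to
              visit MOMA and some restaurants rated
              well by Ruth Reichl.'''

# Split the sentence into word tokens
tokenized_sent = nltk.word_tokenize(sentence)

# Tag each token with its part of speech
tagged_sent = nltk.pos_tag(tokenized_sent)
tagged_sent[:3]

[('In', 'IN'), ('New', 'NNP'), ('York', 'NNP')]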
# Group the tagged tokens into named-entity chunks
print(nltk.ne_chunk(tagged_sent))
(S
  In/IN
  (GPE New/NNP York/NNP) 
  ,/,
  I/PRP
  like/VBP
  to/TO
  ride/VB
  the/DT
  (ORGANIZATION Metro/NNP)
  to/TO
  visit/VB
  (ORGANIZATION MOMA/NNP)
  and/CC
  some/DT
  restaurants/NNS
  rated/VBN
  well/RB
  by/IN
  (PERSON Ruth/NNP Reichl/NNP) 
  ./.)
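
The chunked result is an nltk Tree, so the entities can be collected by walking its top-level children. The following is a minimal sketch, assuming the tree has the shape printed above. The helper name extract_entities is hypothetical, and the tree is rebuilt by hand here so the example runs without the tagger or chunker models.

```python
from nltk.tree import Tree

# Hand-built copy of the chunk tree printed above (an assumption made so the
# sketch runs without downloading any NLTK models).
chunked = Tree('S', [
    ('In', 'IN'),
    Tree('GPE', [('New', 'NNP'), ('York', 'NNP')]),
    (',', ','), ('I', 'PRP'), ('like', 'VBP'), ('to', 'TO'),
    ('ride', 'VB'), ('the', 'DT'),
    Tree('ORGANIZATION', [('Metro', 'NNP')]),
    ('to', 'TO'), ('visit', 'VB'),
    Tree('ORGANIZATION', [('MOMA', 'NNP')]),
    ('and', 'CC'), ('some', 'DT'), ('restaurants', 'NNS'),
    ('rated', 'VBN'), ('well', 'RB'), ('by', 'IN'),
    Tree('PERSON', [('Ruth', 'NNP'), ('Reichl', 'NNP')]),
    ('.', '.'),
])


def extract_entities(tree):
    """Return (entity text, label) pairs for each labelled chunk.

    Hypothetical helper for illustration; not part of nltk.
    """
    entities = []
    for node in tree:
        # Named-entity chunks are subtrees; ordinary tokens are (word, tag) tuples
        if isinstance(node, Tree):
            text = ' '.join(word for word, _tag in node.leaves())
            entities.append((text, node.label()))
    return entities


entities = extract_entities(chunked)
for text, label in entities:
    print(label, text)
```

With the models installed, passing nltk.ne_chunk(tagged_sent) to the same helper should behave the same way, since ne_chunk returns a Tree of this form.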

Let's practice!
