Entity extraction

Building Chatbots in Python

Alan Nichol

Co-founder and CTO, Rasa

Beyond keywords: context

  • Keywords don't work for entities you haven't seen before
  • Use contextual clues:
    • Spelling
    • Capitalization
    • Words occurring before & after
  • Pattern recognition
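
The contextual clues listed above can be sketched with the standard library alone. This is only a rough illustration, not how a trained recognizer works: it assumes that a capitalized word appearing after the first word of a message is an entity candidate, and it records the preceding word as context. The names `CAPITALIZED` and `capitalized_candidates` are hypothetical.

```python
import re

# Assumption: a capital letter followed by lowercase letters marks a candidate.
CAPITALIZED = re.compile(r"[A-Z][a-z]+")

def capitalized_candidates(message):
    """Return (previous_word, candidate) pairs for capitalized words."""
    words = [w.strip(".,!?") for w in message.split()]
    candidates = []
    # Skip the first word: a sentence-initial capital carries no signal.
    for i in range(1, len(words)):
        if CAPITALIZED.fullmatch(words[i]):
            candidates.append((words[i - 1], words[i]))
    return candidates

print(capitalized_candidates("my friend Mary has worked at Google since 2009"))
```

A heuristic like this misses lowercase entities and dates such as 2009, which is one reason to prefer a trained model.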

Pre-built Named Entity Recognition

import spacy

# 'en' is a spaCy v2 shortcut; on spaCy v3+ load a named model such as 'en_core_web_sm'
nlp = spacy.load('en')
doc = nlp("my friend Mary has worked at Google since 2009")
for ent in doc.ents:
    print(ent.text, ent.label_)
Mary PERSON
Google ORG
2009 DATE

Roles

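import re

# "... from X to Y": group 1 is the origin, group 2 the destination
pattern_1 = re.compile('.* from (.*) to (.*)')

# "... to Y from X": group 1 is the destination, group 2 the origin
pattern_2 = re.compile('.* to (.*) from (.*)')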
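
A minimal sketch of how the two patterns could be used to assign roles. The helper `extract_roles` is hypothetical and not part of the slides. It tries each pattern in turn and reorders the captured groups so that the origin always comes first.

```python
import re

pattern_1 = re.compile('.* from (.*) to (.*)')
pattern_2 = re.compile('.* to (.*) from (.*)')

def extract_roles(message):
    """Return (origin, destination), or (None, None) if nothing matches."""
    match = pattern_1.match(message)
    if match:
        origin, destination = match.groups()
        return origin, destination
    match = pattern_2.match(message)
    if match:
        # In this word order the destination is captured first.
        destination, origin = match.groups()
        return origin, destination
    return None, None

print(extract_roles("a flight from Singapore to Shanghai"))
print(extract_roles("a flight to Shanghai from Singapore"))
```

Both patterns expect a space before the first keyword, so a message that begins directly with "from" or "to" would not match.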

doc = nlp('a flight to Shanghai from Singapore')

shanghai, singapore = doc[3], doc[5]
list(shanghai.ancestors)
[to, flight]
list(singapore.ancestors)
[from, flight]
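
To make the ancestor lookup concrete without a parser, here is a toy sketch that walks a hand-written head table for the same sentence. The `heads` dictionary and the `ancestors` and `role` helpers are hypothetical stand-ins. The links for Shanghai and Singapore follow the lists shown above, while the link for "a" is an assumption.

```python
# Hand-written head table for 'a flight to Shanghai from Singapore'.
# Each word maps to its syntactic parent; the root maps to None.
heads = {
    "a": "flight",       # assumed attachment, not shown in the slides
    "flight": None,
    "to": "flight",
    "Shanghai": "to",
    "from": "flight",
    "Singapore": "from",
}

def ancestors(word):
    """Walk from a word up to the root, collecting each parent."""
    chain = []
    parent = heads[word]
    while parent is not None:
        chain.append(parent)
        parent = heads[parent]
    return chain

def role(word):
    """Assign a role from the preposition found among the ancestors."""
    path = ancestors(word)
    if "to" in path:
        return "destination"
    if "from" in path:
        return "origin"
    return None

print(ancestors("Shanghai"), role("Shanghai"))
print(ancestors("Singapore"), role("Singapore"))
```

Unlike the regular expressions, this approach does not depend on the order in which the two cities appear.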

Shopping example

doc = nlp("let's see that jacket in red and some blue jeans")

items = [doc[4], doc[10]]  # [jacket, jeans]
colors = [doc[6], doc[9]]  # [red, blue]
for color in colors:
    for tok in color.ancestors:
        if tok in items:
            print("color {} belongs to item {}".format(color, tok))
            break
color red belongs to item jacket
color blue belongs to item jeans

Let's practice!
