Statistical Models

Advanced NLP with spaCy

Ines Montani

spaCy core developer

What are statistical models?

  • Enable spaCy to predict linguistic attributes in context
    • Part-of-speech tags
    • Syntactic dependencies
    • Named entities
  • Trained on labeled example texts
  • Can be updated with more examples to fine-tune predictions
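To make "labeled example texts" concrete, here is a minimal sketch of what one labeled example for named entities might look like: the raw text plus a character span and a label for each entity. The text and labels are taken from the named entity slide later in this deck. The `(start, end, label)` tuples under an `"entities"` key mirror the shape commonly used for spaCy entity annotations, but the training API itself depends on the spaCy version and is not shown here. The helper `annotate` is hypothetical and only computes the offsets.

```python
TEXT = "Apple is looking at buying U.K. startup for $1 billion"


def annotate(text, phrase, label):
    """Return (start_char, end_char, label) for the first match of phrase.

    Hypothetical helper for illustration; not part of spaCy.
    """
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return (start, start + len(phrase), label)


# One labeled example: the text and the entity spans a model should predict.
example = (
    TEXT,
    {
        "entities": [
            annotate(TEXT, "Apple", "ORG"),
            annotate(TEXT, "U.K.", "GPE"),
            annotate(TEXT, "$1 billion", "MONEY"),
        ]
    },
)

# Slicing the text with each span gives back the annotated phrase.
for start, end, label in example[1]["entities"]:
    print(TEXT[start:end], label)
```

Storing character offsets instead of the phrases themselves keeps each annotation tied to one exact position in the text, which matters when the same word occurs more than once.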

Model Packages

A package with the label en_core_web_sm

import spacy

nlp = spacy.load('en_core_web_sm')
  • Binary weights
  • Vocabulary
  • Meta information (language, pipeline)

Predicting Part-of-speech Tags

import spacy

# Load the small English model
nlp = spacy.load('en_core_web_sm')

# Process a text
doc = nlp("She ate the pizza")

# Iterate over the tokens
for token in doc:
    # Print the text and the predicted part-of-speech tag
    print(token.text, token.pos_)
She PRON
ate VERB
the DET
pizza NOUN

Predicting Syntactic Dependencies

for token in doc:
    print(token.text, token.pos_, token.dep_, token.head.text)
She PRON nsubj ate
ate VERB ROOT ate
the DET det pizza
pizza NOUN dobj ate

Visualization of the dependency graph for "She ate the pizza"

Label   Description            Example
nsubj   nominal subject        She
dobj    direct object          pizza
det     determiner (article)   the
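The four columns printed for "She ate the pizza" are enough to recover the sentence structure without the picture: every token points to its head, and the root points to itself. A minimal sketch in plain Python follows, with the rows from that output hard-coded as data so it runs without spaCy or a model. The names `Row`, `root` and `children` are illustrative and not part of spaCy. Heads are matched by text here, which only works because no word repeats in this sentence.

```python
from typing import NamedTuple


class Row(NamedTuple):
    text: str
    pos: str
    dep: str
    head: str


# Rows copied from the printed output: token.text, token.pos_, token.dep_, token.head.text
ROWS = [
    Row("She", "PRON", "nsubj", "ate"),
    Row("ate", "VERB", "ROOT", "ate"),
    Row("the", "DET", "det", "pizza"),
    Row("pizza", "NOUN", "dobj", "ate"),
]


def root(rows):
    """Return the row labelled ROOT (the token that is its own head)."""
    return next(r for r in rows if r.dep == "ROOT")


def children(rows, head_text, dep=None):
    """Texts of tokens attached to head_text, optionally filtered by label."""
    return [
        r.text
        for r in rows
        if r.head == head_text
        and r.dep != "ROOT"  # skip the root's self-reference
        and (dep is None or r.dep == dep)
    ]


verb = root(ROWS).text
subjects = children(ROWS, verb, "nsubj")
objects = children(ROWS, verb, "dobj")
print(verb, subjects, objects)
```

With a real `doc`, the same questions would be asked of the token objects themselves (for example by comparing `token.head` rather than head text), which avoids the unique-word assumption made here.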

Predicting Named Entities

Visualization of the named entities in "Apple is looking at buying U.K. startup for $1 billion"

# Process a text
doc = nlp(u"Apple is looking at buying U.K. startup for $1 billion")

# Iterate over the predicted entities
for ent in doc.ents:
    # Print the entity text and its label
    print(ent.text, ent.label_)
Apple ORG
U.K. GPE
$1 billion MONEY

Tip: the explain method

Get quick definitions of the most common tags and labels.

spacy.explain('GPE')
'Countries, cities, states'
spacy.explain('NNP')
'noun, proper singular'
spacy.explain('dobj')
'direct object'
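When looking up several labels at once, it can help to wrap `spacy.explain` in a small helper. The sketch below assumes spaCy is installed (no model package is needed for this call) and relies on `spacy.explain` returning `None` for terms it has no entry for. The helper `describe`, the fallback string and the label `NOT_A_LABEL` are made up for illustration.

```python
import spacy


def describe(labels, fallback="(no description available)"):
    """Map each label to its spacy.explain description, or to a fallback string."""
    return {label: spacy.explain(label) or fallback for label in labels}


# "NOT_A_LABEL" is a made-up label that shows the fallback path.
descriptions = describe(["GPE", "NNP", "dobj", "NOT_A_LABEL"])
for label, text in descriptions.items():
    print(label, "->", text)
```

Depending on the spaCy version, looking up an unknown term may also emit a warning; the return value is still `None`, so the fallback applies.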

Let's practice!
