Processing pipelines

Advanced NLP with spaCy

Ines Montani

spaCy core developer

What happens when you call nlp?

Illustration of the spaCy pipeline transforming a text into a processed Doc

doc = nlp("This is a sentence.")
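A rough way to picture this call: the text is tokenized into a Doc first, then each pipeline component is applied to that Doc in order. The sketch below imitates the flow in plain Python. ToyDoc, ToyLanguage, toy_tokenizer, toy_lengths and toy_counter are hypothetical stand-ins invented for illustration, not spaCy classes or functions.

```python
# Toy imitation of the tokenizer-then-components flow; none of these
# names belong to spaCy, they are hypothetical stand-ins.
class ToyDoc:
    def __init__(self, words):
        self.words = words
        self.annotations = {}


def toy_tokenizer(text):
    # Hypothetical tokenizer: splits on whitespace only.
    return ToyDoc(text.split())


def toy_lengths(doc):
    # Hypothetical component: records the length of each word.
    doc.annotations["lengths"] = [len(word) for word in doc.words]
    return doc


def toy_counter(doc):
    # Hypothetical component: records how many words the doc has.
    doc.annotations["n_words"] = len(doc.words)
    return doc


class ToyLanguage:
    def __init__(self, tokenizer, pipeline):
        self.tokenizer = tokenizer
        self.pipeline = pipeline  # list of (name, component) tuples

    def __call__(self, text):
        # Step 1: turn the text into a doc object.
        doc = self.tokenizer(text)
        # Step 2: pass the doc through every component, in order.
        for name, component in self.pipeline:
            doc = component(doc)
        return doc


toy_nlp = ToyLanguage(
    toy_tokenizer,
    [("toy_lengths", toy_lengths), ("toy_counter", toy_counter)],
)
doc = toy_nlp("This is a sentence.")
print(doc.words)
print(doc.annotations)
```

Each component receives the doc, adds its annotations, and returns the doc, which is why the order of the pipeline matters when one component relies on another's output.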

Built-in pipeline components

Name     Description              Creates
tagger   Part-of-speech tagger    Token.tag
parser   Dependency parser        Token.dep, Token.head, Doc.sents, Doc.noun_chunks
ner      Named entity recognizer  Doc.ents, Token.ent_iob, Token.ent_type
textcat  Text classifier          Doc.cats

Under the hood

Illustration of a package labelled en_core_web_sm, folders and file and the meta.json

  • Pipeline is defined in the model's meta.json, with components listed in the order they run
  • Built-in components need the model's binary data to make predictions
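To make the first point concrete, here is a minimal sketch that writes a small meta.json to a temporary folder and reads the component order back with the standard json module. The field values are illustrative assumptions, a real model package carries more fields, and the sketch does not load any binary data.

```python
import json
import tempfile
from pathlib import Path

# Illustrative meta.json contents (an assumption for this sketch);
# a real model package contains additional fields.
example_meta = {
    "lang": "en",
    "name": "core_web_sm",
    "pipeline": ["tagger", "parser", "ner"],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    meta_path = Path(tmp_dir) / "meta.json"
    meta_path.write_text(json.dumps(example_meta), encoding="utf8")
    # Read the file back, as if inspecting an installed model package.
    meta = json.loads(meta_path.read_text(encoding="utf8"))

# The list order is the order in which the components would run.
pipeline_names = meta["pipeline"]
model_name = f"{meta['lang']}_{meta['name']}"
print(model_name, pipeline_names)
```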

Pipeline attributes

  • nlp.pipe_names: list of pipeline component names
print(nlp.pipe_names)
['tagger', 'parser', 'ner']
  • nlp.pipeline: list of (name, component) tuples
print(nlp.pipeline)
[('tagger', <spacy.pipeline.Tagger>),
 ('parser', <spacy.pipeline.DependencyParser>),
 ('ner', <spacy.pipeline.EntityRecognizer>)]
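The two attributes carry the same information in different shapes: the names are simply the first element of each (name, component) tuple. The sketch below shows that relationship with a plain list of tuples. The three functions are hypothetical placeholders standing in for the real component objects.

```python
# Hypothetical placeholder components; each takes a doc and returns it.
def tagger(doc):
    return doc


def parser(doc):
    return doc


def ner(doc):
    return doc


# Same shape as nlp.pipeline: a list of (name, component) tuples.
pipeline = [("tagger", tagger), ("parser", parser), ("ner", ner)]

# Same shape as nlp.pipe_names: only the names, in pipeline order.
pipe_names = [name for name, _ in pipeline]

# A dict makes it easy to fetch a single component by its name.
components_by_name = dict(pipeline)

print(pipe_names)
print(components_by_name["parser"].__name__)
```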

Let's practice!
