Training and updating models

Advanced NLP with spaCy

Ines Montani

spaCy core developer

Why update the model?

Better results on your specific domain
Learn classification schemes specifically for your problem
Essential for text classification
Very useful for named entity recognition
Less critical for part-of-speech tagging and dependency parsing

How training works (1)

1. Initialize the model weights randomly with nlp.begin_training
2. Predict a few examples with the current weights by calling nlp.update
3. Compare prediction with true labels
4. Calculate how to change weights to improve predictions
5. Update weights slightly
6. Go back to 2.
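
The loop above can be sketched in plain Python. This is a toy stand-in, not spaCy's implementation: the bag-of-words model, the binary "mentions a gadget" labels, and the names features, predict and train are all assumptions made for illustration. In spaCy itself, the predict, compare and update steps happen inside nlp.update.

```python
import math
import random

# Toy training data: (text, label) pairs; label 1 means "mentions a gadget".
# The data and the bag-of-words model are illustrative assumptions,
# not spaCy's internals.
TRAIN_DATA = [
    ("iPhone X is coming", 1),
    ("I need a new phone! Any tips?", 0),
    ("The iPhone X has a great camera", 1),
    ("Any tips for cooking rice?", 0),
]


def features(text):
    """Lowercased whitespace tokens, used as binary features."""
    return set(text.lower().split())


def predict(weights, text):
    """Return the predicted probability that the label is 1."""
    score = sum(weights.get(tok, 0.0) for tok in features(text))
    return 1.0 / (1.0 + math.exp(-score))


def train(data, n_iter=20, learn_rate=0.5, seed=0):
    rng = random.Random(seed)
    # 1. Initialize the weights randomly (small values around zero).
    vocab = sorted({tok for text, _ in data for tok in features(text)})
    weights = {tok: rng.uniform(-0.01, 0.01) for tok in vocab}
    data = list(data)
    for _ in range(n_iter):
        rng.shuffle(data)
        for text, label in data:
            # 2. Predict with the current weights.
            prob = predict(weights, text)
            # 3. Compare the prediction with the true label.
            error = prob - label
            # 4. For log loss, the gradient for each active feature is `error`.
            # 5. Update the weights slightly, stepping against the gradient.
            for tok in features(text):
                weights[tok] -= learn_rate * error
        # 6. Go back to step 2 for the next pass over the data.
    return weights


weights = train(TRAIN_DATA)
for text, label in TRAIN_DATA:
    print(round(predict(weights, text), 2), label, text)
```

The learning rate controls how "slightly" the weights move on each update; a smaller value makes training slower but steadier.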

How training works (2)

Diagram of the training process

Training data: Examples and their annotations.
Text: The input text the model should predict a label for.
Label: The label the model should predict.
Gradient: How to change the weights.

Example: Training the entity recognizer

The entity recognizer tags words and phrases in context
Each token can only be part of one entity
Examples need to come with context

("iPhone X is coming", {'entities': [(0, 8, 'GADGET')]})

Texts with no entities are also important

("I need a new phone! Any tips?", {'entities': []})

Goal: teach the model to generalize
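
Because the annotations are character offsets, an off-by-one error is easy to make. A minimal sanity check, assuming the (text, {'entities': [...]}) format shown above, might look like the sketch below. check_example is a hypothetical helper, not part of spaCy, and it only checks character offsets; it does not verify that spans line up with spaCy's token boundaries.

```python
TRAIN_DATA = [
    ("iPhone X is coming", {"entities": [(0, 8, "GADGET")]}),
    ("I need a new phone! Any tips?", {"entities": []}),
]


def check_example(text, annotations):
    """Return (span_text, label) pairs, raising ValueError on bad offsets.

    Hypothetical helper for illustration; not a spaCy function.
    """
    spans = []
    last_end = 0
    for start, end, label in sorted(annotations["entities"]):
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"offsets out of range: {(start, end)} in {text!r}")
        if start < last_end:
            # Each token can only be part of one entity,
            # so entity spans must not overlap.
            raise ValueError(f"overlapping entities in {text!r}")
        span_text = text[start:end]
        if span_text != span_text.strip():
            raise ValueError(f"span starts or ends with whitespace: {span_text!r}")
        spans.append((span_text, label))
        last_end = end
    return spans


for text, annotations in TRAIN_DATA:
    print(check_example(text, annotations))
```

Printing the sliced text next to its label makes it easy to spot spans that cut a word in half or include a stray space.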

The training data

Examples of what we want the model to predict in context
Update an existing model: a few hundred to a few thousand examples
Train a new category: a few thousand to a million examples
- spaCy's English models: 2 million words
Usually created manually by human annotators
Can be semi-automated – for example, using spaCy's Matcher!
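
As a rough illustration of semi-automated annotation, the sketch below generates candidate training examples from pattern matches. It uses Python's re module as a stand-in for spaCy's Matcher so that it runs without spaCy installed; the pattern, the GADGET label and the annotate helper are assumptions for illustration. The output should still be reviewed by a human, since a pattern will miss some mentions and may match things that are not entities.

```python
import re

# Hypothetical pattern: "iPhone", optionally followed by a model
# name such as "X" or a number.
GADGET_PATTERN = re.compile(r"\biPhone(?: (?:X|\d+))?\b")


def annotate(texts, pattern=GADGET_PATTERN, label="GADGET"):
    """Build (text, {'entities': [...]}) pairs from regex matches."""
    examples = []
    for text in texts:
        entities = [(m.start(), m.end(), label) for m in pattern.finditer(text)]
        examples.append((text, {"entities": entities}))
    return examples


TEXTS = [
    "iPhone X is coming",
    "How cheap is the iPhone 8 now?",
    "I need a new phone! Any tips?",
]

for example in annotate(TEXTS):
    print(example)
```

Texts without a match are kept with an empty entity list, so the generated data also contains examples with no entities.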

Let's practice!
