The training loop

Advanced NLP with spaCy

Ines Montani

spaCy core developer

The steps of a training loop

  1. Loop for a number of iterations.
  2. Shuffle the training data.
  3. Divide the data into batches.
  4. Update the model for each batch.
  5. Save the updated model.

Recap: How training works

Diagram of the training process

  • Training data: Examples and their annotations.
  • Text: The input text the model should predict a label for.
  • Label: The label the model should predict.
  • Gradient: How to change the weights.
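
To make the gradient bullet concrete, here is a minimal sketch of one weight being updated by gradient descent. It is a toy illustration, not spaCy's internals: the single-weight model, the squared-error loss, and the names predict, gradient, update and learning_rate are all assumptions introduced for this example.

```python
# Toy model with a single weight: prediction = weight * x (assumed for illustration)
def predict(weight, x):
    return weight * x


def gradient(weight, x, target):
    # Derivative of the squared error (prediction - target) ** 2
    # with respect to the weight
    return 2 * (predict(weight, x) - target) * x


def update(weight, x, target, learning_rate=0.1):
    # Move the weight a small step against the gradient
    return weight - learning_rate * gradient(weight, x, target)


weight = 0.0
for _ in range(20):
    # Each update shrinks the gap between prediction and target
    weight = update(weight, x=1.0, target=1.0)

print(weight)
```

After repeated updates the weight moves toward the value that makes the prediction match the label, which is the role the gradient plays in the diagram.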

Example loop

import random
import spacy  # the calls below follow the spaCy v2 API

TRAINING_DATA = [
    ("How to preorder the iPhone X", {'entities': [(20, 28, 'GADGET')]})
    # And many more examples...
]
# Loop for 10 iterations
for i in range(10):
    # Shuffle the training data
    random.shuffle(TRAINING_DATA)
    # Create batches and iterate over them
    for batch in spacy.util.minibatch(TRAINING_DATA):
        # Split the batch in texts and annotations
        texts = [text for text, annotation in batch]
        annotations = [annotation for text, annotation in batch]
        # Update the model
        nlp.update(texts, annotations)

# Save the model
nlp.to_disk(path_to_model)
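
The entity annotations in the training data are character offsets into the text, so a wrong offset silently teaches the model the wrong span. A small sanity check like the sketch below can catch that before training. The helper name entity_spans is hypothetical and not part of spaCy.

```python
TRAINING_DATA = [
    ("How to preorder the iPhone X", {"entities": [(20, 28, "GADGET")]}),
]


def entity_spans(text, annotations):
    """Return (substring, label) pairs for each annotated character span."""
    return [
        (text[start:end], label)
        for start, end, label in annotations["entities"]
    ]


for text, annotations in TRAINING_DATA:
    # Prints [('iPhone X', 'GADGET')]
    print(entity_spans(text, annotations))
```

If the printed substring is not the phrase you meant to label, fix the offsets before running the loop.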

Updating an existing model

  • Improve the predictions on new data
  • Especially useful for improving existing categories, like PERSON
  • Also possible to add new categories
  • Be careful and make sure the model doesn't "forget" the old ones
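
One common way to reduce the risk of "forgetting" is to mix examples of the old categories into the new training data, so every pass over the data still shows the model what it already knew. The sketch below only builds such a mixed set. The example sentences, the helper name mix_examples, and the fixed seed are assumptions made for illustration, and the slides do not prescribe this exact approach.

```python
import random

# Hypothetical examples for a new label and for an existing one
NEW_EXAMPLES = [
    ("How to preorder the iPhone X", {"entities": [(20, 28, "GADGET")]}),
]
OLD_EXAMPLES = [
    ("Ines Montani is a spaCy developer", {"entities": [(0, 12, "PERSON")]}),
]


def mix_examples(new_examples, old_examples, seed=0):
    """Combine new and old examples and shuffle them reproducibly."""
    mixed = list(new_examples) + list(old_examples)
    random.Random(seed).shuffle(mixed)
    return mixed


mixed = mix_examples(NEW_EXAMPLES, OLD_EXAMPLES)
# Collect the labels present in the mixed set
labels = {label for _, ann in mixed for _, _, label in ann["entities"]}
print(sorted(labels))
```

The mixed list can then be used in place of the training data in the loop, so updates for GADGET are interleaved with reminders of PERSON.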

Setting up a new pipeline from scratch

import random
import spacy  # the calls below follow the spaCy v2 API

# Start with blank English model
nlp = spacy.blank('en')
# Create blank entity recognizer and add it to the pipeline
ner = nlp.create_pipe('ner')
nlp.add_pipe(ner)
# Add a new label
ner.add_label('GADGET')
# Start the training
nlp.begin_training()
# Train for 10 iterations
# (examples is a list of (text, annotations) tuples, like TRAINING_DATA)
for itn in range(10):
    random.shuffle(examples)
    # Divide examples into batches
    for batch in spacy.util.minibatch(examples, size=2):
        texts = [text for text, annotation in batch]
        annotations = [annotation for text, annotation in batch]
        # Update the model
        nlp.update(texts, annotations)
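
To see what dividing the examples into batches of size 2 means, here is a plain-Python stand-in for the batching step. It is a sketch of the idea only, not spaCy's own implementation, and the name simple_minibatch is hypothetical.

```python
def simple_minibatch(items, size=2):
    """Yield consecutive lists of at most `size` items."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


examples = ["a", "b", "c", "d", "e"]
batches = list(simple_minibatch(examples, size=2))
# Prints [['a', 'b'], ['c', 'd'], ['e']]
print(batches)
```

The last batch is smaller when the number of examples is not a multiple of the batch size, and the model is updated once per batch.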

Let's practice!
