Training with spaCy

Natural Language Processing with spaCy

Azadeh Mobasher

Principal Data Scientist

Training steps

 

  1. Annotate and prepare input data
  2. Disable other pipeline components
  3. Train a model for a few epochs
  4. Evaluate model performance
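
Step 1 can be sketched in plain Python. spaCy's `Example.from_dict` accepts NER annotations as character offsets, `{"entities": [(start, end, label)]}`. The helper name `annotate` and the sample sentence below are hypothetical, and finding each phrase with `str.find` is a simplifying assumption that only handles the first occurrence of a phrase.

```python
def annotate(text, spans):
    """Build a (text, annotation) pair from (phrase, label) pairs.

    Hypothetical helper: locates the first occurrence of each phrase
    and records its character offsets.
    """
    entities = []
    for phrase, label in spans:
        start = text.find(phrase)
        if start == -1:
            raise ValueError(f"{phrase!r} not found in {text!r}")
        entities.append((start, start + len(phrase), label))
    return text, {"entities": entities}


# Illustrative sample, not taken from the course data
training_data = [
    annotate("I will visit Paris next week.", [("Paris", "GPE")]),
]
```

Checking that `text[start:end]` equals the intended phrase is a cheap way to catch off-by-one offsets before training.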

Disabling other pipeline components

 

  • Disable all pipeline components except NER:

 

other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']

nlp.disable_pipes(*other_pipes)
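
In spaCy v3, `nlp.select_pipes` offers the same behavior as a context manager, so the disabled components are restored automatically when the block ends. A minimal sketch, assuming a blank English pipeline with a `sentencizer` and an `ner` component added only for illustration:

```python
import spacy

# Assumption: a blank pipeline with two components, for illustration only
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
nlp.add_pipe("ner")

other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]

# Inside the block only "ner" is active
with nlp.select_pipes(disable=other_pipes):
    active_during_training = list(nlp.pipe_names)

# After the block the disabled components are active again
active_afterwards = list(nlp.pipe_names)
```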

Model training procedure

  • Go over the training set several times; one iteration is called an epoch.
  • In each epoch, update the model weights by a small amount.
  • Optimizers update the model weights.
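import random
from spacy.training import Example

optimizer = nlp.create_optimizer()
losses = {}
for i in range(epochs):
    # Shuffle so each epoch sees the examples in a different order
    random.shuffle(training_data)
    for text, annotation in training_data:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotation)
        # Update the model weights on this one example
        nlp.update([example], sgd=optimizer, losses=losses)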
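
The loop above assumes an existing `nlp`, `training_data`, and `epochs`. A self-contained sketch of the same procedure follows, under these assumptions: a blank English pipeline instead of a pretrained one, two made-up training sentences, and three epochs. Because the pipeline is blank, the optimizer comes from `nlp.initialize` rather than `nlp.create_optimizer`.

```python
import random

import spacy
from spacy.training import Example

# Assumption: tiny illustrative data set, far too small for a useful model
training_data = [
    ("I will visit Paris next week.", {"entities": [(13, 18, "GPE")]}),
    ("She moved to Berlin last year.", {"entities": [(13, 19, "GPE")]}),
]

nlp = spacy.blank("en")
nlp.add_pipe("ner")

# Convert the raw annotations to Example objects once
examples = [
    Example.from_dict(nlp.make_doc(text), annotation)
    for text, annotation in training_data
]

# initialize() infers the entity labels from the examples
# and returns an optimizer
optimizer = nlp.initialize(lambda: examples)

epochs = 3
for epoch in range(epochs):
    random.shuffle(examples)
    losses = {}  # reset so the value reflects one epoch only
    for example in examples:
        nlp.update([example], sgd=optimizer, losses=losses)
```

Resetting `losses` at the start of each epoch is a design choice: `nlp.update` accumulates into the dictionary, so a single dictionary shared across all epochs reports a running total instead of a per-epoch loss.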

Save and load a trained model

 

  • Save a trained NER model:
ner = nlp.get_pipe("ner")
ner.to_disk("<ner model name>")
  • Load the saved model:
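# spaCy v3: add_pipe takes the factory name and returns the component.
# The component's config must match the one that was saved.
ner = nlp.add_pipe("ner", name="<ner model name>")
ner.from_disk("<ner model name>")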

Model for inference

 

  • Use a saved model at inference.

 

  • Apply NER model and store tuples of (entity text, entity label):
doc = nlp(text)
entities = [(ent.text, ent.label_) for ent in doc.ents]
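
The (entity text, entity label) tuples can also feed step 4, evaluating model performance. A minimal sketch in plain Python, assuming predictions and gold annotations for one text are compared as sets of tuples. The function name `entity_prf` and the sample tuples are hypothetical, and matching on text and label alone ignores entity positions, which is a simplification.

```python
def entity_prf(predicted, gold):
    """Return (precision, recall, F1) for two collections of
    (entity text, entity label) tuples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative values: one entity matches, one has the wrong label
predicted = [("Paris", "GPE"), ("Monday", "ORG")]
gold = [("Paris", "GPE"), ("Monday", "DATE")]
precision, recall, f1 = entity_prf(predicted, gold)
```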

Let's practice!
