Token classification

Natural Language Processing (NLP) in Python

Fouad Trad

Machine Learning Engineer

Text versus token classification

Text classification

  • Classifies entire sentences or pairs of texts

Image showing the tasks of text classification and QNLI, shown in previous videos.

Token classification

  • Assigns labels to tokens within a sentence

Image showing a sentence divided by words, each having a different color, representing a different class.

  • Named entity recognition (NER)
  • Part of speech (PoS) tagging
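
To make the contrast concrete, here is a minimal sketch of the two output shapes. The labels are hand-assigned for illustration (the topic label "business" in particular is hypothetical), and no model is involved.

```python
# Toy illustration only: labels are hand-assigned, not model predictions.
sentence = "Apple opened a new office in Toronto"

# Text classification: a single label for the whole input.
text_label = "business"  # hypothetical topic label

# Token classification: one label per token.
tokens = sentence.split()
token_labels = ["ORG", "O", "O", "O", "O", "O", "LOC"]  # "O" = not an entity

# The two lists must line up one-to-one.
assert len(token_labels) == len(tokens)

print("text label:", text_label)
for token, label in zip(tokens, token_labels):
    print(f"{token:<8} {label}")
```

The key difference is the shape of the output: one label per input for text classification, one label per token for token classification.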

Named entity recognition (NER)

  • Identifies entities like names, locations, organizations, dates, and more

Image showing NER analysis of the sentence "Apple opened a new office in Toronto in March 2023", with Apple being recognized as organization, Toronto as a location, and March 2023 as a date.

  • Useful in:
    • Information retrieval
    • Question answering
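
As one possible illustration of the information retrieval use case, the sketch below indexes documents by the entities found in them, so a query such as "documents that mention Toronto as a location" becomes a dictionary lookup. The annotations are hand-written in the shape an NER pipeline returns, and `build_entity_index` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Hand-written annotations (illustrative only; not produced by a model here).
docs = {
    "doc1": [{"entity_group": "ORG", "word": "Apple"},
             {"entity_group": "LOC", "word": "Toronto"},
             {"entity_group": "DATE", "word": "March 2023"}],
    "doc2": [{"entity_group": "PER", "word": "Zara Venn"},
             {"entity_group": "ORG", "word": "NovaCore Dynamics"},
             {"entity_group": "LOC", "word": "London"}],
}

def build_entity_index(annotated_docs):
    """Map (entity type, lowercased entity text) -> set of document ids."""
    index = defaultdict(set)
    for doc_id, entities in annotated_docs.items():
        for ent in entities:
            index[(ent["entity_group"], ent["word"].lower())].add(doc_id)
    return index

index = build_entity_index(docs)
# Documents that mention Toronto as a location.
print(index[("LOC", "toronto")])
```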

NER in code

from transformers import pipeline

ner_pipeline = pipeline(task="ner",
                        model="dslim/bert-base-NER",
                        # aggregation_strategy replaces the deprecated grouped_entities=True
                        aggregation_strategy="simple")
ner_results = ner_pipeline("Zara Venn established NovaCore Dynamics in London.")
print(ner_results)
[{'entity_group': 'PER', 'score': np.float32(0.99840075), 'word': 'Zara Venn', 'start': 0, 'end': 9},
 {'entity_group': 'ORG', 'score': np.float32(0.99875560), 'word': 'NovaCore Dynamics', 'start': 22, 'end': 39},
 {'entity_group': 'LOC', 'score': np.float32(0.99960726), 'word': 'London', 'start': 43, 'end': 49}]
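
The grouped output above merges consecutive tokens that belong to the same entity. As a rough sketch of that idea, assume per-token tags in the B-/I-/O scheme, where `B-` begins an entity, `I-` continues it, and `O` marks a token outside any entity. The token and tag lists below are hand-written, and `merge_bio` is a hypothetical helper rather than the library's implementation, which also has to handle subword pieces and scores.

```python
def merge_bio(tokens, tags):
    """Merge B-/I- tagged tokens into (entity_type, text) groups."""
    groups = []
    current_type, current_words = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, ent_type = tag.partition("-")
        if prefix == "I" and ent_type == current_type:
            current_words.append(token)  # continue the open entity
            continue
        if current_type is not None:
            # Close the entity that was open before this token.
            groups.append((current_type, " ".join(current_words)))
        if prefix in ("B", "I"):
            current_type, current_words = ent_type, [token]  # start a new entity
        else:
            current_type, current_words = None, []  # "O": outside any entity
    if current_type is not None:
        groups.append((current_type, " ".join(current_words)))
    return groups

# Hand-written word-level tokens and tags (illustrative, not model output).
tokens = ["Zara", "Venn", "established", "NovaCore", "Dynamics", "in", "London", "."]
tags = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "O", "B-LOC", "O"]

print(merge_bio(tokens, tags))
# [('PER', 'Zara Venn'), ('ORG', 'NovaCore Dynamics'), ('LOC', 'London')]
```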

Part of speech (PoS) tagging

  • Assigns a grammatical category (noun, verb, adjective, and so on) to each word

Image showing the PoS tags for the sentence "The quick fox jumps over the lazy dog", where "the" is a determiner, "quick" and "lazy" are adjectives, "fox" and "dog" are nouns, "jumps" is a verb, and "over" is a preposition.

  • Useful in:
    • Syntactic parsing
    • Grammar correction
    • Text generation

PoS tagging in code

from transformers import pipeline

pos_pipeline = pipeline(task="token-classification",
                        model="vblagoje/bert-english-uncased-finetuned-pos",
                        # aggregation_strategy replaces the deprecated grouped_entities=True
                        aggregation_strategy="simple")
pos_results = pos_pipeline("Zara Venn established NovaCore Dynamics in London.")
print(pos_results)
[{'entity_group': 'PROPN', 'score': np.float32(0.9982983), 'word': 'zara venn', 'start': 0, 'end': 9},  
 {'entity_group': 'VERB', 'score': np.float32(0.99940944), 'word': 'established', 'start': 10, 'end': 21},  
 {'entity_group': 'PROPN', 'score': np.float32(0.99455726), 'word': 'novacore dynamics', 'start': 22, 'end': 39},  
 {'entity_group': 'ADP', 'score': np.float32(0.99935526), 'word': 'in', 'start': 40, 'end': 42},  
 {'entity_group': 'PROPN', 'score': np.float32(0.99847955), 'word': 'london', 'start': 43, 'end': 49}]
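
The tags in this output appear to follow the Universal Dependencies tag set, and the words are lowercased because the model is uncased. As a small post-processing sketch, the snippet below maps the three tags seen above to plain-English names and pulls out the proper nouns. The results are copied from the output above with scores rounded and offsets omitted. `UPOS_NAMES` and `words_with_tag` are hypothetical names introduced for illustration.

```python
# Plain-English names for the tags that appear in the output above.
UPOS_NAMES = {
    "PROPN": "proper noun",
    "VERB": "verb",
    "ADP": "adposition (preposition or postposition)",
}

# Copied from the output above; scores rounded, offsets omitted.
pos_results = [
    {"entity_group": "PROPN", "score": 0.998, "word": "zara venn"},
    {"entity_group": "VERB", "score": 0.999, "word": "established"},
    {"entity_group": "PROPN", "score": 0.995, "word": "novacore dynamics"},
    {"entity_group": "ADP", "score": 0.999, "word": "in"},
    {"entity_group": "PROPN", "score": 0.998, "word": "london"},
]

def words_with_tag(results, tag):
    """Return the words whose predicted tag equals `tag`."""
    return [r["word"] for r in results if r["entity_group"] == tag]

for r in pos_results:
    # Fall back to the raw tag if it is not in the lookup table.
    name = UPOS_NAMES.get(r["entity_group"], r["entity_group"])
    print(f'{r["word"]:<18} {name}')

proper_nouns = words_with_tag(pos_results, "PROPN")
print(proper_nouns)
```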

Let's practice!
