Text classification

Working with Hugging Face

Jacob H. Marquez

Lead Data Engineer

Text classification: Sentiment analysis

  • Labels text based on its emotional tone

  • Applications: Analyzing reviews, tracking social media sentiment

Sentiment analysis: coding example

from transformers import pipeline

my_pipeline = pipeline(
  "text-classification",
  model="distilbert-base-uncased-finetuned-sst-2-english"
)
print(my_pipeline("Wi-Fi is slower than a snail today!"))
[{'label': 'NEGATIVE', 'score': 0.99}]
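
The pipeline returns a list with one dictionary per input, each holding a label and a score. As a minimal sketch of how that output might be consumed, the hypothetical helper below reports the label only when the score clears a threshold. The helper name, the 0.8 cutoff, and the "UNCERTAIN" fallback are illustrative assumptions, not part of the transformers library.

```python
# Hypothetical helper: the name, the 0.8 cutoff, and the "UNCERTAIN"
# fallback are illustrative assumptions, not part of transformers.
def summarize_prediction(predictions, min_score=0.8):
    """Return the first prediction's label, or 'UNCERTAIN' if its score is low."""
    top = predictions[0]
    if top["score"] < min_score:
        return "UNCERTAIN"
    return top["label"]


# Output copied from the sentiment example above
example = [{"label": "NEGATIVE", "score": 0.99}]
print(summarize_prediction(example))
```

A cutoff like this is one simple way to avoid acting on low-confidence predictions; a suitable value depends on the application.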

Text classification: Grammatical correctness

  • Evaluates text grammar for correctness

  • Applications: Grammar checkers, language learning tools

Grammatical correctness: coding example

from transformers import pipeline


# Create a pipeline for grammar checking
grammar_checker = pipeline(
  task="text-classification",
  model="abdulmatinomotoso/English_Grammar_Checker"
)

# Check grammar of the input text
print(grammar_checker("He eat pizza every day."))
[{'label': 'LABEL_0', 'score': 0.99}]
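
Generic labels such as LABEL_0 are easier to read once mapped to names. The sketch below assumes that LABEL_0 means "unacceptable" and LABEL_1 means "acceptable". That assumption is consistent with the ungrammatical sentence above scoring LABEL_0, but the slides do not confirm it, so it should be checked against the model card. The mapping and helper name are hypothetical.

```python
# ASSUMPTION: LABEL_0 = unacceptable grammar, LABEL_1 = acceptable grammar.
# Verify this mapping against the model card before relying on it.
GRAMMAR_LABELS = {"LABEL_0": "unacceptable", "LABEL_1": "acceptable"}


def to_readable(predictions, names=GRAMMAR_LABELS):
    """Replace generic label IDs with readable names; unknown IDs pass through."""
    return [
        {"label": names.get(p["label"], p["label"]), "score": p["score"]}
        for p in predictions
    ]


# Output copied from the grammar example above
print(to_readable([{"label": "LABEL_0", "score": 0.99}]))
```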

Text classification: Question Natural Language Inference (QNLI)

  • Checks if a premise answers a question

  • Applications: Q&A systems, fact-checking

QNLI: coding example

from transformers import pipeline


classifier = pipeline(
  task="text-classification",
  model="cross-encoder/qnli-electra-base"
)
print(classifier("Where is Seattle located?, Seattle is located in Washington state."))
[{'label': 'LABEL_0', 'score': 0.997}]
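
The example above passes the question and the premise as one comma-joined string. The text-classification pipeline also accepts a dictionary with "text" and "text_pair" keys, which keeps the two sentences separate. The sketch below shows that form as an alternative; the helper names are hypothetical, and scores obtained this way may differ from the output shown above.

```python
def build_qnli_input(question, premise):
    """Package a question and a candidate premise as a sentence pair."""
    return {"text": question, "text_pair": premise}


def classify_pair(question, premise, model="cross-encoder/qnli-electra-base"):
    """Run the pair through a text-classification pipeline (downloads the model)."""
    # Imported lazily so build_qnli_input works without transformers installed
    from transformers import pipeline

    classifier = pipeline(task="text-classification", model=model)
    return classifier(build_qnli_input(question, premise))


pair = build_qnli_input(
    "Where is Seattle located?",
    "Seattle is located in Washington state.",
)
print(pair)
```

Calling classify_pair with the same two strings would run the model on the pair.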

Text classification: Dynamic category assignment

  • Dynamically assigns categories based on content

  • Applications: Content moderation, recommendation systems

Dynamic category assignment: coding example

from transformers import pipeline

classifier = pipeline(
  task="zero-shot-classification",
  model="facebook/bart-large-mnli"
)

text = "Hey, DataCamp; we would like to feature your courses in our newsletter!"
categories = ["marketing", "sales", "support"]
output = classifier(text, categories)
print(f"Top Label: {output['labels'][0]} with score: {output['scores'][0]:.4f}")
Top Label: support with score: 0.8183
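
The zero-shot pipeline returns a dictionary whose "labels" and "scores" lists are sorted from most to least likely. As a rough sketch, the hypothetical helper below assigns the top category only when its score is high enough. The helper name, the 0.5 cutoff, the "unassigned" fallback, and every value in the mock output except the 0.8183 top score are illustrative assumptions.

```python
# Hypothetical helper: the cutoff and fallback name are illustrative assumptions.
def pick_category(output, min_score=0.5, fallback="unassigned"):
    """Return the top-ranked label, or the fallback if its score is too low."""
    top_label, top_score = output["labels"][0], output["scores"][0]
    return top_label if top_score >= min_score else fallback


# Mock output in the shape the zero-shot pipeline returns. Only the top
# label and its score come from the example above; the rest is made up.
mock_output = {
    "sequence": "Hey, DataCamp; we would like to feature your courses in our newsletter!",
    "labels": ["support", "marketing", "sales"],
    "scores": [0.8183, 0.1, 0.0817],
}

print(pick_category(mock_output))
print(pick_category(mock_output, min_score=0.9))
```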

Challenges of text classification

  • Ambiguity

  • Sarcasm

  • Multilingual

Let's practice!
