Text classification

Working with Hugging Face

Jacob H. Marquez

Lead Data Engineer

Text classification: Sentiment analysis

Labels text based on its emotional tone

Sentiment analysis

Applications: Analyzing reviews, tracking social media sentiment

Sentiment analysis: coding example

from transformers import pipeline

# Load a sentiment classifier fine-tuned on SST-2
my_pipeline = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english"
)

# Classify a single sentence
print(my_pipeline("Wi-Fi is slower than a snail today!"))

[{'label': 'NEGATIVE', 'score': 0.99}]
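
The pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. A minimal sketch of unpacking that structure, assuming the output shape printed above; the helper name `top_prediction` and the 0.75 threshold are illustrative choices, not part of the `transformers` library:

```python
def top_prediction(results, threshold=0.75):
    """Return (label, score) for the first result.

    If the score falls below `threshold` (an arbitrary, illustrative cutoff),
    the label is replaced with "UNCERTAIN" so the text can be reviewed by hand.
    """
    first = results[0]
    label, score = first["label"], first["score"]
    if score < threshold:
        return "UNCERTAIN", score
    return label, score


# Sample shaped like the pipeline output above
sample = [{"label": "NEGATIVE", "score": 0.99}]
print(top_prediction(sample))
```

In practice, `results` would be the return value of `my_pipeline(...)`; a suitable threshold depends on the model and the task.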

Text classification: Grammatical correctness

Grammar check

Evaluates text grammar for correctness

Example of grammatical correctness

Applications: Grammar checkers, language learning tools

Grammatical correctness: coding example

from transformers import pipeline


# Create a pipeline for grammar checking
grammar_checker = pipeline(
    task="text-classification",
    model="abdulmatinomotoso/English_Grammar_Checker"
)

# Check grammar of the input text
print(grammar_checker("He eat pizza every day."))

[{'label': 'LABEL_0', 'score': 0.99}]
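
Generic labels such as `LABEL_0` are hard to read on their own. One possible way to translate them is a small lookup table, sketched below. The mapping shown is an assumption for illustration only and should be confirmed against the model card before use; the helper name `readable` is also hypothetical:

```python
# ASSUMPTION: this mapping is hypothetical. Check the model card to confirm
# which generic label corresponds to which class before relying on it.
LABEL_NAMES = {"LABEL_0": "unacceptable", "LABEL_1": "acceptable"}


def readable(results, label_names=LABEL_NAMES):
    """Replace generic labels with readable names, leaving unknown labels as-is."""
    return [
        {"label": label_names.get(r["label"], r["label"]), "score": r["score"]}
        for r in results
    ]


# Sample shaped like the pipeline output above
print(readable([{"label": "LABEL_0", "score": 0.99}]))
```

Unknown labels pass through unchanged, so the helper does not silently mislabel output from a model with a different label set.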

Text classification: QNLI

Q&A

Checks if a premise answers a question

Applications: Q&A systems, fact-checking

Example of QNLI

QNLI: coding example

from transformers import pipeline


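# Create a pipeline for QNLI
classifier = pipeline(
    task="text-classification",
    model="cross-encoder/qnli-electra-base"
)

# Pass the question and the candidate answer together as one string
print(classifier("Where is Seattle located?, Seattle is located in Washington state."))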

[{'label': 'LABEL_0', 'score': 0.997}]
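
The example above joins the question and the premise into a single string. The text-classification pipeline also accepts a dictionary with `text` and `text_pair` keys, which keeps the two parts separate. A minimal sketch of that alternative follows; the helper names `make_pair` and `classify_pair` are hypothetical, and the scores produced this way may differ from the output shown above:

```python
def make_pair(question, premise):
    """Package a question and a candidate answer sentence as one pair input."""
    return {"text": question, "text_pair": premise}


def classify_pair(classifier, question, premise):
    """Run any text-classification pipeline on a (question, premise) pair."""
    return classifier(make_pair(question, premise))


# Inspect the input that would be sent to the pipeline
print(make_pair("Where is Seattle located?", "Seattle is located in Washington state."))
```

With the `classifier` pipeline created earlier, the call would be `classify_pair(classifier, "Where is Seattle located?", "Seattle is located in Washington state.")`.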

Text classification: Dynamic category assignment

Dynamically assigns categories based on content

Category assignment example

Applications: Content moderation, recommendation systems

Dynamic category assignment: coding example

from transformers import pipeline

# Create a zero-shot classification pipeline
classifier = pipeline(
    task="zero-shot-classification",
    model="facebook/bart-large-mnli"
)


text = "Hey, DataCamp; we would like to feature your courses in our newsletter!"
categories = ["marketing", "sales", "support"]


output = classifier(text, categories)


print(f"Top Label: {output['labels'][0]} with score: {output['scores'][0]}")

Top Label: support with score: 0.8183
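
The zero-shot output is a dictionary with `sequence`, `labels`, and `scores` keys, where labels and scores are aligned by position. The example above prints only the top label; a minimal sketch for listing every category with its score is shown below. The helper name `rank_labels` is hypothetical, and the mock scores are made up for illustration, not real model output:

```python
def rank_labels(output, min_score=0.0):
    """Pair each candidate label with its score, highest score first."""
    pairs = zip(output["labels"], output["scores"])
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [(label, score) for label, score in ranked if score >= min_score]


# Mock output with made-up scores, shaped like the pipeline's return value
mock = {
    "sequence": "example text",
    "labels": ["support", "marketing", "sales"],
    "scores": [0.6, 0.3, 0.1],
}

for label, score in rank_labels(mock):
    print(f"{label}: {score:.2f}")
```

Passing `min_score` drops weak candidates, which can help when several categories are plausible.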

Challenges of text classification

Ambiguity

Sarcasm

Multilingual

Let's practice!
