Hugging Face pipelines for sentiment analysis

Natural Language Processing (NLP) in Python

Fouad Trad

Machine Learning Engineer

Recap: NLP workflow

The full workflow diagram mentioning that Chapter 1 covered preprocessing and Chapter 2 covered feature extraction.

Hugging Face pipelines

The full workflow diagram mentioning that Chapters 3 and 4 will cover Hugging Face pipelines, which wrap all steps: preprocessing, feature extraction, and modeling.

Ready-made workflow that handles all steps in a single function call
Defining a pipeline requires:
- NLP task
- Model to perform the task

Pipelines for sentiment analysis

Text classification task
Predicts whether a text expresses positive or negative sentiment

Image showing a happy face with a thumbs up for a positive sentiment, and a sad face with a thumbs down for a negative sentiment.

Models for text classification

A gif showing how to scroll through the website and select the right task to get suitable models.

¹ https://huggingface.co/models

Pipelines in code

from transformers import pipeline

classification_pipeline = pipeline(
    task="sentiment-analysis",  # or "text-classification"
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english"
)

result = classification_pipeline("I really liked the movie!!")

print(result)

[{'label': 'POSITIVE', 'score': 0.9998093247413635}]
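
The pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. A minimal sketch of unpacking that structure follows. It reuses the printed output above as a literal so it runs without downloading the model; the names `prediction`, `label`, and `score` are illustrative, not part of the library.

```python
# Output copied from the slide above; in practice it would come from
# classification_pipeline("I really liked the movie!!")
result = [{'label': 'POSITIVE', 'score': 0.9998093247413635}]

# Even a single input string yields a list, so take the first element
prediction = result[0]
label = prediction['label']   # predicted class name
score = prediction['score']   # model confidence for that class, between 0 and 1

print(f"{label} ({score:.2%})")  # prints: POSITIVE (99.98%)
```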

Sentiment analysis on a batch of texts

texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.", 
         "Absolutely fantastic experience!"]

results = classification_pipeline(texts)

print(results)

[{'label': 'POSITIVE', 'score': 0.9998093247413635}, 
 {'label': 'NEGATIVE', 'score': 0.8666700124740601}, 
 {'label': 'POSITIVE', 'score': 0.998874843120575}, 
 {'label': 'POSITIVE', 'score': 0.98626708984375}, 
 {'label': 'POSITIVE', 'score': 0.9998812675476074}]
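
The results come back in the same order as the inputs, so each text can be paired with its prediction. The sketch below assumes the batch output shown above (copied in as a literal, so no model is loaded) and uses illustrative variable names. It prints one line per text and picks out the prediction with the lowest score.

```python
texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.",
         "Absolutely fantastic experience!"]

# Batch output copied from the slide above
results = [{'label': 'POSITIVE', 'score': 0.9998093247413635},
           {'label': 'NEGATIVE', 'score': 0.8666700124740601},
           {'label': 'POSITIVE', 'score': 0.998874843120575},
           {'label': 'POSITIVE', 'score': 0.98626708984375},
           {'label': 'POSITIVE', 'score': 0.9998812675476074}]

# Pair each input with its prediction, relying on matching order
for text, res in zip(texts, results):
    print(f"{res['label']:<8} {res['score']:.3f}  {text}")

# Find the prediction the model was least confident about
least_confident = min(zip(texts, results), key=lambda pair: pair[1]['score'])
print("Least confident:", least_confident[0])
```

A high score is not proof of a correct label: the fourth text reads as a complaint, yet it is labeled POSITIVE with a score above 0.98.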

Assessing sentiment analysis models

texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.", 
         "Absolutely fantastic experience!"]

true_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE", "POSITIVE"]

results = classification_pipeline(texts)

predicted_labels = [result['label'] for result in results]

from sklearn.metrics import accuracy_score
accuracy = accuracy_score(true_labels, predicted_labels)

print(f"Accuracy: {accuracy:.2f}")

Accuracy: 0.80
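
Accuracy is the share of predictions that match the true labels. As a hedged sketch, the same number can be computed by hand and the misclassified texts listed. The predicted labels are copied from the batch output shown earlier rather than produced by a model here, and the `misclassified` list is an illustrative addition.

```python
texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.",
         "Absolutely fantastic experience!"]

true_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE", "POSITIVE"]

# Labels taken from the batch output shown earlier
predicted_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "POSITIVE", "POSITIVE"]

# Accuracy = number of matching labels / total number of labels
correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
accuracy = correct / len(true_labels)
print(f"Accuracy: {accuracy:.2f}")  # prints: Accuracy: 0.80

# Collect the texts where prediction and true label disagree
misclassified = [(text, t, p)
                 for text, t, p in zip(texts, true_labels, predicted_labels)
                 if t != p]
for text, t, p in misclassified:
    print(f"expected {t}, got {p}: {text}")
```

With only five examples, a single error moves accuracy by 20 percentage points, so a larger labeled set gives a more reliable estimate.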

Let's practice!
