Hugging Face pipelines for sentiment analysis

Natural Language Processing (NLP) in Python

Fouad Trad

Machine Learning Engineer

Recap: NLP workflow

The full workflow diagram, indicating that Chapter 1 covered preprocessing and Chapter 2 covered feature extraction.

Hugging Face pipelines

The full workflow diagram, indicating that Chapters 3 and 4 cover Hugging Face pipelines, which span all steps: preprocessing, feature extraction, and modeling.

Ready-made workflow that handles all steps in a single function call
Defining a pipeline requires:
- An NLP task
- A model for that task

Pipelines for sentiment analysis

Text classification task
Predicts whether a text is positive or negative

Image showing a happy face with a thumbs-up for positive sentiment and a sad face with a thumbs-down for negative sentiment.

Models for text classification

A GIF showing how to scroll through the website and select the right task to find suitable models.

¹ https://huggingface.co/models

Pipelines in code

from transformers import pipeline

# Build a pipeline from a task name and a model checkpoint
classification_pipeline = pipeline(
    task="sentiment-analysis",  # or "text-classification"
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english"
)

# Run the pipeline on a single text
result = classification_pipeline("I really liked the movie!!")
print(result)

[{'label': 'POSITIVE', 'score': 0.9998093247413635}]

Sentiment analysis on a batch of texts

texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.", 
         "Absolutely fantastic experience!"]

results = classification_pipeline(texts)

print(results)

[{'label': 'POSITIVE', 'score': 0.9998093247413635}, 
 {'label': 'NEGATIVE', 'score': 0.8666700124740601}, 
 {'label': 'POSITIVE', 'score': 0.998874843120575}, 
 {'label': 'POSITIVE', 'score': 0.98626708984375}, 
 {'label': 'POSITIVE', 'score': 0.9998812675476074}]
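
The batch output is a list with one dictionary per input text, in the same order as the inputs. A minimal sketch of how the two lists could be lined up for inspection is shown here. It does not run the model: the `results` values are copied from the output above, and `CONFIDENCE_THRESHOLD` is a hypothetical cut-off introduced only for illustration.

```python
# Texts and pipeline outputs copied from the slide above (no model is run here).
texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.",
         "Absolutely fantastic experience!"]

results = [{'label': 'POSITIVE', 'score': 0.9998093247413635},
           {'label': 'NEGATIVE', 'score': 0.8666700124740601},
           {'label': 'POSITIVE', 'score': 0.998874843120575},
           {'label': 'POSITIVE', 'score': 0.98626708984375},
           {'label': 'POSITIVE', 'score': 0.9998812675476074}]

# Hypothetical cut-off for a "confident" prediction; not part of the pipeline API.
CONFIDENCE_THRESHOLD = 0.9

# Pair each text with its prediction; results keep the input order.
rows = []
for text, result in zip(texts, results):
    confident = result["score"] >= CONFIDENCE_THRESHOLD
    rows.append((text, result["label"], result["score"], confident))

# Print one line per text and mark predictions below the cut-off.
for text, label, score, confident in rows:
    flag = "" if confident else "  <- low confidence"
    print(f"{label:<8} {score:.4f}  {text}{flag}")
```

With these values, only the second text falls below the assumed cut-off. Note that the sarcastic fourth text is labeled POSITIVE with a high score, so a high score alone does not guarantee a correct label.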

Evaluating sentiment models

texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.", 
         "Absolutely fantastic experience!"]

true_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE", "POSITIVE"]

results = classification_pipeline(texts)

predicted_labels = [result['label'] for result in results]

from sklearn.metrics import accuracy_score
accuracy = accuracy_score(true_labels, predicted_labels)

print(f"Accuracy: {accuracy:.2f}")

Accuracy: 0.80
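
Accuracy is the fraction of predictions that match the true labels. The sketch below recomputes it by hand and lists the texts the model got wrong. It is a plain-Python illustration rather than a replacement for `accuracy_score`, and `predicted_labels` is hard-coded from the batch output shown earlier so that no model needs to be loaded.

```python
texts = ["I really liked the movie!!",
         "Great job ruining my day.",
         "This product exceeded my expectations.",
         "Wow, just what I needed... another problem.",
         "Absolutely fantastic experience!"]

true_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE", "POSITIVE"]

# Labels copied from the pipeline output on the earlier slide.
predicted_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "POSITIVE", "POSITIVE"]

# Accuracy = number of correct predictions / total number of predictions.
correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
accuracy = correct / len(true_labels)
print(f"Accuracy: {accuracy:.2f}")

# Collect the examples where the prediction differs from the true label.
misclassified = [(text, t, p)
                 for text, t, p in zip(texts, true_labels, predicted_labels)
                 if t != p]

for text, t, p in misclassified:
    print(f"true={t} predicted={p}: {text}")
```

Four of the five predictions match, which gives 0.80. The single miss is the sarcastic sentence "Wow, just what I needed... another problem.", which the model labels POSITIVE.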

Let's practice!
