Working with Hugging Face
Jacob H. Marquez
Lead Data Engineer
from transformers import pipeline

my_pipeline = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english"
)
print(my_pipeline("Wi-Fi is slower than a snail today!"))
[{'label': 'NEGATIVE', 'score': 0.99}]
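The pipeline returns a list of dictionaries, one per input, each with a `label` and a `score`. A small sketch of consuming that structure downstream (the helper name and the 0.8 threshold are illustrative choices, not part of the course):

```python
def summarize_prediction(results, threshold=0.8):
    """Turn the pipeline's [{'label', 'score'}] output into a readable verdict."""
    best = max(results, key=lambda r: r["score"])
    verdict = "confident" if best["score"] >= threshold else "uncertain"
    return f"{best['label']} ({verdict}, score={best['score']:.2f})"

print(summarize_prediction([{"label": "NEGATIVE", "score": 0.99}]))
# NEGATIVE (confident, score=0.99)
```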

from transformers import AutoModelForSequenceClassification

# Download a pre-trained text classification model
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)
from transformers import AutoTokenizer

# Retrieve the tokenizer paired with the model
tokenizer = AutoTokenizer.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Tokenize input text
tokens = tokenizer.tokenize("AI: Helping robots think and humans overthink:)")
print(tokens)
['ai', ':', 'helping', 'robots', 'think', 'and',
'humans', 'over', '##thi', '##nk', ':', ')']
Our model (distilbert-base-uncased):
['ai', ':', 'helping', 'robots', 'think', 'and', 'humans', 'over', '##thi',
'##nk', ':', ')']
BERT-Base-Cased tokenizer:
['AI', ':', 'Help', '##ing', 'robots', 'think', 'and', 'humans', 'over',
'##thin', '##k', ':', ')']
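The `##` prefix marks a subword that continues the previous token: "overthink" is not in the vocabulary, so it is split into in-vocabulary pieces. A minimal greedy longest-match-first sketch of this WordPiece-style splitting (the toy vocabulary below is hypothetical, not the real DistilBERT vocab):

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first subword split, as in WordPiece."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces get the ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1  # shrink the candidate until it matches
        if piece is None:
            return ["[UNK]"]  # no piece matches: unknown token
        tokens.append(piece)
        start = end
    return tokens

# Toy vocabulary: "overthink" itself is absent, so it must be split
vocab = {"over", "##thi", "##nk", "think", "robots"}
print(wordpiece_tokenize("overthink", vocab))  # ['over', '##thi', '##nk']
print(wordpiece_tokenize("think", vocab))      # ['think']
```

This also shows why the cased and uncased tokenizers above disagree: each greedy split depends entirely on which pieces exist in that model's vocabulary.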
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Download the model and the tokenizer
my_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)
my_tokenizer = AutoTokenizer.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)

# Create the custom pipeline
my_pipeline = pipeline(
    task="sentiment-analysis",
    model=my_model,
    tokenizer=my_tokenizer
)
🔧 Use this for more control and customization
📝 Text preprocessing: clean and tokenize text for specific use cases
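The preprocessing step can be as simple as stripping noise and normalizing whitespace before text reaches the tokenizer. A minimal sketch (the cleaning rules and function name are illustrative, not from the course):

```python
import re

def clean_text(text):
    """Basic cleanup before tokenization: drop URLs, collapse whitespace."""
    text = re.sub(r"https?://\S+", "", text)  # remove URLs
    text = re.sub(r"\s+", " ", text)          # collapse runs of whitespace
    return text.strip()

print(clean_text("Check   https://example.com  now!"))  # Check now!
```

Cleaned text can then be passed straight to `my_pipeline` as usual.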
