Transformers for text processing

Deep Learning for Text with PyTorch

Shubham Jain

Instructor

Why use transformers for text processing?


  • Speed: processes all tokens in parallel
  • Understands relationships between words, regardless of their distance in the sequence
  • Human-like responses

Components of a transformer

  • Encoder: Processes the input data
  • Decoder: Generates the output
  • Feed-forward Neural Networks: Refine each token's representation
  • Positional Encoding: Ensures word order matters (see the sketch below)
  • Multi-Head Attention: Attends to multiple parts of the input simultaneously, capturing different relationships
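
The slides list Positional Encoding without showing it in code. Below is a minimal sketch of the standard sinusoidal scheme from the original transformer paper; the PositionalEncoding class name and max_len default are illustrative additions, not from the course.

import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Illustrative sketch: adds sinusoidal position information to embeddings."""
    def __init__(self, embed_size, max_len=512):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)        # (max_len, 1)
        div_term = torch.exp(torch.arange(0, embed_size, 2)
                             * (-math.log(10000.0) / embed_size))
        pe = torch.zeros(max_len, embed_size)
        pe[:, 0::2] = torch.sin(position * div_term)         # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)         # odd dimensions
        self.register_buffer("pe", pe)

    def forward(self, x):
        # x: (batch, seq_len, embed_size); add the encoding for each position
        return x + self.pe[:x.size(1)]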

Preparing our data: train-test split

sentences = ["I love this product", "This is terrible", 
             "Could be better", "This is the best"]
labels = [1, 0, 0, 1]

train_sentences = sentences[:3]
train_labels = labels[:3]
test_sentences = sentences[3:]
test_labels = labels[3:]
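
The training loop later looks up vectors in a token_embeddings mapping that the slides never construct. A minimal stand-in consistent with the shapes used there (one (1, 512) tensor per token) could be built like this; the random vectors are a hypothetical placeholder, not the course's embeddings.

import torch

vocab = set(" ".join(sentences).split())
token_embeddings = {token: torch.rand((1, 512)) for token in vocab}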

Building the transformer model

import torch
import torch.nn as nn
import torch.optim as optim

class TransformerEncoder(nn.Module):
    def __init__(self, embed_size, heads, num_layers, dropout):
        super(TransformerEncoder, self).__init__()
        # Stack of self-attention encoder layers
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=embed_size, nhead=heads, dropout=dropout),
            num_layers=num_layers)
        self.fc = nn.Linear(embed_size, 2)  # 2 output classes

    def forward(self, x):
        x = self.encoder(x)
        x = x.mean(dim=1)  # average over the sequence dimension
        return self.fc(x)

model = TransformerEncoder(embed_size=512, heads=8, num_layers=3, dropout=0.5)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
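
As a quick sanity check (an addition, not in the slides), we can pass a dummy batch through the model and confirm it returns one logit per class:

dummy = torch.rand((1, 4, 512))  # one fake 4-token sentence, embed_size 512
print(model(dummy).shape)        # torch.Size([1, 2])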

Training the transformer

for epoch in range(5):
    for sentence, label in zip(train_sentences, train_labels):
        tokens = sentence.split()
        # Stack each token's (1, 512) embedding along the sequence dimension
        data = torch.stack([token_embeddings[token] for token in tokens], dim=1)
        output = model(data)
        loss = criterion(output, torch.tensor([label]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch}, Loss: {loss.item()}")
Epoch 0, Loss: 13.788233757019043
Epoch 1, Loss: 3.9480819702148438
Epoch 2, Loss: 2.4790847301483154
Epoch 3, Loss: 1.3020926713943481
Epoch 4, Loss: 0.4660853147506714

Predicting with the transformer

def predict(sentence):
    model.eval()
    with torch.no_grad():
        tokens = sentence.split()
        # Fall back to a random embedding for tokens unseen during training
        data = torch.stack([token_embeddings.get(token, torch.rand((1, 512)))
                            for token in tokens], dim=1)
        output = model(data)
        predicted = torch.argmax(output, dim=1)
        return "Positive" if predicted.item() == 1 else "Negative"

Predicting on new text

sample_sentence = "This product can be better"
print(f"'{sample_sentence}' is {predict(sample_sentence)}")
'This product can be better' is Negative
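
To close the loop, a simple accuracy check over the held-out test data (an addition, not shown in the course) might look like:

correct = sum(
    predict(sentence) == ("Positive" if label == 1 else "Negative")
    for sentence, label in zip(test_sentences, test_labels)
)
print(f"Test accuracy: {correct / len(test_sentences):.2f}")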

Let's practice!

