Introduction to text generation

Deep Learning for Text with PyTorch

Shubham Jain

Instructor

Text generation and NLP

  • Key applications: chatbots, language translation, technical writing
  • RNN, LSTM, and GRU layers retain information from earlier steps for better sequential data processing (see the sketch after this list)
  • Input: The cat is on the m
  • Output: The cat is on the mat
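A minimal sketch of how the three recurrent layers are created in PyTorch (the input size of 8, hidden size of 16, and random input are purely illustrative); they share the same call signature, so swapping one for another is a one-line change:

import torch
import torch.nn as nn

input_size, hidden_size = 8, 16  # illustrative sizes

rnn = nn.RNN(input_size, hidden_size, batch_first=True)    # plain recurrent layer
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)  # adds gates and a cell state
gru = nn.GRU(input_size, hidden_size, batch_first=True)    # gated, but no separate cell state

x = torch.randn(1, 5, input_size)   # (batch, seq_len, input_size)
out, h = rnn(x)                      # out: (1, 5, 16), h: (1, 1, 16)
out, (h, c) = lstm(x)                # the LSTM also returns a cell state c
out, h = gru(x)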

Chatbot illustration (image by vectorjuice on Freepik)

Building an RNN for text generation

import torch
import torch.nn as nn

data = "Hello how are you?" chars = list(set(data)) char_to_ix = {char: i for i, char in enumerate(chars)} ix_to_char = {i: char for i, char in enumerate(chars)}
class RNNModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNNModel, self).__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

Forward propagation and model creation

    def forward(self, x):
        # Initial hidden state: (num_layers, batch, hidden_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size)
        out, _ = self.rnn(x, h0)
        # Use the output at the last time step to predict the next character
        out = self.fc(out[:, -1, :])
        return out

model = RNNModel(len(chars), 16, len(chars))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
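A tiny illustrative sketch (values made up) of what nn.CrossEntropyLoss expects: raw logits of shape (batch, num_classes) and integer class indices as targets, which is why the targets on the next slide stay as plain indices rather than one-hot vectors:

logits = torch.tensor([[2.0, 0.5, -1.0]])     # scores for 3 classes, batch of 1
target = torch.tensor([0])                     # correct class index
print(nn.CrossEntropyLoss()(logits, target))   # ~0.24, small because class 0 scores highest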

Preparing input and target data

inputs = [char_to_ix[ch] for ch in data[:-1]]
targets = [char_to_ix[ch] for ch in data[1:]]

inputs = torch.tensor(inputs, dtype=torch.long).view(-1, 1)
inputs = nn.functional.one_hot(inputs, num_classes=len(chars)).float()
targets = torch.tensor(targets, dtype=torch.long)
  • Creating indexes
  • Tensor conversion
  • One-hot encoding
  • Target preparation (shapes verified in the sketch after this list)
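A quick sanity check (a sketch reusing the data, inputs, and targets defined above) that the tensor shapes match what the RNN and the loss expect:

print(inputs.shape)                    # (len(data) - 1, 1, len(chars)): (batch, seq_len, input_size)
print(targets.shape)                   # (len(data) - 1,): index of the next character
print(ix_to_char[targets[0].item()])   # the character that follows data[0]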

Training the RNN model

for epoch in range(100):
    model.train()
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 10 == 0:
        print(f'Epoch {epoch+1}/100, Loss: {loss.item()}')

Testing the model

model.eval()

test_input = char_to_ix['h']
test_input = nn.functional.one_hot(
    torch.tensor(test_input).view(-1, 1), num_classes=len(chars)).float()

predicted_output = model(test_input)
predicted_char_ix = torch.argmax(predicted_output, 1).item()
print(f'Test Input: h, Predicted Output: {ix_to_char[predicted_char_ix]}')
Epoch 10/100, Loss: 3090.861572265625
Epoch 20/100, Loss: 2935.4580078125
...
Epoch 100/100, Loss: 1922.44140625
Test Input: h, Predicted Output: e
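The single-step test above extends to actual text generation by feeding each predicted character back in as the next input. A minimal sketch reusing the trained model, char_to_ix, and ix_to_char from above (the seed character and generation length are illustrative, and greedy argmax decoding is just one possible choice):

model.eval()
generated = 'h'                       # illustrative seed character
with torch.no_grad():
    for _ in range(10):               # illustrative number of characters to generate
        ix = char_to_ix[generated[-1]]
        x = nn.functional.one_hot(
            torch.tensor(ix).view(-1, 1), num_classes=len(chars)).float()
        next_ix = torch.argmax(model(x), dim=1).item()
        generated += ix_to_char[next_ix]
print(generated)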

Let's practice!
