Convolutional neural networks for text classification

Deep Learning for Text with PyTorch

Shubham Jain

Instructor

CNNs for text classification

Classifying tweets as
- Positive
- Negative
- Neutral

The convolution operation

Convolution operation
- Sliding a filter (kernel) over the input data
- For each position of the filter, multiply the filter and the covered input element-wise, then sum the results

For text: filters learn local patterns across neighboring words, capturing structure and meaning
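
To make the sliding-filter idea concrete, here is a minimal pure-Python sketch of a one-dimensional convolution (strictly speaking a cross-correlation, which is what deep learning libraries compute). The function name `conv1d` and the example values are illustrative assumptions, not part of the course code.

```python
def conv1d(signal, kernel, stride=1):
    """Slide `kernel` over `signal`; at each position multiply element-wise and sum."""
    k = len(kernel)
    out = []
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        out.append(sum(x * w for x, w in zip(window, kernel)))
    return out

# Illustrative values: this filter responds to how much the input rises across the window.
signal = [1, 2, 4, 7, 11]
kernel = [-1, 0, 1]
result = conv1d(signal, kernel)
print(result)  # [3, 5, 7]
```

Each output value summarizes one window of the input, which is why a filter over word embeddings can pick up short phrases.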

¹ Animation from Vincent Dumoulin, Francesco Visin

Filter and stride in CNNs

Filter:
- Small matrix that we slide over the input

Stride:
- Number of positions the filter moves at each step
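
Filter size, stride, and padding together determine how many positions the filter visits. The sketch below uses the standard output-length formula for a convolution without dilation; the helper name `conv_output_length` and the sequence length of 11 are illustrative assumptions.

```python
def conv_output_length(input_len, kernel_size, stride=1, padding=0):
    """Number of positions a filter visits along one dimension (no dilation)."""
    return (input_len + 2 * padding - kernel_size) // stride + 1

# kernel_size=3, stride=1, padding=1 keeps the sequence length unchanged.
same_len = conv_output_length(11, kernel_size=3, stride=1, padding=1)
# A larger stride skips positions and shortens the output.
strided = conv_output_length(11, kernel_size=3, stride=2, padding=0)
print(same_len, strided)  # 11 5
```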

CNN architecture for text

Convolutional layer: applies filters to input data
Pooling layer: reduces data size while preserving important information
Fully connected layer: makes final predictions based on previous layer output
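
As a rough illustration of what a pooling layer does, the sketch below applies max pooling to a list of feature values in pure Python. The function name `max_pool1d` and the numbers are hypothetical; note that the model in these slides averages over the sequence with `mean(dim=2)` rather than using max pooling.

```python
def max_pool1d(features, window=2):
    """Keep only the largest value in each non-overlapping window."""
    return [
        max(features[i:i + window])
        for i in range(0, len(features) - window + 1, window)
    ]

# Six illustrative feature values are reduced to three.
features = [0.1, 0.9, 0.3, 0.4, 0.8, 0.2]
pooled = max_pool1d(features)
print(pooled)  # [0.9, 0.4, 0.8]
```

The output is half the size of the input, yet the strongest response in each window survives.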

Implementing a text classification model using CNN

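import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class SentimentAnalysisCNN(nn.Module):

    def __init__(self, vocab_size, embed_dim):

        super().__init__()

        # Map each word index to a dense vector of size embed_dim
        self.embedding = nn.Embedding(vocab_size, embed_dim)

        # 1D convolution over the sequence; padding=1 keeps the length unchanged
        self.conv = nn.Conv1d(embed_dim, embed_dim,
                              kernel_size=3, stride=1,
                              padding=1)

        # Two output classes: negative (0) and positive (1)
        self.fc = nn.Linear(embed_dim, 2)
    ...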

__init__ method configures the architecture
super().__init__() initializes the base class nn.Module
nn.Embedding creates dense word vectors
nn.Conv1d applies a convolution over one-dimensional (sequence) data

Implementing a text classification model using CNN

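    ...
    def forward(self, text):
        # (batch, seq_len) -> (batch, seq_len, embed_dim) -> (batch, embed_dim, seq_len)
        embedded = self.embedding(text).permute(0, 2, 1)

        # Convolution followed by ReLU activation
        conved = F.relu(self.conv(embedded))

        # Average over the sequence dimension -> (batch, embed_dim)
        conved = conved.mean(dim=2)

        # Class scores (logits) -> (batch, 2)
        return self.fc(conved)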

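Embedding layer converts word indices to embeddings
permute reorders the tensor to match the convolution layer's expected input
Convolution followed by ReLU extracts important features
mean(dim=2) averages over the sequence dimension, leaving one fixed-size vector per sample for the fully connected layer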
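
The shape changes in forward can be checked step by step. The sketch below rebuilds the same layers outside the class and traces the tensor shapes for one four-word input; the sizes and the input indices are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, embed_dim = 7, 10                 # illustrative sizes

embedding = nn.Embedding(vocab_size, embed_dim)
conv = nn.Conv1d(embed_dim, embed_dim, kernel_size=3, stride=1, padding=1)
fc = nn.Linear(embed_dim, 2)

text = torch.tensor([[0, 1, 2, 3]])           # (batch=1, seq_len=4)
embedded = embedding(text)                    # (1, 4, 10)
channels_first = embedded.permute(0, 2, 1)    # (1, 10, 4): Conv1d wants channels before length
conved = F.relu(conv(channels_first))         # (1, 10, 4): padding=1 preserves the length
pooled = conved.mean(dim=2)                   # (1, 10): one vector per sample
logits = fc(pooled)                           # (1, 2): one score per class

for name, t in [("embedded", embedded), ("channels_first", channels_first),
                ("conved", conved), ("pooled", pooled), ("logits", logits)]:
    print(name, tuple(t.shape))
```

Because the mean is taken over the sequence dimension, sentences of different lengths all produce a vector of size embed_dim.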

Preparing data for the sentiment analysis model

vocab = ["i", "love", "this", "book", "do", "not", "like"]
# Map each word to an integer index
word_to_idx = {word: i for i, word in enumerate(vocab)}

vocab_size = len(word_to_idx)

embed_dim = 10

# (tokens, label) pairs: 1 = positive, 0 = negative
book_samples = [
    ("The story was captivating and kept me hooked until the end.".split(), 1),
    ("I found the characters shallow and the plot predictable.".split(), 0)
]

model = SentimentAnalysisCNN(vocab_size, embed_dim)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)
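
The slides look words up with `word_to_idx.get(w, 0)`, so any out-of-vocabulary word shares index 0 with "i". One possible alternative, sketched below, reserves a dedicated `<unk>` entry and normalizes case and trailing punctuation before the lookup. The `<unk>` token, the `encode` helper, and the example sentence are assumptions for illustration, not part of the course code.

```python
# Hypothetical vocabulary with a reserved unknown-word entry at index 0.
vocab = ["<unk>", "i", "love", "this", "book", "do", "not", "like"]
word_to_idx = {word: i for i, word in enumerate(vocab)}
UNK = word_to_idx["<unk>"]

def encode(tokens):
    """Lowercase each token, strip surrounding punctuation, and map it to an index."""
    return [word_to_idx.get(tok.lower().strip(".,!?"), UNK) for tok in tokens]

encoded = encode("I love this story.".split())
print(encoded)  # [1, 2, 3, 0]
```

With this scheme the model would need `vocab_size = len(word_to_idx)` computed from the extended vocabulary.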

Training the model

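# `data` is not defined on the slides; it is assumed here to be a list of
# (tokens, label) pairs such as book_samples
data = book_samples

for epoch in range(10):
    for sentence, label in data:

        # Reset gradients accumulated in the previous step
        model.zero_grad()

        # Convert tokens to indices (unknown words fall back to index 0) and add a batch dimension
        sentence = torch.LongTensor([word_to_idx.get(w, 0) for w in sentence]).unsqueeze(0)

        outputs = model(sentence)
        label = torch.LongTensor([int(label)])

        loss = criterion(outputs, label)
        loss.backward()

        # Update the weights using the computed gradients
        optimizer.step()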

Running the Sentiment Analysis Model

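for sentence, _ in book_samples:

    # Unknown words fall back to index 0, as in training
    input_tensor = torch.tensor([word_to_idx.get(w, 0) for w in sentence], dtype=torch.long).unsqueeze(0)

    # Inference only, so gradients are not needed
    with torch.no_grad():
        outputs = model(input_tensor)

    # Index of the highest score is the predicted class
    _, predicted_label = torch.max(outputs, 1)

    sentiment = "Positive" if predicted_label.item() == 1 else "Negative"

    print(f"Book Review: {' '.join(sentence)}")
    print(f"Sentiment: {sentiment}\n")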

Book Review: The story was captivating and kept me hooked until the end
Sentiment: Positive
Book Review: I found the characters shallow and the plot predictable
Sentiment: Negative
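
The model outputs raw scores (logits), and the prediction is simply the index of the largest one. To read the scores as probabilities, a softmax can be applied first. The pure-Python sketch below shows both steps; the logit values are made up for illustration and are not actual model output.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.2, 1.5]                            # hypothetical scores for [negative, positive]
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
sentiment = "Positive" if predicted == 1 else "Negative"
print(sentiment)  # Positive
```

Softmax does not change which class scores highest, so taking the maximum of the logits directly gives the same label.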

Let's practice!
