Multi-output models

Intermediate Deep Learning with PyTorch

Michal Oleszak

Machine Learning Engineer

Why multi-output?

Multi-task learning. Model schema: image of a car as input, make and model as two outputs.

Multi-label classification. Model schema: single image as input, multiple predictions as outputs.

Regularization. Model schema: multiple blocks of layers, with an output predicted after each block.

Character and alphabet classification

Model schema: character image is passed to a neural network.

Model schema: two classifiers predict the character and the alphabet from the image embedding.

Two-output Dataset

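from PIL import Image
from torch.utils.data import Dataset


class OmniglotDataset(Dataset):
    def __init__(self, transform, samples):
        self.transform = transform
        # Each sample is a tuple: (image path, alphabet label, character label)
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img_path, alphabet, label = self.samples[idx]
        # Load the image as single-channel grayscale
        img = Image.open(img_path).convert('L')
        img = self.transform(img)
        # Return the image together with both labels
        return img, alphabet, label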

We can use the same Dataset with updated samples, where each sample carries two labels:

  print(samples[0])

  ('omniglot_train/.../0459_14.png', 0, 0)
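To see what a two-label sample looks like once batched, here is a minimal sketch. It assumes a throwaway folder of blank 64x64 images and a hypothetical `to_tensor` helper standing in for whatever transform the course uses; neither comes from the slides.

```python
import os
import tempfile

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset


class OmniglotDataset(Dataset):
    def __init__(self, transform, samples):
        self.transform = transform
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img_path, alphabet, label = self.samples[idx]
        img = Image.open(img_path).convert('L')
        img = self.transform(img)
        return img, alphabet, label


def to_tensor(img):
    # Hypothetical transform (an assumption): grayscale PIL image -> (1, H, W) floats in [0, 1]
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return torch.from_numpy(arr).unsqueeze(0)


# Synthetic stand-in data (an assumption): four blank images with made-up labels
tmp_dir = tempfile.mkdtemp()
samples = []
for i in range(4):
    path = os.path.join(tmp_dir, f"img_{i}.png")
    Image.new("L", (64, 64), color=255).save(path)
    samples.append((path, i % 2, i))  # (image path, alphabet label, character label)

dataset = OmniglotDataset(transform=to_tensor, samples=samples)
loader = DataLoader(dataset, batch_size=2)

# Each batch unpacks into three items: images plus one label tensor per output
images, labels_alpha, labels_char = next(iter(loader))
print(images.shape, labels_alpha, labels_char)
```

The default collate function stacks the integer labels into tensors, so each batch yields one image tensor and two label tensors, matching the three-way unpacking used in the training loop.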

Two-output architecture

import torch.nn as nn


class Net(nn.Module):
    def __init__(self, num_alpha, num_char):
        super().__init__()
        # Shared sub-network that turns the image into a 128-dim embedding
        self.image_layer = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.MaxPool2d(kernel_size=2),
            nn.ELU(),
            nn.Flatten(),
            # 16*32*32 implies 64x64 input images
            nn.Linear(16*32*32, 128)
        )

        # One classifier head per output, sized by the constructor arguments
        # (e.g. num_alpha=30, num_char=964)
        self.classifier_alpha = nn.Linear(128, num_alpha)
        self.classifier_char = nn.Linear(128, num_char)

    def forward(self, x):
        x_image = self.image_layer(x)

        output_alpha = self.classifier_alpha(x_image)
        output_char = self.classifier_char(x_image)

        return output_alpha, output_char

Define image-processing sub-network
Define output-specific classifiers
Pass image through dedicated sub-network
Pass the result through each output layer
Return both outputs
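The steps above can be checked with a quick shape test. This is a minimal sketch, assuming 64x64 grayscale inputs (the size implied by the `16*32*32` flattened dimension) and a random dummy batch; the head sizes 30 and 964 are the ones used in the slide.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self, num_alpha, num_char):
        super().__init__()
        # Shared image-processing sub-network
        self.image_layer = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.MaxPool2d(kernel_size=2),
            nn.ELU(),
            nn.Flatten(),
            nn.Linear(16 * 32 * 32, 128),
        )
        # Output-specific classifiers
        self.classifier_alpha = nn.Linear(128, num_alpha)
        self.classifier_char = nn.Linear(128, num_char)

    def forward(self, x):
        x_image = self.image_layer(x)
        return self.classifier_alpha(x_image), self.classifier_char(x_image)


net = Net(num_alpha=30, num_char=964)

# Dummy batch (an assumption): 4 random grayscale images of size 64x64
dummy = torch.randn(4, 1, 64, 64)
output_alpha, output_char = net(dummy)

print(output_alpha.shape)  # torch.Size([4, 30])
print(output_char.shape)   # torch.Size([4, 964])
```

Both heads read the same 128-dimensional embedding, so the only per-output parameters are the two final linear layers.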

Training loop

# Assumes net, criterion, optimizer, and dataloader_train are already defined
for epoch in range(10):
    for images, labels_alpha, labels_char in dataloader_train:
        optimizer.zero_grad()
        outputs_alpha, outputs_char = net(images)

        loss_alpha = criterion(outputs_alpha, labels_alpha)
        loss_char = criterion(outputs_char, labels_char)

        loss = loss_alpha + loss_char

        loss.backward()
        optimizer.step()

Model produces two outputs
Calculate loss for each output
Combine the losses into one total loss
Backprop and optimize with the total loss
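The four steps above can be run end to end on synthetic data. This is a sketch under stated assumptions: random tensors replace the Omniglot images, and `nn.CrossEntropyLoss` with plain SGD are illustrative choices for `criterion` and `optimizer`, which the slide leaves undefined.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)


class Net(nn.Module):
    def __init__(self, num_alpha, num_char):
        super().__init__()
        self.image_layer = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.MaxPool2d(kernel_size=2),
            nn.ELU(),
            nn.Flatten(),
            nn.Linear(16 * 32 * 32, 128),
        )
        self.classifier_alpha = nn.Linear(128, num_alpha)
        self.classifier_char = nn.Linear(128, num_char)

    def forward(self, x):
        x_image = self.image_layer(x)
        return self.classifier_alpha(x_image), self.classifier_char(x_image)


# Synthetic data (an assumption): 16 random images with random labels
data = TensorDataset(
    torch.randn(16, 1, 64, 64),
    torch.randint(0, 30, (16,)),
    torch.randint(0, 964, (16,)),
)
dataloader_train = DataLoader(data, batch_size=8)

net = Net(num_alpha=30, num_char=964)
criterion = nn.CrossEntropyLoss()  # assumed loss for both classification outputs
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)  # assumed optimizer

# One training step on a single batch
images, labels_alpha, labels_char = next(iter(dataloader_train))
optimizer.zero_grad()
outputs_alpha, outputs_char = net(images)

loss_alpha = criterion(outputs_alpha, labels_alpha)
loss_char = criterion(outputs_char, labels_char)
loss = loss_alpha + loss_char  # single scalar that drives backprop

loss.backward()
optimizer.step()

# The shared layers receive gradient from both losses; each head only from its own
print(net.image_layer[0].weight.grad is not None)
print(net.classifier_alpha.weight.grad is not None)
print(net.classifier_char.weight.grad is not None)
```

Because the two losses are summed before `backward()`, one backward pass updates the shared sub-network with gradient contributions from both tasks.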

Let's practice!
