Convolutional layers for images

Deep Learning for Images with PyTorch

Michal Oleszak

Machine Learning Engineer

Convolutional layers for images

  • Apply convolutional layers to image data
  • Access and add convolutional layers
  • Create convolutional blocks

 

  • Used to adapt models to a specific task

Box around a cat

Conv2d: input channels

RGB channels

  • Grayscale image: in_channels=1
  • RGB image (red, green, blue): in_channels=3
  • Transparency includes alpha channel: in_channels=4
from PIL import Image
from torchvision.transforms import functional

# Open the image and count its channels (3 for an RGB image)
image = Image.open("dog.png")
num_channels = functional.get_image_num_channels(image)
print("Number of channels:", num_channels)
Number of channels: 3
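
The link between image type and in_channels can be sketched with dummy tensors. This is a minimal illustration, not part of the original slides: the batch names, the 32x32 size, and out_channels=8 are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical dummy batches shaped (batch, channels, height, width)
batches = {
    "grayscale": torch.rand(1, 1, 32, 32),
    "rgb": torch.rand(1, 3, 32, 32),
    "rgba": torch.rand(1, 4, 32, 32),
}

output_shapes = {}
for name, batch in batches.items():
    # in_channels must equal the size of the input's channel dimension
    conv = nn.Conv2d(in_channels=batch.shape[1], out_channels=8,
                     kernel_size=3, padding=1)
    output_shapes[name] = tuple(conv(batch).shape)
    print(name, "in_channels =", batch.shape[1], "->", output_shapes[name])
```

Whatever the number of input channels, each layer here produces 8 output channels, because out_channels is set independently of in_channels.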

Conv2d: kernel

Figure (filters): input tensor, kernel, output tensor (feature map)

  • The kernel (colored in green) slides over the image from left to right, top to bottom$^1$

1 Thevenot, Axel. 2020. A visual and mathematical explanation of the 2D convolution layer.
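
The sliding order can be sketched in plain Python. The helper name kernel_positions is hypothetical, and the sketch assumes a square kernel and no padding.

```python
def kernel_positions(image_height, image_width, kernel_size, stride=1):
    """Return the top-left (row, col) of every window the kernel visits."""
    positions = []
    # Outer loop: top to bottom; inner loop: left to right
    for row in range(0, image_height - kernel_size + 1, stride):
        for col in range(0, image_width - kernel_size + 1, stride):
            positions.append((row, col))
    return positions

positions = kernel_positions(image_height=4, image_width=4, kernel_size=3)
print(positions)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

A 3x3 kernel fits a 4x4 image in four positions, so the resulting feature map would be 2x2.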

Kernel sizes

matrix calculation

  • The most common kernel sizes: 3x3 (Conv2d) and 2x2 (MaxPool2d)
  • Convolution takes the dot product of the kernel (green) and the image region under it (pink)
  • Each dot product yields one value of the feature map (blue)
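
The dot-product view can be checked by hand on a small example. The 4x4 image and the cross-shaped kernel below are illustrative assumptions; torch.nn.functional.conv2d serves only as a reference to compare against.

```python
import torch
import torch.nn.functional as F

# Hypothetical 4x4 single-channel image holding the values 0..15
image = torch.arange(16, dtype=torch.float32).reshape(4, 4)
# Hypothetical 3x3 kernel shaped like a cross
kernel = torch.tensor([[0.0, 1.0, 0.0],
                       [1.0, 1.0, 1.0],
                       [0.0, 1.0, 0.0]])

feature_map = torch.zeros(2, 2)
for row in range(2):
    for col in range(2):
        region = image[row:row + 3, col:col + 3]          # image region under the kernel
        feature_map[row, col] = (region * kernel).sum()   # elementwise product, then sum

# Reference result from PyTorch (expects batch and channel dimensions)
reference = F.conv2d(image.reshape(1, 1, 4, 4), kernel.reshape(1, 1, 3, 3)).reshape(2, 2)
print(feature_map)                             # values: [[25, 30], [45, 50]]
print(torch.allclose(feature_map, reference))  # True
```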

Kernel is a filter

  • Capture image patterns

area filter

line filter
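
As a rough illustration of a line filter, the sketch below applies a hand-written vertical-edge kernel to a tiny image. Both the image and the kernel values are assumptions chosen for the example; in a trained network the kernel values are learned rather than set by hand.

```python
import torch
import torch.nn.functional as F

# Hypothetical 5x5 grayscale image: dark on the left (0), bright on the right (1)
image = torch.tensor([[0.0, 0.0, 1.0, 1.0, 1.0]] * 5).reshape(1, 1, 5, 5)

# Hand-written kernel that responds to dark-to-bright vertical edges
vertical_edge = torch.tensor([[-1.0, 0.0, 1.0],
                              [-1.0, 0.0, 1.0],
                              [-1.0, 0.0, 1.0]]).reshape(1, 1, 3, 3)

response = F.conv2d(image, vertical_edge)
print(response.squeeze())  # each row: [3, 3, 0]
```

The response is large where the window straddles the edge and zero where the window covers a uniform area.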

Conv2d: output channels

Figure (output channels): input channel, kernel filters, output channels

  • The number of output channels determines how many filters are applied
  • Each output channel corresponds to a distinct filter
  • A higher number of output channels lets the layer learn a larger, more varied set of features
  • Output channel numbers are commonly chosen as powers of 2 (16, 32, 64, 128)
    • It simplifies the process of combining and dividing channels in subsequent layers
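
The relationship between output channels and filters shows up in the shape of a layer's weights. This is a small sketch; the 28x28 input size is an arbitrary assumption.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

# One filter per output channel; each filter spans all 3 input channels
print(conv.weight.shape)  # torch.Size([16, 3, 3, 3])
print(conv.bias.shape)    # torch.Size([16])

# 16 * 3 * 3 * 3 weights + 16 biases = 448 learnable parameters
num_params = sum(p.numel() for p in conv.parameters())
print(num_params)  # 448

# Hypothetical RGB batch: the output has one feature map per filter
feature_maps = conv(torch.rand(1, 3, 28, 28))
print(feature_maps.shape)  # torch.Size([1, 16, 28, 28])
```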

Adding convolutional layers

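import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 input channels (RGB) -> 16 feature maps
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

# in_channels must match the out_channels of the previous layer
conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
model = Net()

# Register conv2 on the existing model under the name 'conv2'
model.add_module('conv2', conv2)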

Accessing convolutional layers

print(model)
Net(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
model.conv2
Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
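
Beyond printing, layers can be listed by name and their attributes read directly. The sketch below rebuilds the two-layer Net so it runs on its own; the 32x32 input is an assumption, and because this Net defines no forward(), the layers are chained by hand.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)

model = Net()
model.add_module("conv2", nn.Conv2d(16, 32, kernel_size=3, padding=1))

# Names of the registered layers, in registration order
layer_names = [name for name, _ in model.named_children()]
print(layer_names)  # ['conv1', 'conv2']

# Layer hyperparameters are available as attributes
print(model.conv2.in_channels, model.conv2.out_channels)  # 16 32

# Chain the layers manually on a hypothetical RGB batch
x = torch.rand(1, 3, 32, 32)
out = model.conv2(model.conv1(x))
print(out.shape)  # torch.Size([1, 32, 32, 32])
```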

Creating convolutional blocks

  • Stacking convolutional layers in a block with nn.Sequential()
class BinaryImageClassification(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_block = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

    def forward(self, x):
        x = self.conv_block(x)
        return x
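
To see what such a block does to tensor shapes, a dummy batch can be passed through it. This is a sketch under assumptions: the batch of four 64x64 RGB images is made up for illustration.

```python
import torch
import torch.nn as nn

conv_block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1),   # 3 -> 16 channels, size kept
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1),  # 16 -> 32 channels, size kept
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),                  # halves height and width
)

# Hypothetical batch of four 64x64 RGB images
batch = torch.rand(4, 3, 64, 64)
features = conv_block(batch)
print(features.shape)  # torch.Size([4, 32, 32, 32])
```

With padding=1 and a 3x3 kernel the convolutions preserve height and width, so only the pooling layer changes the spatial size.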

Let's practice!
