Binary and multi-class image classification

Deep Learning for Images with PyTorch

Michal Oleszak

Machine Learning Engineer

What will we learn with PyTorch?

tasks

Prerequisites

PyTorch library

torchvision

Image classification

Binary classification

binary

  • Two distinct classes (cats, dogs)
  • Activation function: Sigmoid

Multi-class classification

multi-class

  • Multiple classes (boat, train, car)
  • Activation function: Softmax
  • Highest probability is the prediction
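
The difference between the two activations can be sketched in plain Python. This is a minimal illustration rather than PyTorch's implementation; the logit values and the 0.5 decision threshold are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Map a single logit to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Map a list of logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Binary: one logit, thresholded at 0.5 (assumed threshold)
binary_classes = ["cat", "dog"]
p_dog = sigmoid(1.2)  # hypothetical logit
binary_prediction = binary_classes[int(p_dog >= 0.5)]

# Multi-class: one logit per class, the highest probability wins
multi_classes = ["boat", "train", "car"]
probs = softmax([0.3, 2.1, -0.5])  # hypothetical logits
multi_prediction = multi_classes[probs.index(max(probs))]

print(binary_prediction, round(p_dog, 3))
print(multi_prediction, [round(p, 3) for p in probs])
```

Sigmoid needs only one output because the probability of the other class is its complement; softmax needs one output per class.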

Convolutional Neural Network model

  cnn

Datasets: class labels

image folders

from torchvision.datasets import ImageFolder
import torchvision.transforms as transforms

# Each subfolder of train_dir holds the images of one class
train_dir = '/data/train'
train_dataset = ImageFolder(root=train_dir, transform=transforms.ToTensor())

classes = train_dataset.classes
print(classes)
# Output: ['cat', 'dog']

print(train_dataset.class_to_idx)
# Output: {'cat': 0, 'dog': 1}
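
The mapping printed above follows from the directory layout: class names are the subfolder names in sorted order, and each name gets the next integer index. Below is a minimal standard-library sketch of that rule. The helper name `find_classes_sketch` and the temporary folder layout are hypothetical, and this is not torchvision's own code.

```python
import os
import tempfile

def find_classes_sketch(root):
    """Return (classes, class_to_idx) from the subfolder names of root."""
    classes = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one subfolder per class
    for name in ["dog", "cat"]:
        os.makedirs(os.path.join(root, name))
    classes, class_to_idx = find_classes_sketch(root)

print(classes)
print(class_to_idx)
```

Because the order is alphabetical, renaming or adding a class folder can shift the indices of the other classes.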

Binary image classification: convolutional layer

  • Conv2d():
    • Input: 3 RGB channels (red, green, blue)
    • Output: 16 channels
    • Kernel: 3 x 3 matrix
    • Stride = 1: the kernel moves 1 step
    • Padding = 1: 1 pixel around the border
  • ReLU():
    • A non-linear activation function
  • MaxPool2d():
    • Kernel: 2x2
    • Stride: 2 steps
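import torch.nn as nn

class BinaryCNN(nn.Module):
    def __init__(self):
        super(BinaryCNN, self).__init__()
        # 3 input channels (RGB) -> 16 feature maps; padding=1 keeps height and width
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()
        # 2x2 max pooling with stride 2 halves height and width
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        # Placeholder until the fully connected layer is added
        return x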
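
The effect of these settings on the feature-map size can be checked with the standard output-size formula, floor((n + 2p - k) / s) + 1. The sketch below assumes 224 x 224 input images (the deck does not state the input size; 224 is inferred from the 16 * 112 * 112 used for the linear layer), and the helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1

height = 224  # assumed input height and width

# Conv2d(kernel_size=3, stride=1, padding=1) keeps the spatial size
after_conv = conv_output_size(height, kernel_size=3, stride=1, padding=1)

# MaxPool2d(kernel_size=2, stride=2) halves it
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2)

out_channels = 16
flattened = out_channels * after_pool * after_pool

print(after_conv, after_pool, flattened)
```

The flattened value is the number of input features the first linear layer has to accept.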

Binary image classification: fully connected layer

  • Flatten():
    • Flattens the feature maps into a 1-D vector per image
  • Linear():
    • Input: feature maps x height x width (16 x 112 x 112 here)
    • Output: a single value
  • Sigmoid():
    • Squashes the output to a probability between 0 and 1
class BinaryCNN(nn.Module):
    def __init__(self):
        super(BinaryCNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 16,
                kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2,
                     stride=2)
        self.flatten = nn.Flatten()
        # 16 feature maps of 112 x 112 (implies 224 x 224 input images)
        self.fc1 = nn.Linear(16 * 112 * 112, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # Convolution -> ReLU -> pooling, then flatten and classify
        x = self.pool(self.relu(self.conv1(x)))
        x = self.fc1(self.flatten(x))
        x = self.sigmoid(x)
        return x
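
A quick way to sanity-check the model is to pass a random batch through it and inspect the output shape. The sketch below repeats the class so it runs on its own; the 224 x 224 input size, the batch size of 2, and the 0.5 threshold are assumptions, and the random tensor only stands in for real images.

```python
import torch
import torch.nn as nn

class BinaryCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(16 * 112 * 112, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.fc1(self.flatten(x))
        return self.sigmoid(x)

model = BinaryCNN()
model.eval()

# Random stand-in for a batch of 2 RGB images of 224 x 224 (assumed size)
images = torch.rand(2, 3, 224, 224)
with torch.no_grad():
    probs = model(images)

# Threshold at 0.5 (assumed) to turn probabilities into class indices
predictions = (probs >= 0.5).long().squeeze(1)

print(probs.shape)
print(predictions.shape)
```

With untrained weights the probabilities carry no meaning; the point is only that each image yields one value between 0 and 1.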

Multi-class image classification with CNN

class MultiClassCNN(nn.Module):
    def __init__(self, num_classes):
        super(MultiClassCNN, self).__init__()
        ...
        # One output per class instead of a single value
        self.fc = nn.Linear(16 * 112 * 112, num_classes)
        # dim=1 normalizes across classes for each image in the batch
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        ...
        x = self.softmax(x)
        return x
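
The role of dim=1 is easiest to see on a small batch. In the sketch below the logits are hypothetical values standing in for the output of the final linear layer, with shape (batch, num_classes); the class names come from the earlier boat, train, car example.

```python
import torch
import torch.nn as nn

classes = ["boat", "train", "car"]

# Hypothetical logits for a batch of 2 images, shape (batch, num_classes)
logits = torch.tensor([[0.3, 2.1, -0.5],
                       [1.7, 0.2, 0.9]])

softmax = nn.Softmax(dim=1)  # normalize across classes, not across the batch
probs = softmax(logits)

row_sums = probs.sum(dim=1)          # each row sums to 1
predicted_idx = probs.argmax(dim=1)  # index of the highest probability per image
predicted = [classes[i] for i in predicted_idx.tolist()]

print(row_sums)
print(predicted)
```

Each row of probs is a separate distribution over the classes, so every image in the batch gets its own prediction.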

Let's practice!
