Handling images with PyTorch

Intermediate Deep Learning with PyTorch

Michal Oleszak

Machine Learning Engineer

Clouds dataset

Samples from the clouds dataset: five images showing different types of clouds.

1 https://www.kaggle.com/competitions/cloud-type-classification2/data

What is an image?

A cloud image with a part of it zoomed-in so that pixels are visible.

  • Image consists of pixels ("picture elements")
  • Each pixel contains color information

  • Grayscale images: one integer per pixel, in the range 0-255

    • 30:

Gray colored box

  • Color images: three integers, one for each color channel (Red, Green, Blue)
    • RGB = (52, 171, 235):

Blue colored box
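
To make the pixel representation concrete, the sketch below stores the two example values from this slide in plain Python and rescales them to the 0-1 range. The variable names are illustrative only. Dividing by 255 mirrors the scaling that transforms.ToTensor applies to 8-bit images.

```python
# Example pixel values taken from the slide
gray_pixel = 30                 # grayscale: a single integer (dark gray)
rgb_pixel = (52, 171, 235)      # color: one integer each for red, green, blue

# Every channel value has to fit in the 0-255 range
assert all(0 <= v <= 255 for v in (gray_pixel, *rgb_pixel))

# Rescale to floats between 0 and 1
gray_scaled = gray_pixel / 255
rgb_scaled = tuple(v / 255 for v in rgb_pixel)

# The same RGB color written as a hex code
rgb_hex = "#{:02x}{:02x}{:02x}".format(*rgb_pixel)

print(round(gray_scaled, 3))
print([round(v, 3) for v in rgb_scaled])
print(rgb_hex)  # -> #34abeb
```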


Loading images to PyTorch

Desired directory structure:

clouds_train
  - cumulus
    - 75cbf18.jpg
    - ...
  - cumulonimbus
    - ...
clouds_test
  - cumulus
  - cumulonimbus
  - ...

 

  • Main folders: clouds_train and clouds_test
  • Inside each main folder: one folder per class (category)
  • Inside each class folder: image files
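
The folder layout matters because the class subfolder names become the labels. The sketch below reproduces that mapping with the standard library only: it builds a throwaway copy of the layout in a temporary directory (the file name example.jpg and the two-class subset are assumptions for illustration) and assigns integer labels to the alphabetically sorted folder names, which is how ImageFolder derives its class indices.

```python
import tempfile
from pathlib import Path

# Hypothetical miniature copy of the layout above, with empty placeholder files
root = Path(tempfile.mkdtemp()) / "clouds_train"
for class_name in ["cumulus", "cumulonimbus"]:
    (root / class_name).mkdir(parents=True)
    (root / class_name / "example.jpg").touch()

# Sorted subfolder names form the class list; their positions are the labels
classes = sorted(p.name for p in root.iterdir() if p.is_dir())
class_to_idx = {name: idx for idx, name in enumerate(classes)}

# Pair every file with the label of the folder it sits in
samples = [
    (path.name, class_to_idx[path.parent.name])
    for path in sorted(root.glob("*/*.jpg"))
]

print(class_to_idx)
print(samples)
```

Note that the labels follow alphabetical order, not the order in which the folders were created.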

Loading images to PyTorch

from torchvision.datasets import ImageFolder
from torchvision import transforms

# Convert each image to a tensor, then resize it to 128x128
train_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((128, 128)),
])

# Labels are inferred from the class subfolder names
dataset_train = ImageFolder(
    "data/clouds_train",
    transform=train_transforms,
)

  • Define transformations:

    • Parse to tensor
    • Resize to 128x128
  • Create dataset passing:

    • Path to data
    • Predefined transformations

Displaying images

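from torch.utils.data import DataLoader

dataloader_train = DataLoader(
    dataset_train,
    shuffle=True,
    batch_size=1,
)

# Fetch one batch: shape is (batch, channels, height, width)
image, label = next(iter(dataloader_train))
print(image.shape)
torch.Size([1, 3, 128, 128])

# Drop the batch dimension and move channels last, as matplotlib expects
image = image.squeeze().permute(1, 2, 0)
print(image.shape)
torch.Size([128, 128, 3])

import matplotlib.pyplot as plt
plt.imshow(image)
plt.show()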

Cloud image output of plt.show
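
The reshaping step can be checked without any image files. The sketch below uses a random tensor as a stand-in for a loaded batch (an assumption for illustration) and confirms that permute only reorders the axes without changing pixel values. It uses squeeze(0) rather than squeeze(), which removes only the batch dimension and so would not accidentally drop another axis of size 1.

```python
import torch

# Stand-in for one batch from the DataLoader: (batch, channels, height, width)
batch = torch.rand(1, 3, 128, 128)

# Remove only the batch dimension -> (channels, height, width)
image_chw = batch.squeeze(0)

# Move channels last -> (height, width, channels), the layout plt.imshow expects
image_hwc = image_chw.permute(1, 2, 0)

print(image_chw.shape)
print(image_hwc.shape)

# The red channel is unchanged, only its position in the tensor differs
print(torch.equal(image_hwc[..., 0], image_chw[0]))
```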


Data augmentation

train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(45),
    transforms.ToTensor(),
    transforms.Resize((128, 128)),
])

dataset_train = ImageFolder(
    "data/clouds_train",
    transform=train_transforms,
)

Data augmentation: Generating more data by applying random transformations to original images

  • Increase the size and diversity of the training set
  • Improve model robustness
  • Reduce overfitting

Three images of clouds showing the rotation transformation


Let's practice!
