Layer initialization and transfer learning

Introduction to Deep Learning with PyTorch

Jasmin Ludolf

Senior Data Science Content Developer, DataCamp

Layer initialization

import torch.nn as nn

layer = nn.Linear(64, 128)
print(layer.weight.min(), layer.weight.max())
(tensor(-0.1250, grad_fn=<MinBackward1>), tensor(0.1250, grad_fn=<MaxBackward1>))


  • A layer's weights are initialized to small values
  • Keeping both the input data and the layer weights small ensures stable outputs (see the sketch below)
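A minimal sketch of this effect (the input tensor and batch size are illustrative): inputs drawn from [0, 1) passed through a layer with small default weights produce outputs in a modest range.

import torch
import torch.nn as nn

layer = nn.Linear(64, 128)

# Inputs scaled to [0, 1); with small default weights,
# the outputs stay within a modest range
x = torch.rand(8, 64)
output = layer(x)
print(output.min(), output.max())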

Layer initialization

import torch.nn as nn

layer = nn.Linear(64, 128)
nn.init.uniform_(layer.weight)

print(layer.weight.min(), layer.weight.max())
(tensor(0.0002, grad_fn=<MinBackward1>), tensor(1.0000, grad_fn=<MaxBackward1>))
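The same initializer accepts explicit bounds, so the weights can be kept deliberately small; a minimal sketch with illustrative bounds:

import torch.nn as nn

layer = nn.Linear(64, 128)

# uniform_ takes optional bounds a and b (defaults: 0.0 and 1.0)
nn.init.uniform_(layer.weight, a=-0.1, b=0.1)
print(layer.weight.min(), layer.weight.max())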

Transfer learning

  • Reusing a model trained on a first task for a second, similar task
    • Example: a model trained on US data scientist salaries
    • Reuse its weights to train on European salaries


import torch
import torch.nn as nn

layer = nn.Linear(64, 128)

# Save the entire layer (architecture and weights) to a file
torch.save(layer, 'layer.pth')

# Load it back to reuse the trained weights
new_layer = torch.load('layer.pth')
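A note on practice: saving and loading the state_dict (just the parameter tensors) rather than the whole module is the pattern PyTorch recommends; a minimal sketch, with an illustrative filename:

import torch
import torch.nn as nn

layer = nn.Linear(64, 128)

# Save only the parameters, not the module object
torch.save(layer.state_dict(), 'layer_weights.pth')

# Recreate the layer with the same shape, then load the weights into it
new_layer = nn.Linear(64, 128)
new_layer.load_state_dict(torch.load('layer_weights.pth'))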

Fine-tuning

  • A type of transfer learning
    • Use a smaller learning rate
    • Train only part of the network, keeping some layers frozen
    • Rule of thumb: freeze the early layers and fine-tune the layers closer to the output (see the sketch after the code below)
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 128),
                      nn.Linear(128, 256))

# Freeze the first layer's weights so training does not update them
for name, param in model.named_parameters():
    if name == '0.weight':
        param.requires_grad = False
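Building on this, a hedged sketch that freezes the entire first layer (weight and bias) and fine-tunes the remaining parameters with a smaller learning rate; the optimizer choice and learning rate are illustrative:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 128),
                      nn.Linear(128, 256))

# Freeze every parameter of the first layer (names start with '0.')
for name, param in model.named_parameters():
    if name.startswith('0.'):
        param.requires_grad = False

# Optimize only the parameters that still require gradients,
# using a smaller learning rate for fine-tuning
trainable = (p for p in model.parameters() if p.requires_grad)
optimizer = torch.optim.SGD(trainable, lr=1e-4)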

Let's practice!

