Recurrent Neural Networks

Intermediate Deep Learning with PyTorch

Michal Oleszak

Machine Learning Engineer

Recurrent neuron

  • Feed-forward networks: connections point only from inputs toward outputs
  • RNNs: have connections pointing back, so information persists across time steps
  • Recurrent neuron:
    • Input x
    • Output y
    • Hidden state h
  • In PyTorch: nn.RNN() (see the sketch below)

A schema of a plain RNN neuron: the neuron, which applies weights and an activation, receives input x and produces outputs y and h; the hidden state h is fed back into the neuron.
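A minimal sketch of a single recurrent layer in PyTorch; the batch size, sequence length, and feature count below are hypothetical:

import torch
import torch.nn as nn

# One recurrent layer: 1 input feature per step, 32 hidden units
rnn = nn.RNN(input_size=1, hidden_size=32, batch_first=True)

x = torch.randn(8, 10, 1)   # batch of 8 sequences, 10 time steps, 1 feature
out, h = rnn(x)             # out: output y at every step, h: final hidden state

print(out.shape)            # torch.Size([8, 10, 32])
print(h.shape)              # torch.Size([1, 8, 32])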


Unrolling recurrent neuron through time

Schema of the recurrent neuron. At time step 0, it receives inputs h0 and x0, and produces outputs y0 and h1.


Unrolling recurrent neuron through time

Schema of the recurrent neuron. At time step 1, it receives inputs h1 and x1, and produces outputs y1 and h2.


Unrolling recurrent neuron through time

Schema of the recurrent neuron. At time step 2, it receives inputs h2 and x2, and produces outputs y2 and h3.
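The unrolled computation can be written out by hand. A minimal sketch of the recurrence h(t+1) = tanh(Wx x(t) + Wh h(t) + b), with hypothetical toy sizes (1 input feature, 4 hidden units):

import torch

W_x = torch.randn(4, 1)   # input-to-hidden weights
W_h = torch.randn(4, 4)   # hidden-to-hidden weights
b = torch.zeros(4)

h = torch.zeros(4)        # h0: the initial hidden state
for t in range(3):        # time steps 0, 1, 2
    x = torch.randn(1)                     # input x at step t
    h = torch.tanh(W_x @ x + W_h @ h + b)  # next hidden state
    y = h                                  # plain RNN: output y equals hidden state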


Deep RNNs

Schema of two recurrent neurons stacked on top of each other. At each time step, the outputs y of the first neuron are passed as inputs to the second.
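In nn.RNN, stacking is controlled by the num_layers argument; a minimal sketch with hypothetical sizes:

import torch
import torch.nn as nn

# Two stacked recurrent layers: the first layer's outputs at each
# time step become the second layer's inputs
deep_rnn = nn.RNN(input_size=1, hidden_size=32, num_layers=2, batch_first=True)

x = torch.randn(8, 10, 1)
out, h = deep_rnn(x)
print(out.shape)   # torch.Size([8, 10, 32]): outputs of the top layer only
print(h.shape)     # torch.Size([2, 8, 32]): final hidden state of each layer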


Sequence-to-sequence architecture

  • Pass sequence as input, use the entire output sequence
  • Example: Real-time speech recognition

Architecture schema: at each time step there is a new input, and all outputs y produced at each time step are marked in green as used.
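A minimal sequence-to-sequence sketch: every time step's output is kept and mapped to a prediction (layer sizes are hypothetical):

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=32, batch_first=True)
fc = nn.Linear(32, 1)

x = torch.randn(8, 10, 1)
out, _ = rnn(x)   # out: (batch, seq_len, hidden_size)
y = fc(out)       # one prediction per time step: (8, 10, 1)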


Sequence-to-vector architecture

  • Pass sequence as input, use only the last output
  • Example: Text topic classification

Architecture schema: at each time step there is a new input, while only the output y from the last time step is marked in green as used.
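Sequence-to-vector differs only in which outputs are kept; a minimal sketch with hypothetical sizes:

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=32, batch_first=True)
fc = nn.Linear(32, 4)     # e.g. 4 topic classes

x = torch.randn(8, 10, 1)
out, _ = rnn(x)
y = fc(out[:, -1, :])     # keep only the last time step: (8, 4)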


Vector-to-sequence architecture

  • Pass a single input, use the entire output sequence
  • Example: Text generation

Architecture schema: there is only one input, at the first time step, while all outputs y produced at each time step are marked in green as used.
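A minimal vector-to-sequence sketch; as a hypothetical simplification, input and hidden sizes are equal so each output can be fed straight back in as the next input:

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=8, batch_first=True)

x = torch.randn(1, 1, 8)   # a single input vector at the first step
h = None                   # hidden state defaults to zeros
outputs = []
for _ in range(5):         # generate a 5-step output sequence
    out, h = rnn(x, h)
    outputs.append(out)
    x = out                # the output becomes the next input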


Encoder-decoder architecture

  • Pass the entire input sequence; only then start using the output sequence
  • Example: Machine translation

Architecture schema: in the first part (encoder), inputs are received at each time step but outputs are ignored; in the second part (decoder), no more inputs are received but all outputs from each time step are used.
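A minimal encoder-decoder sketch with two separate RNNs; all sizes and the decoder's shifted-target input are hypothetical:

import torch
import torch.nn as nn

encoder = nn.RNN(input_size=4, hidden_size=16, batch_first=True)
decoder = nn.RNN(input_size=4, hidden_size=16, batch_first=True)

src = torch.randn(2, 7, 4)   # full input sequence; encoder outputs are ignored
_, h = encoder(src)          # keep only the encoder's final hidden state

tgt = torch.randn(2, 5, 4)   # decoder inputs (e.g. a shifted target sequence)
out, _ = decoder(tgt, h)     # every decoder output is used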


RNN in PyTorch

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=1,
            hidden_size=32,
            num_layers=2,
            batch_first=True,
        )
        self.fc = nn.Linear(32, 1)

    def forward(self, x):
        # Initial hidden state: (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(2, x.size(0), 32)
        out, _ = self.rnn(x, h0)
        # Keep only the last time step's output
        out = self.fc(out[:, -1, :])
        return out
  • Define model class with __init__ method
  • Define recurrent layer, self.rnn
  • Define linear layer, self.fc
  • In forward(), initialize first hidden state to zeros
  • Pass input and first hidden state through RNN layer
  • Select the RNN's last output and pass it through the linear layer
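A quick smoke test of the model above, with a hypothetical batch size and sequence length:

net = Net()
x = torch.randn(16, 10, 1)   # 16 sequences, 10 time steps, 1 feature
y = net(x)
print(y.shape)               # torch.Size([16, 1])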

Let's practice!
