GRU and LSTM cells

Recurrent Neural Networks (RNNs) for Language Modeling with Keras

David Cecchini

Data Scientist

Zoom into a SimpleRNN memory cell, showing the memory state from the previous time step and the next word as inputs. The cell outputs the new memory state and the prediction.
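
For reference, the SimpleRNN step pictured above can be written compactly; the weight names below (W_x, W_h, W_y and the biases) are illustrative notation, not Keras variable names:

h_t = \tanh(W_x x_t + W_h h_{t-1} + b_h)
\hat{y}_t = \mathrm{softmax}(W_y h_t + b_y)

Here h_{t-1} is the incoming memory state, x_t is the current word's representation, h_t is the new memory state, and \hat{y}_t is the prediction.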

Zoom into the GRU cell, showing the added structures: the update gate and the candidate for the memory cell.
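
As a hedged sketch, the standard GRU equations are shown below; the slide highlights the update gate z_t and the candidate \tilde{h}_t, and the full cell also includes a reset gate r_t (the notation is illustrative):

z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)                      (update gate)
r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)                      (reset gate)
\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)   (candidate memory)
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t          (new memory state)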

Zoom into the LSTM cell, showing the added forget gate, output gate, update gate, and candidate for the memory cell.
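
Similarly, a sketch of the standard LSTM equations, with forget gate f_t, update (input) gate i_t, output gate o_t, and candidate memory \tilde{c}_t (the notation is illustrative):

f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
h_t = o_t \odot \tanh(c_t)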

No more vanishing gradients

  • The SimpleRNN cell can have vanishing (and exploding) gradient problems.

    • During backpropagation through time, the recurrent weight matrix raised to the power t multiplies the other terms, so gradients can shrink towards zero (see the sketch after this list)
  • GRU and LSTM cells don't have vanishing gradient problems

    • Their gates control how much of the past information flows through the cell
    • The gradient is not dominated by repeated multiplication by the same weight matrix
    • Exploding gradient problems are easier to solve (for example, with gradient clipping)
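
A minimal numerical sketch of the first point; the matrix size, spectral radius, and number of time steps below are made-up values for illustration:

# Illustrative only: the same recurrent weight matrix multiplies the gradient
# once per time step, so its powers drive the gradient towards zero
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 10))
W = 0.9 * W / np.abs(np.linalg.eigvals(W)).max()  # force spectral radius 0.9 < 1

grad = np.eye(10)              # stand-in for an upstream gradient
for _ in range(100):           # backpropagate through 100 time steps
    grad = W.T @ grad          # one factor of the weight matrix per step

print(np.linalg.norm(grad))    # close to zero: the gradient has vanished

With a spectral radius above 1, the same loop would explode instead, which is typically handled with gradient clipping.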

Usage in Keras

# Import the layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, LSTM

# Add the layers to a model
model = Sequential()
model.add(GRU(units=128, return_sequences=True, name='GRU_layer'))
model.add(LSTM(units=64, return_sequences=False, name='LSTM_layer'))
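
For context, a minimal sketch of how these layers could sit in a complete classification model; the vocabulary size, embedding dimension, and number of classes are made-up placeholder values:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Embedding, GRU, LSTM, Dense

vocab_size, num_classes = 10000, 3                  # hypothetical sizes

model = Sequential()
model.add(Input(shape=(None,), dtype='int32'))      # variable-length sequences of word indices
model.add(Embedding(input_dim=vocab_size, output_dim=64))
model.add(GRU(units=128, return_sequences=True))    # pass the full sequence onward
model.add(LSTM(units=64, return_sequences=False))   # keep only the last output
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()

Setting return_sequences=True on the GRU layer is what allows the LSTM layer to be stacked on top of it.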

Let's practice!
