Understanding sequential models

Machine Translation with Keras

Thushan Ganegedara

Data Scientist and Author

Time series inputs and sequential models

  • A sentence is a time series input
    • Current word is affected by previous words
    • E.g. He went to the pool for a ....
  • The encoder and the decoder each use a machine learning model
    • These models can learn from time-series inputs
    • Such models are called sequential models
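
To make the "current word is affected by previous words" idea concrete, here is a minimal sketch that walks through the example sentence (only up to the blank) and lists the context available at each time step. The variable names are illustrative and not part of the course code.

```python
# Words from the slide's example sentence, up to the blank.
words = "he went to the pool for a".split()

# At each time step, the context is every word seen before the current one.
contexts = []
for t, word in enumerate(words, start=1):
    context = words[: t - 1]
    contexts.append((word, context))
    print(f"time step {t}: word={word!r}, previous words={context}")
```

A sequential model is meant to use this growing context when it processes the current word.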

Sequential models

  • Sequential models
    • Move through the input one time step at a time, producing an output at each time step
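
The loop below is a rough sketch of what "moving through the input" could look like. It uses a plain tanh recurrent update rather than a GRU, and the sizes, random weights, and names (`W`, `U`, `state`) are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 time steps, 4 input features, 2 state units.
seq_len, input_dim, state_dim = 3, 4, 2
x = rng.normal(size=(seq_len, input_dim))    # one input sequence
W = rng.normal(size=(input_dim, state_dim))  # input-to-state weights
U = rng.normal(size=(state_dim, state_dim))  # state-to-state weights

state = np.zeros(state_dim)  # initial state
outputs = []
for t in range(seq_len):
    # The new state depends on the current input and the previous state.
    state = np.tanh(x[t] @ W + state @ U)
    outputs.append(state)

outputs = np.stack(outputs)
print(outputs.shape)  # (3, 2): one output per time step
```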

Sequential model architecture

Encoder as a sequential model

  • GRU - Gated Recurrent Unit

Gated recurrent units

Introduction to the GRU layer

At time step 1, the GRU layer,

  • Consumes the input "We"
  • Consumes the initial state (0,0)
  • Outputs the new state (0.8, 0.3)

GRU 1

Introduction to the GRU layer

At time step 2, the GRU layer,

  • Consumes the input "like"
  • Consumes the previous state (0.8, 0.3)
  • Outputs the new state (0.5, 0.9)

The hidden state represents the model's "memory" of what it has seen so far
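
The two time steps above can be sketched with a hand-written GRU step in NumPy. This follows one common formulation of the GRU (update gate, reset gate, candidate state) without biases; the Keras layer differs in implementation details. The vocabulary, weights, and the helper `gru_step` are hypothetical, so the resulting numbers will not match the values on the slides.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h_prev, p):
    """One GRU time step: consume input x and previous state, return new state."""
    z = sigmoid(x @ p["Wz"] + h_prev @ p["Uz"])             # update gate
    r = sigmoid(x @ p["Wr"] + h_prev @ p["Ur"])             # reset gate
    h_cand = np.tanh(x @ p["Wh"] + (r * h_prev) @ p["Uh"])  # candidate state
    return z * h_prev + (1.0 - z) * h_cand                  # blend old and new

rng = np.random.default_rng(0)
input_dim, state_dim = 4, 2  # 2 state units, as in the slide's example
params = {}
for gate in ("z", "r", "h"):
    params["W" + gate] = rng.normal(size=(input_dim, state_dim))
    params["U" + gate] = rng.normal(size=(state_dim, state_dim))

vocab = ["We", "like", "dogs", "cats"]  # hypothetical 4-word vocabulary
onehot = np.eye(len(vocab))

state0 = np.zeros(state_dim)                  # initial state (0, 0)
state1 = gru_step(onehot[0], state0, params)  # time step 1 consumes "We"
state2 = gru_step(onehot[1], state1, params)  # time step 2 consumes "like"
print("state after 'We':  ", state1)
print("state after 'like':", state2)
```

The state returned at step 1 is passed in at step 2, which is how the layer carries memory forward.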

GRU 2

Keras (Functional API) refresher

  • Keras has two important kinds of objects: Layer and Model
  • Input layer
    • inp = keras.layers.Input(shape=(...))
  • Hidden layer
    • layer = keras.layers.GRU(...)
  • Output
    • out = layer(inp)
  • Model
    • model = Model(inputs=inp, outputs=out)

Understanding the shape of the data

  • Sequential data is 3-dimensional
    • Batch dimension (e.g. batch = groups of sentences)
    • Time dimension - sequence length
    • Input dimension (e.g. onehot vector length)
  • GRU model input shape
    • (Batch, Time, Input)
    • (batch size, sequence length, onehot length)
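
As a small illustration of the (batch size, sequence length, onehot length) layout, the sketch below one-hot encodes a batch of word IDs with NumPy. The word IDs and vocabulary size are made up for the example.

```python
import numpy as np

# Hypothetical batch: 2 sentences, each 3 word IDs long, from a 4-word vocabulary.
word_ids = np.array([[0, 1, 2],
                     [3, 1, 0]])
vocab_size = 4

# Indexing the identity matrix with the IDs yields one one-hot row per word.
onehot = np.eye(vocab_size)[word_ids]
print(onehot.shape)  # (2, 3, 4) = (batch, time, input)
```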

Input data

Implementing GRUs with Keras

Defining Keras layers

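from tensorflow import keras  # assumes the Keras bundled with TensorFlow

inp = keras.layers.Input(batch_shape=(2,3,4))  # (batch, time, input)
gru_out = keras.layers.GRU(10)(inp)  # GRU layer with 10 hidden units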

Defining a Keras model

model = keras.models.Model(inputs=inp, outputs=gru_out)

Implementing GRUs with Keras

Predicting with the Keras model

import numpy as np

# Random batch matching the model's input shape: (batch, time, input)
x = np.random.normal(size=(2,3,4))
y = model.predict(x)
print("shape (y) =", y.shape, "\ny = \n", y)
shape (y) = (2, 10) 
y = 
[[ 0.2576233   0.01215531  ... -0.32517594  0.4483121 ],
 [ 0.54189587 -0.63834655  ... -0.4339783   0.4043917 ]]

Implementing GRUs with Keras

A GRU that takes an arbitrary number of samples in a batch

# shape omits the batch dimension, so any batch size is accepted
inp = keras.layers.Input(shape=(3,4))
gru_out = keras.layers.GRU(10)(inp)
model = keras.models.Model(inputs=inp, outputs=gru_out)
x = np.random.normal(size=(5,3,4))
y = model.predict(x)
print("y = \n", y)
y = 
 [[-1.3941444e-02 -3.3123985e-02 ... 6.5081201e-02  1.1245312e-01]
 [ 1.1409521e-03  3.6983326e-01 ... -3.4610277e-01 -3.4792548e-01]
 [ 2.5911796e-01 -3.9517123e-01 ... 5.8505309e-01  3.6908010e-01]
 [-2.8727052e-01 -5.1150680e-02 ... -1.9637148e-01 -1.5587148e-01]
 [ 3.1303680e-01  2.3338445e-01 ... 9.1499090e-04 -2.0590121e-01]]

GRU layer's return_state argument

inp = keras.layers.Input(batch_shape=(2,3,4))
gru_out2, gru_state = keras.layers.GRU(10, return_state=True)(inp)
print("gru_out2.shape = ", gru_out2.shape)
print("gru_state.shape = ", gru_state.shape)
gru_out2.shape =  (2, 10)
gru_state.shape =  (2, 10)
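
The two shapes are identical, and with the default settings used here the two tensors should also hold the same values, because the layer's output is its last hidden state. A minimal check, assuming the Keras bundled with TensorFlow (the import path and variable names are assumptions, not taken from the slides):

```python
import numpy as np
from tensorflow import keras  # assumption: TensorFlow's bundled Keras

inp = keras.layers.Input(shape=(3, 4))
gru_out, gru_state = keras.layers.GRU(10, return_state=True)(inp)
model = keras.models.Model(inputs=inp, outputs=[gru_out, gru_state])

x = np.random.normal(size=(2, 3, 4)).astype("float32")
out, state = model.predict(x, verbose=0)

print(out.shape, state.shape)   # (2, 10) (2, 10)
print(np.allclose(out, state))  # True
```

In an encoder, the returned state is typically what gets handed on to the decoder.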

GRU return_state

GRU layer's return_sequences argument

inp = keras.layers.Input(batch_shape=(2,3,4))
gru_out3 = keras.layers.GRU(10, return_sequences=True)(inp)
print("gru_out3.shape = ", gru_out3.shape)
gru_out3.shape =  (2, 3, 10)
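
The two arguments can be combined. The sketch below, again assuming the Keras bundled with TensorFlow, requests both the per-step outputs and the final state, then checks that the output at the last time step matches the final state. The variable names are illustrative.

```python
import numpy as np
from tensorflow import keras  # assumption: TensorFlow's bundled Keras

inp = keras.layers.Input(shape=(3, 4))
gru_seq, gru_state = keras.layers.GRU(
    10, return_sequences=True, return_state=True
)(inp)
model = keras.models.Model(inputs=inp, outputs=[gru_seq, gru_state])

x = np.random.normal(size=(2, 3, 4)).astype("float32")
seq, state = model.predict(x, verbose=0)

print(seq.shape)    # (2, 3, 10): one output per time step
print(state.shape)  # (2, 10): final state only
print(np.allclose(seq[:, -1, :], state))  # True
```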

GRU return_sequences

Let's practice!
