Machine Translation with Keras
Thushan Ganegedara
Data Scientist and Author


At time step 1, the GRU layer,

At time step 2, the GRU layer,
The hidden state represents the model's "memory" of what it has seen so far, carried from one time step to the next
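To make the idea of a hidden state carried across time steps concrete, here is a minimal NumPy sketch of a textbook GRU update. It is an illustration under stated assumptions, not Keras's exact implementation: the helper names (`sigmoid`, `gru_step`), the parameter dictionary, and the random weights are all hypothetical, and the sizes simply mirror the (2, 3, 4) input and 10 units used in the examples that follow.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU time step: mix the current input with the previous hidden state."""
    z = sigmoid(x_t @ p["Wz"] + h_prev @ p["Uz"] + p["bz"])  # update gate
    r = sigmoid(x_t @ p["Wr"] + h_prev @ p["Ur"] + p["br"])  # reset gate
    h_cand = np.tanh(x_t @ p["Wh"] + (r * h_prev) @ p["Uh"] + p["bh"])  # candidate state
    # Keep part of the old memory (z) and blend in the candidate (1 - z)
    return z * h_prev + (1.0 - z) * h_cand

rng = np.random.default_rng(0)
batch, steps, features, units = 2, 3, 4, 10

# Hypothetical random weights, for illustration only (no training involved)
params = {}
for gate in "zrh":
    params["W" + gate] = rng.normal(scale=0.1, size=(features, units))
    params["U" + gate] = rng.normal(scale=0.1, size=(units, units))
    params["b" + gate] = np.zeros(units)

x = rng.normal(size=(batch, steps, features))
h = np.zeros((batch, units))  # initial hidden state: no memory yet
states = []
for t in range(steps):
    h = gru_step(x[:, t, :], h, params)  # h now summarizes inputs 0..t
    states.append(h)

all_states = np.stack(states, axis=1)
print("all_states.shape =", all_states.shape)  # (2, 3, 10)
print("final h.shape =", h.shape)              # (2, 10)
```

The loop reuses the same weights at every time step; only `h` changes, which is why it can be read as the layer's memory of the sequence so far.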

Layer and Model objects
inp = keras.layers.Input(shape=(...))
layer = keras.layers.GRU(...)
out = layer(inp)
model = Model(inputs=inp, outputs=out)
Defining Keras layers
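# Imports assumed by the examples (here, the Keras bundled with TensorFlow)
import numpy as np
from tensorflow import keras
inp = keras.layers.Input(batch_shape=(2,3,4))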
gru_out = keras.layers.GRU(10)(inp)
Defining a Keras model
model = keras.models.Model(inputs=inp, outputs=gru_out)
Predicting with the Keras model
x = np.random.normal(size=(2,3,4))
y = model.predict(x)
print("shape (y) =", y.shape, "\ny = \n", y)
shape (y) = (2, 10)
y =
[[ 0.2576233 0.01215531 ... -0.32517594 0.4483121 ],
[ 0.54189587 -0.63834655 ... -0.4339783 0.4043917 ]]
A GRU that takes an arbitrary number of samples in a batch
inp = keras.layers.Input(shape=(3,4))
gru_out = keras.layers.GRU(10)(inp)
model = keras.models.Model(inputs=inp, outputs=gru_out)
x = np.random.normal(size=(5,3,4))
y = model.predict(x)
print("y = \n", y)
y =
[[-1.3941444e-02 -3.3123985e-02 ... 6.5081201e-02 1.1245312e-01]
[ 1.1409521e-03 3.6983326e-01 ... -3.4610277e-01 -3.4792548e-01]
[ 2.5911796e-01 -3.9517123e-01 ... 5.8505309e-01 3.6908010e-01]
[-2.8727052e-01 -5.1150680e-02 ... -1.9637148e-01 -1.5587148e-01]
[ 3.1303680e-01 2.3338445e-01 ... 9.1499090e-04 -2.0590121e-01]]
inp = keras.layers.Input(batch_shape=(2,3,4))
gru_out2, gru_state = keras.layers.GRU(10, return_state=True)(inp)
print("gru_out2.shape = ", gru_out2.shape)
print("gru_state.shape = ", gru_state.shape)
gru_out2.shape = (2, 10)
gru_state.shape = (2, 10)

inp = keras.layers.Input(batch_shape=(2,3,4))
gru_out3 = keras.layers.GRU(10, return_sequences=True)(inp)
print("gru_out3.shape = ", gru_out3.shape)
gru_out3.shape = (2, 3, 10)
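The two arguments can also be combined. The following sketch, which assumes the Keras bundled with TensorFlow (`from tensorflow import keras`) and uses illustrative variable names, requests both the per-step outputs and the final state from a single GRU layer, then checks that the last time step of the sequence output matches the returned state.

```python
import numpy as np
from tensorflow import keras

# One GRU layer that returns the output at every time step AND the final state
inp = keras.layers.Input(shape=(3, 4))
gru_seq, gru_last = keras.layers.GRU(
    10, return_sequences=True, return_state=True
)(inp)
model = keras.models.Model(inputs=inp, outputs=[gru_seq, gru_last])

x = np.random.normal(size=(2, 3, 4))
seq, last = model.predict(x, verbose=0)

print("seq.shape =", seq.shape)    # one 10-dim output per time step
print("last.shape =", last.shape)  # final hidden state only
# For a GRU, the output at the last time step is the final state
print("last step equals state:", np.allclose(seq[:, -1, :], last, atol=1e-6))
```

This is also why `gru_out2` and `gru_state` have the same shape in the `return_state=True` example: for a GRU with `return_sequences=False`, the layer output is the final hidden state.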
