Recurrent Neural Networks (RNNs) for Language Modeling with Keras
David Cecchini
Data Scientist
Many available models
softmax function on the output layer of the network
Language models are everywhere in RNNs!
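The softmax function turns the network's raw output scores into a probability distribution over the vocabulary, so the model can be read as assigning a probability to each candidate next word. A minimal NumPy sketch (the logit values are illustrative, not from the slides):

```python
import numpy as np

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating
    exps = np.exp(logits - np.max(logits))
    # Normalize so the outputs sum to 1 (a probability distribution)
    return exps / exps.sum()

# Raw scores for a toy 4-word vocabulary
probs = softmax(np.array([2.0, 1.0, 0.1, -1.0]))
print(probs.sum())  # 1.0 -- valid probability distribution
```

The word with the highest logit keeps the highest probability, which is why sampling or taking the argmax of the softmax output yields the model's predicted next word.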
# Get unique words
unique_words = list(set(text.split(' ')))
# Create dictionary: word is key, index is value
word_to_index = {k:v for (v,k) in enumerate(unique_words)}
# Create dictionary: index is key, word is value
index_to_word = {k:v for (k,v) in enumerate(unique_words)}
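Putting the two dictionary comprehensions together on a small sample text (the sentence below is assumed for illustration), the mappings round-trip cleanly between words and indexes:

```python
# Hypothetical sample text for illustration
text = "i loved this movie and i loved it"

# Get unique words
unique_words = list(set(text.split(' ')))

# word -> index: enumerate yields (index, word), so swap the pair
word_to_index = {k: v for (v, k) in enumerate(unique_words)}
# index -> word: keep the pair as enumerate yields it
index_to_word = {k: v for (k, v) in enumerate(unique_words)}

# Converting a word to its index and back recovers the same word
word = 'movie'
print(index_to_word[word_to_index[word]])  # movie
```

Because `unique_words` comes from a `set`, the index assigned to each word can differ between runs; only the consistency between the two dictionaries matters.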
# Initialize variables X and y
X = []
y = []
# Loop over the text: window of length `sentence_size`, moving `step` positions at a time
for i in range(0, len(text) - sentence_size, step):
    X.append(text[i:i + sentence_size])
    y.append(text[i + sentence_size])
# Example (numbers are numerical indexes of vocabulary):
# Sentence is: "i loved this movie" -> (["i", "loved", "this"], "movie")
# X[0], y[0] == ([10, 444, 11], 17)
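A self-contained sketch of this windowing step, assuming `text` is already a list of word indexes and using `sentence_size = 3` with `step = 1` (values chosen for illustration):

```python
# Hypothetical word indexes standing in for a tokenized text
text = [10, 444, 11, 17, 25, 3]
sentence_size = 3
step = 1

X, y = [], []
# Each window of `sentence_size` indexes becomes an input;
# the index right after the window becomes its target
for i in range(0, len(text) - sentence_size, step):
    X.append(text[i:i + sentence_size])
    y.append(text[i + sentence_size])

print(X[0], y[0])  # [10, 444, 11] 17
```

With `step = 1` the windows overlap heavily, which multiplies the number of training examples extracted from a single text; a larger `step` trades examples for speed.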
# Create list to keep the sentences of indexes
new_text_split = []
# Loop and get the indexes from the dictionary
for sentence in new_text:
    sent_split = []
    for wd in sentence.split(' '):
        ix = word_to_index[wd]
        sent_split.append(ix)
    new_text_split.append(sent_split)
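Running that conversion loop end to end on a toy vocabulary and two short sentences (both assumed for illustration) shows each sentence becoming a list of integer indexes:

```python
# Hypothetical vocabulary mapping for illustration
word_to_index = {'i': 0, 'loved': 1, 'this': 2, 'movie': 3}
new_text = ["i loved this movie", "this movie i loved"]

# Create list to keep the sentences of indexes
new_text_split = []
for sentence in new_text:
    sent_split = []
    for wd in sentence.split(' '):
        # Look up each word's integer index in the vocabulary
        sent_split.append(word_to_index[wd])
    new_text_split.append(sent_split)

print(new_text_split)  # [[0, 1, 2, 3], [2, 3, 0, 1]]
```

Note that any word missing from `word_to_index` raises a `KeyError`; real pipelines typically reserve an out-of-vocabulary index for such words.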