Introduction to the course

Recurrent Neural Networks (RNNs) for Language Modeling with Keras

David Cecchini

Data Scientist

Text data is available online

Displays examples of text data available on the Internet, such as news articles and content published by big Internet companies.

Applications of machine learning to text data

Four applications:

  • Sentiment analysis
  • Multi-class classification
  • Text generation
  • Neural machine translation

Sentiment analysis

Shows two emojis: a happy face and a disapproving face.

Multi-class classification

Shows representations of classes for movie genres, such as science fiction, romance, etc.

Text generation

Example of machine-generated text in a conversation app: when a message arrives from a connection, the app suggests possible replies.

Neural machine translation

Displays an example of translating the Portuguese phrase "Vamos jogar futebol esse domingo?" into the English phrase "Let's play soccer this Sunday?"

Recurrent Neural Networks

Displays the unrolled representation of the RNN model, noting that the same weights are shared across all time steps. In this representation, the words enter the model one at a time, in sequence, and at the last word the model makes its final prediction of the class.
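
As a minimal sketch of this weight-sharing idea (with hypothetical shapes), a Keras SimpleRNN layer holds a single set of weights that is reused at every time step of the unrolled computation:

import numpy as np
from tensorflow.keras.layers import SimpleRNN

# Hypothetical shapes: a batch of 2 sequences, 5 time steps, 8 features.
rnn = SimpleRNN(units=4)
x = np.random.rand(2, 5, 8).astype("float32")
h = rnn(x)  # final hidden state, shape (2, 4)

# One kernel, one recurrent kernel, and one bias are shared by all steps.
kernel, recurrent_kernel, bias = rnn.get_weights()
print(kernel.shape, recurrent_kernel.shape, bias.shape)  # (8, 4) (4, 4) (4,)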

Sequence to sequence models

Many to one: classification

RNN model architecture for classification problems. The model makes a prediction only at the last step, after it has seen the whole sequence.
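
A minimal many-to-one sketch in Keras, assuming a hypothetical 10,000-word vocabulary and a binary sentiment label, could look like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

model = Sequential([
    Embedding(input_dim=10000, output_dim=32),
    SimpleRNN(units=64),             # default return_sequences=False: output only at the last step
    Dense(1, activation="sigmoid"),  # single class score for the whole sequence
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])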

Sequence to sequence models

Many to many: text generation

Text generation model architecture. Here, the model is divided into two parts: the input text and the generated text. In the input part, the model makes a prediction only at the final step; in the generated part, it makes a prediction at every step, feeding each prediction back as the input to the next step.
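
One way to sketch this in Keras, with a hypothetical vocabulary of 1,000 tokens, is to predict a next-token distribution at every step (return_sequences=True) and then loop at generation time, feeding each predicted token back as the next input:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

vocab_size = 1000  # hypothetical vocabulary size

model = Sequential([
    Embedding(input_dim=vocab_size, output_dim=32),
    SimpleRNN(units=64, return_sequences=True),  # an output at every time step
    Dense(vocab_size, activation="softmax"),     # next-token distribution per step
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Generation loop: take the prediction at the last step, append it to the
# sequence, and repeat, so each prediction becomes the input to the next step.
seed = [12, 7, 42]  # hypothetical token ids for the input text
for _ in range(10):
    probs = model.predict(np.array([seed]), verbose=0)[0, -1]
    seed.append(int(np.argmax(probs)))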

Sequence to sequence models

Many to many: neural machine translation

Neural machine translation architecture. This architecture is divided into encoder and decoder parts for the input and output languages, respectively. The encoder learns a language model for the input language, and the decoder learns a language model for the output language. The final state of the encoder is passed to the decoder, which has no other input.
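
A common Keras sketch of this encoder-decoder layout is shown below, with hypothetical vocabulary sizes. Note that during training the decoder is usually also fed the target sentence shifted by one step (teacher forcing), while the encoder's final state initialises the decoder's memory:

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense

src_vocab, tgt_vocab = 8000, 8000  # hypothetical vocabulary sizes

# Encoder: reads the input-language sentence and keeps only its final state.
enc_inputs = Input(shape=(None,))
enc_embed = Embedding(src_vocab, 64)(enc_inputs)
_, state_h, state_c = LSTM(128, return_state=True)(enc_embed)

# Decoder: initialised with the encoder's final state.
dec_inputs = Input(shape=(None,))
dec_embed = Embedding(tgt_vocab, 64)(dec_inputs)
dec_seq = LSTM(128, return_sequences=True)(dec_embed, initial_state=[state_h, state_c])
outputs = Dense(tgt_vocab, activation="softmax")(dec_seq)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")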

Sequence to sequence models

Many to many: language model

Language model architecture. In this case, the model tries to predict the next word at each step. The model therefore makes a prediction at every step, and during training it adjusts its weights to better predict the next word in the sequence.
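
A minimal training sketch, with a hypothetical vocabulary and a toy corpus of token ids, pairs each step's input with the following token as its target:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

vocab_size = 1000  # hypothetical vocabulary size

# Toy corpus of token ids; the target at each step is the following token.
tokens = np.array([[3, 17, 9, 52, 4, 11]])
x, y = tokens[:, :-1], tokens[:, 1:]

model = Sequential([
    Embedding(input_dim=vocab_size, output_dim=32),
    SimpleRNN(units=64, return_sequences=True),  # predict at every step
    Dense(vocab_size, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(x, y, epochs=1, verbose=0)  # weights adjusted to predict the next word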

Let's practice!
