Transfer learning for language models

Recurrent Neural Networks (RNNs) for Language Modeling with Keras

David Cecchini

Data Scientist

The idea behind transfer learning

Transfer learning:

  • Start with better-than-random initial weights
  • Reuse models trained on very large datasets
  • "Open-source" data science models

Available architectures

Base example: "I really loved this movie"

  • Word2Vec
    • Continuous Bag of Words (CBOW): X = [I, really, this, movie], y = loved
    • Skip-gram: X = loved, y = [I, really, this, movie]
  • FastText: X = [I, rea, eal, all, lly, really, ...], y = loved
    • Uses whole words and character n-grams
  • ELMo: X = [I, really, loved, this], y = movie
    • Produces contextual embeddings: the same word gets a different vector in each context
    • Built on deep bidirectional language models (biLM)
  • Word2Vec and FastText are available in the gensim package; ELMo is available on tensorflow_hub (see the sketch below)
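
As a minimal sketch of the tensorflow_hub route (assuming TensorFlow 2.x and the public google/elmo/3 module on tfhub.dev; the signature and output key follow that module's listing):

import tensorflow as tf
import tensorflow_hub as hub

# Load the pre-trained ELMo module from TensorFlow Hub
elmo = hub.load("https://tfhub.dev/google/elmo/3")

# Embed a batch of raw sentences; the "elmo" key holds the
# contextual embeddings, one 1024-dimensional vector per token
outputs = elmo.signatures["default"](tf.constant(["I really loved this movie"]))
embeddings = outputs["elmo"]  # shape: (batch_size, num_tokens, 1024)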

Example using Word2Vec

from gensim.models import word2vec

# Train the model (parameter names follow gensim 3.x;
# in gensim >= 4.0 use vector_size= and epochs= instead of size= and iter=)
w2v_model = word2vec.Word2Vec(tokenized_corpus,
                              size=embedding_dim,
                              window=neighbor_words_num,
                              iter=100)

# Get the top 3 most similar words to "captain"
w2v_model.wv.most_similar(["captain"], topn=3)
[('sweatpants', 0.7249663472175598),
('kirk', 0.7083336114883423),
('larry', 0.6495886445045471)]
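
To actually transfer these weights into a Keras model, a common pattern is to copy the trained vectors into an Embedding layer's weight matrix. A minimal sketch, assuming a word_index mapping built by a tokenizer on the same corpus (the variable names here are illustrative):

import numpy as np
from tensorflow.keras.layers import Embedding

# Build the embedding matrix from the trained Word2Vec vectors
# (word_index is assumed to map each token to an integer id)
vocab_size = len(word_index) + 1
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, idx in word_index.items():
    if word in w2v_model.wv:
        embedding_matrix[idx] = w2v_model.wv[word]

# Initialize a Keras Embedding layer with the pre-trained weights;
# trainable=False freezes them (set trainable=True to fine-tune)
embedding_layer = Embedding(vocab_size, embedding_dim,
                            weights=[embedding_matrix],
                            trainable=False)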

Example using FastText

from gensim.models import fasttext

# Instantiate the model (as above, gensim >= 4.0 renames size= to vector_size=)
ft_model = fasttext.FastText(size=embedding_dim, window=neighbor_words_num)

# Build the vocabulary
ft_model.build_vocab(sentences=tokenized_corpus)

# Train the model
ft_model.train(sentences=tokenized_corpus,
               total_examples=len(tokenized_corpus),
               epochs=100)
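
Because FastText composes word vectors from character n-grams, it can embed words it never saw during training. A quick check (the misspelled token below is just an illustrative out-of-vocabulary example):

# "captian" (misspelled) is out-of-vocabulary, but FastText still
# assembles a vector for it from its character n-grams
oov_vector = ft_model.wv["captian"]
print(oov_vector.shape)  # (embedding_dim,)

# Similarity queries work for OOV tokens as well
ft_model.wv.most_similar(["captian"], topn=3)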

Let's practice!
