Word vectors

Building Chatbots in Python

Alan Nichol

Co-founder and CTO, Rasa

Machine learning

  • Programs that get better at a task by being exposed to more data
  • Example: identifying which intent a user message belongs to

Vector representations

"can you help me please?"

Units       Examples               Vectors
characters  "c", "a", "n", ...     v_c, v_a, v_n, ...
words       "can", "you", ...      v_{can}, v_{you}, ...
sentences   "can you help..."      v_{can you help ...}

Word vectors

Context                                          Candidates
let's meet at the ___ tomorrow                   office, gym, park, beach, party
I love going to the ___ to play with the dogs    beach, park
  • Word vectors try to represent the meaning of words
  • Words that appear in similar contexts have similar vectors
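
The idea that shared contexts lead to similar vectors can be illustrated with a deliberately simplified sketch. It is not GloVe or word2vec: it only counts which words share a sentence with a target word, using a tiny hypothetical corpus built around the two contexts above. The names `corpus`, `context_vector`, and `cosine` are assumptions made for this illustration.

```python
import math
from collections import Counter

# Tiny hypothetical corpus built around the two contexts from the slide.
corpus = [
    "let's meet at the office tomorrow",
    "let's meet at the park tomorrow",
    "let's meet at the beach tomorrow",
    "i love going to the park to play with the dogs",
    "i love going to the beach to play with the dogs",
]


def context_vector(word, sentences):
    """Count the words that share a sentence with `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if word in tokens:
            counts.update(t for t in tokens if t != word)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


park = context_vector("park", corpus)
beach = context_vector("beach", corpus)
office = context_vector("office", corpus)

# "park" and "beach" occur in exactly the same contexts in this toy corpus,
# while "office" shares only one of the two contexts with "park".
print(cosine(park, beach))
print(cosine(park, office))
```

In this toy corpus "park" and "beach" end up with matching count vectors, while "office" is only partly similar to "park". Real word vectors are dense and learned from far more text.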

Training word vectors is computationally intensive

  • Training word vectors requires a lot of data
  • High-quality pretrained word vectors are available for anyone to use
  • GloVe algorithm
    • Cousin of word2vec
  • spaCy
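import spacy

# 'en' is the model shortcut used in the course. Newer spaCy releases need
# the full name of a model that ships with vectors, e.g. 'en_core_web_md'.
nlp = spacy.load('en')

# Dimensionality of the word vectors
nlp.vocab.vectors_length
300

doc = nlp('hello can you help me?')

# Print the first three components of each token's vector
for token in doc:
    print("{} : {}".format(token, token.vector[:3]))
hello : [ 0.25233001  0.10176    -0.67484999]
can : [-0.23857     0.35457    -0.30219001]
you : [-0.11076     0.30785999 -0.51980001]
help : [-0.29370001  0.32253    -0.44779   ]
me : [-0.15396     0.31894001 -0.54887998]
? : [-0.086864    0.19160999  0.10915   ]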
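
One common way to get from token vectors to a vector for a whole sentence is to average them, which is also what spaCy's `Doc.vector` does by default. The sketch below applies that idea to the three components printed above. The real vectors have 300 dimensions, and the helper name `average_vector` is an assumption made for this illustration.

```python
# First three components of each token vector, copied from the output above.
token_vectors = {
    "hello": [0.25233001, 0.10176, -0.67484999],
    "can": [-0.23857, 0.35457, -0.30219001],
    "you": [-0.11076, 0.30785999, -0.51980001],
    "help": [-0.29370001, 0.32253, -0.44779],
    "me": [-0.15396, 0.31894001, -0.54887998],
    "?": [-0.086864, 0.19160999, 0.10915],
}


def average_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


# A crude sentence vector: the mean of the token vectors.
sentence_vector = average_vector(list(token_vectors.values()))
print(sentence_vector)
```

Averaging ignores word order, so it is a rough summary of a sentence rather than a full representation of its meaning.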

Similarity

  • Direction of vectors matters
  • "Distance" between words = angle between the vectors
  • Cosine similarity
    • 1: If vectors point in the same direction
    • 0: If they are perpendicular
    • -1: If they point in opposite directions
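
The three cases above can be checked with a short sketch. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The function name `cosine_similarity` and the small example vectors are assumptions chosen for illustration, and the code uses plain Python lists rather than any particular library.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Same direction (one vector is a positive multiple of the other): close to 1
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))

# Perpendicular: 0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))

# Opposite directions: close to -1
print(cosine_similarity([1.0, 2.0], [-1.0, -2.0]))
```

Because the lengths are divided out, only the direction of the vectors affects the result. Floating-point rounding means the printed values may differ from 1 and -1 in the last digits.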

.similarity()

  • "can" and "cat" are spelled similarly but have low similarity
  • "cat" and "dog" are spelled differently but have high similarity
import spacy
nlp = spacy.load('en')

doc = nlp("cat")

# Similar spelling, low similarity
doc.similarity(nlp("can"))
0.30165292161215396

# Different spelling, high similarity
doc.similarity(nlp("dog"))
0.80168555173294953

Let's practice!
