Natural Language Processing with spaCy
Azadeh Mobasher
Principal Data Scientist
What is the cheapest flight from Boston to Seattle?
Which airline serves Denver, Pittsburgh and Atlanta?
What kinds of planes are used by American Airlines?
spaCy computes similarity scores between Token objects

import spacy
nlp = spacy.load("en_core_web_md")
doc1 = nlp("We eat pizza")
doc2 = nlp("We like to eat pasta")
token1 = doc1[2]  # "pizza"
token2 = doc2[4]  # "pasta"
print(f"Similarity between {token1} and {token2} = ", round(token1.similarity(token2), 3))
>>> Similarity between pizza and pasta = 0.685
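By default, spaCy's similarity score is a cosine similarity between the vectors of the two objects. The sketch below illustrates that measure in plain Python. The function name `cosine_similarity` and the three-dimensional toy vectors are assumptions made up for illustration; they are not spaCy's API and not the real `en_core_web_md` vectors.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, invented for illustration only
toy_pizza = [0.9, 0.1, 0.3]
toy_pasta = [0.8, 0.2, 0.4]
toy_plane = [0.1, 0.9, 0.0]

print("pizza vs pasta:", round(cosine_similarity(toy_pizza, toy_pasta), 3))
print("pizza vs plane:", round(cosine_similarity(toy_pizza, toy_plane), 3))
```

Vectors that point in similar directions score close to 1, and vectors at right angles score 0, which is why related words such as "pizza" and "pasta" are expected to score higher than unrelated ones.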
spaCy computes the semantic similarity of two Span objects

doc1 = nlp("We eat pizza")
doc2 = nlp("We like to eat pasta")
span1 = doc1[1:]  # "eat pizza"
span2 = doc2[1:]  # "like to eat pasta"
print(f"Similarity between \"{span1}\" and \"{span2}\" = ", round(span1.similarity(span2), 3))
>>> Similarity between "eat pizza" and "like to eat pasta" = 0.588
print(f"Similarity between \"{doc1[1:]}\" and \"{doc2[3:]}\" = ",
round(doc1[1:].similarity(doc2[3:]), 3))
>>> Similarity between "eat pizza" and "eat pasta" = 0.936
spaCy computes similarity scores between two documents

nlp = spacy.load("en_core_web_md")
doc1 = nlp("I like to play basketball")
doc2 = nlp("I love to play basketball")
print("Similarity score :", round(doc1.similarity(doc2), 3))
>>> Similarity score : 0.975
Doc vectors default to the average of their word vectors

spaCy finds content relevant to a given keyword

sentences = nlp("What is the cheapest flight from Boston to Seattle? "
                "Which airline serves Denver, Pittsburgh and Atlanta? "
                "What kinds of planes are used by American Airlines?")
keyword = nlp("price")
for i, sentence in enumerate(sentences.sents):
    print(f"Similarity score with sentence {i+1}: ", round(sentence.similarity(keyword), 5))
>>> Similarity score with sentence 1: 0.26136
Similarity score with sentence 2: 0.14021
Similarity score with sentence 3: 0.13885
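Because Doc vectors default to an average of word vectors, the keyword comparison can be pictured as: average the word vectors of each sentence, then rank the sentences by cosine similarity to the keyword's vector. The sketch below mimics that idea in plain Python. The `TOY_VECTORS` table, the helper names, and the two-dimensional values are all hypothetical, chosen only to illustrate the mechanics; spaCy's real vectors and tokenization differ.

```python
import math

# Hypothetical 2-dimensional word vectors, invented for illustration only
TOY_VECTORS = {
    "cheapest": [0.9, 0.1],
    "flight": [0.5, 0.5],
    "airline": [0.3, 0.7],
    "serves": [0.2, 0.8],
    "price": [1.0, 0.0],
}

def mean_vector(words):
    """Average the toy vectors of the given words, component by component."""
    vectors = [TOY_VECTORS[w] for w in words]
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Each "sentence" is reduced to a few words present in the toy table
toy_sentences = {
    "cheapest flight": ["cheapest", "flight"],
    "airline serves": ["airline", "serves"],
}
keyword_vector = mean_vector(["price"])

# Rank the sentences from most to least similar to the keyword
ranked = sorted(
    ((text, cosine_similarity(mean_vector(words), keyword_vector))
     for text, words in toy_sentences.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for text, score in ranked:
    print(f"{text}: {round(score, 3)}")
```

One consequence of averaging is that longer sentences dilute the contribution of any single word, which may be part of why the scores against a one-word keyword above are all fairly low.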