Natural Language Processing with spaCy
Azadeh Mobasher
Principal Data Scientist
What is the cheapest flight from Boston to Seattle?
Which airline serves Denver, Pittsburgh and Atlanta?
What kinds of planes are used by American Airlines?
spaCy computes a similarity score between two Token objects

import spacy

nlp = spacy.load("en_core_web_md")
doc1 = nlp("We eat pizza")
doc2 = nlp("We like to eat pasta")
token1 = doc1[2]  # "pizza"
token2 = doc2[4]  # "pasta"
print(f"Similarity between {token1} and {token2} = ", round(token1.similarity(token2), 3))
>>> Similarity between pizza and pasta = 0.685
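
spaCy's documentation describes the default similarity estimate as cosine similarity between vectors. The sketch below shows that computation in plain Python. The three-dimensional vectors are toy values invented for illustration, not the vectors shipped with en_core_web_md, and cosine_similarity is a hypothetical helper rather than a spaCy function.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


# Toy word vectors, invented for illustration only
toy_vectors = {
    "pizza": [0.9, 0.8, 0.1],
    "pasta": [0.8, 0.9, 0.2],
    "flight": [0.1, 0.0, 0.9],
}

food_score = cosine_similarity(toy_vectors["pizza"], toy_vectors["pasta"])
other_score = cosine_similarity(toy_vectors["pizza"], toy_vectors["flight"])
print("pizza vs pasta :", round(food_score, 3))
print("pizza vs flight:", round(other_score, 3))
```

Vectors that point in similar directions score close to 1, and vectors with little in common score close to 0.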
spaCy computes the semantic similarity of two Span objects

doc1 = nlp("We eat pizza")
doc2 = nlp("We like to eat pasta")
span1 = doc1[1:]  # "eat pizza"
span2 = doc2[1:]  # "like to eat pasta"
print(f"Similarity between \"{span1}\" and \"{span2}\" = ", round(span1.similarity(span2), 3))
>>> Similarity between "eat pizza" and "like to eat pasta" = 0.588
print(f"Similarity between \"{doc1[1:]}\" and \"{doc2[3:]}\" = ",
round(doc1[1:].similarity(doc2[3:]), 3))
>>> Similarity between "eat pizza" and "eat pasta" = 0.936
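
The jump from 0.588 to 0.936 fits how spaCy documents Span vectors: by default, a Span's vector is the average of its token vectors, so extra words such as "like to" pull the average away from the other span. The sketch below reproduces that effect with toy vectors. All numbers and helper names (mean_vector, span_vector, cosine_similarity) are assumptions made for illustration and are not part of spaCy.

```python
import math


def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(components) / count for components in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy word vectors, invented for illustration only
toy_vectors = {
    "eat": [0.7, 0.6, 0.2],
    "pizza": [0.9, 0.8, 0.1],
    "pasta": [0.8, 0.9, 0.2],
    "like": [0.1, 0.3, 0.9],
    "to": [0.0, 0.1, 0.8],
}


def span_vector(words):
    """Average the toy vectors of the given words, mimicking a Span vector."""
    return mean_vector([toy_vectors[word] for word in words])


eat_pizza = span_vector(["eat", "pizza"])
long_score = cosine_similarity(eat_pizza, span_vector(["like", "to", "eat", "pasta"]))
short_score = cosine_similarity(eat_pizza, span_vector(["eat", "pasta"]))
print("with 'like to'   :", round(long_score, 3))
print("without 'like to':", round(short_score, 3))
```

Trimming a span to the words that carry the shared meaning is a simple way to get a more telling score.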
spaCy computes a similarity score between two documents

nlp = spacy.load("en_core_web_md")
doc1 = nlp("I like to play basketball")
doc2 = nlp("I love to play basketball")
print("Similarity score :", round(doc1.similarity(doc2), 3))
>>> Similarity score : 0.975
By default, a Doc vector is the average of its word vectors.

spaCy finds content relevant to a keyword

sentences = nlp("What is the cheapest flight from Boston to Seattle? "
                "Which airline serves Denver, Pittsburgh and Atlanta? "
                "What kinds of planes are used by American Airlines?")
keyword = nlp("price")
for i, sentence in enumerate(sentences.sents):
    print(f"Similarity score with sentence {i+1}: ", round(sentence.similarity(keyword), 5))
>>> Similarity score with sentence 1: 0.26136
Similarity score with sentence 2: 0.14021
Similarity score with sentence 3: 0.13885
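
Sorting the sentences by score puts the most relevant one first. The helper below, rank_by_similarity, is a hypothetical name introduced for this sketch. It relies only on each candidate exposing a similarity() method, which spaCy Doc and Span objects do. The demo function assumes spaCy and the en_core_web_md model are installed.

```python
def rank_by_similarity(candidates, query):
    """Return (score, candidate) pairs sorted from most to least similar.

    Works with any objects that expose a similarity(other) method,
    such as spaCy Doc and Span objects.
    """
    scored = [(candidate.similarity(query), candidate) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def demo():
    """Rank the three example questions against the keyword "price".

    Assumes spaCy and the en_core_web_md model are installed.
    """
    import spacy

    nlp = spacy.load("en_core_web_md")
    sentences = nlp("What is the cheapest flight from Boston to Seattle? "
                    "Which airline serves Denver, Pittsburgh and Atlanta? "
                    "What kinds of planes are used by American Airlines?")
    keyword = nlp("price")
    for score, sentence in rank_by_similarity(sentences.sents, keyword):
        print(round(score, 5), sentence.text)
```

Calling demo() prints the sentences from most to least similar to the keyword. With the scores listed above, the cheapest-flight question would come first.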