Question similarity and grammatical correctness

Natural Language Processing (NLP) in Python

Fouad Trad

Machine Learning Engineer

Question similarity

  • Determines whether two questions are paraphrases of each other
  • Useful for:
    • Removing duplicates
    • Clustering similar questions
    • Improving search accuracy
  • Uses models trained on the Quora Question Pairs (QQP) dataset

Image of three people asking questions.
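
The duplicate-removal use case can be sketched in a few lines. This is a minimal illustration, not part of the original slides: `deduplicate_questions` and `same_words` are hypothetical names, and `same_words` is a toy stand-in for a real QQP model so that the example runs without downloading anything.

```python
from typing import Callable, List


def deduplicate_questions(
    questions: List[str],
    is_duplicate: Callable[[str, str], bool],
) -> List[str]:
    """Keep the first question from every group of paraphrases."""
    kept: List[str] = []
    for question in questions:
        # Compare against every question kept so far (quadratic in the worst case).
        if not any(is_duplicate(question, other) for other in kept):
            kept.append(question)
    return kept


def same_words(a: str, b: str) -> bool:
    """Toy stand-in for a QQP model: true when both use the same lowercase words."""
    return set(a.lower().split()) == set(b.lower().split())


questions = [
    "How can I learn Python?",
    "how can i learn python?",
    "What is the capital of France?",
]
unique_questions = deduplicate_questions(questions, same_words)
print(unique_questions)
```

In practice, `is_duplicate` would wrap a call to a question-pair classifier and return True when the model predicts the duplicate label.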


QQP pipeline

from transformers import pipeline

qqp_pipeline = pipeline(
    task="text-classification",
    model="textattack/bert-base-uncased-QQP"
)
question1 = "How can I learn Python?"
question2 = "What is the best way to study Python?"
result = qqp_pipeline({"text": question1, "text_pair": question2})
print(result)
{'label': 'LABEL_1', 'score': 0.6853412985801697}

QQP pipeline

from transformers import pipeline
qqp_pipeline = pipeline(
    task="text-classification", 
    model="textattack/bert-base-uncased-QQP"
    )
question1 = "How can I learn Python?"
question2 = "What is the capital of France?"
result = qqp_pipeline({"text": question1, "text_pair": question2})
print(result)
{'label': 'LABEL_0', 'score': 0.9999338388442993}
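
The generic label names can be turned into readable ones with a small helper. This is a sketch under an assumption: the mapping LABEL_1 = duplicate and LABEL_0 = not duplicate is inferred from the two examples and from the usual QQP convention, not confirmed by the model card. `QQP_LABELS` and `describe_qqp` are hypothetical names.

```python
# Assumed mapping, inferred from the examples: 1 = duplicate, 0 = not duplicate.
QQP_LABELS = {"LABEL_0": "not duplicate", "LABEL_1": "duplicate"}


def describe_qqp(result: dict) -> str:
    """Format a single pipeline result as '<readable label> (<score as percent>)'."""
    # Fall back to the raw label if the model uses different label names.
    label = QQP_LABELS.get(result["label"], result["label"])
    return f"{label} ({result['score']:.2%})"


# The two results printed in the examples.
paraphrase = describe_qqp({"label": "LABEL_1", "score": 0.6853412985801697})
unrelated = describe_qqp({"label": "LABEL_0", "score": 0.9999338388442993})
print(paraphrase)
print(unrelated)
```

Note that the score is the model's confidence in the predicted label, so a low score on LABEL_1 signals a borderline paraphrase rather than a clear one.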

Assessing grammatical correctness

  • Assesses how grammatically correct a text is
  • Useful for:
    • Educational tools
    • Grammar checkers
    • Writing assistants
  • Uses models trained on the Corpus of Linguistic Acceptability (CoLA) dataset

Image of a person assessing the correctness of a written text.


CoLA pipeline

from transformers import pipeline
cola_classifier = pipeline(
  task="text-classification", 
  model="textattack/distilbert-base-uncased-CoLA"
)

result = cola_classifier("The cat sat on the mat.")
print(result)
[{'label': 'LABEL_1', 'score': 0.9918296933174133}]

CoLA pipeline

from transformers import pipeline
cola_classifier = pipeline(
  task="text-classification", 
  model="textattack/distilbert-base-uncased-CoLA"
)
result = cola_classifier("The cat on sat mat the.")
print(result)
[{'label': 'LABEL_0', 'score': 0.9628171324729919}]
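
A grammar checker needs a yes/no decision rather than a raw label. The sketch below assumes that LABEL_1 means acceptable and LABEL_0 means unacceptable, which matches the two examples and the usual CoLA convention but is not confirmed by the model card. `is_acceptable` and its `threshold` parameter are hypothetical.

```python
from typing import Dict, List


def is_acceptable(result: List[Dict], threshold: float = 0.5) -> bool:
    """Return True when the top prediction is LABEL_1 with enough confidence."""
    # For a single string input the pipeline returns a one-element list.
    top = result[0]
    # Assumed mapping: LABEL_1 = grammatically acceptable.
    return top["label"] == "LABEL_1" and top["score"] >= threshold


# The two results printed in the examples.
good = is_acceptable([{"label": "LABEL_1", "score": 0.9918296933174133}])
bad = is_acceptable([{"label": "LABEL_0", "score": 0.9628171324729919}])
print(good, bad)
```

Raising `threshold` makes the check stricter: sentences the model accepts with low confidence are then treated as not acceptable.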

Let's practice!
