Sequence generation tasks

Natural Language Processing (NLP) in Python

Fouad Trad

Machine Learning Engineer

Sequence generation

  • Generates new text from an input
  • Covers tasks such as:
    • Text summarization
    • Text translation
    • Language modeling

GIF of a pencil writing a sentence on paper.

Text summarization

  • Condenses a long document into a short version that keeps the key points
  • Useful for:
    • Long news articles
    • Research papers
    • Reports
    • Emails

Image of a stack of books and the summary document produced from it.

Text summarization pipeline

from transformers import pipeline

# Build a summarization pipeline from the t5-small-booksum checkpoint
summarizer = pipeline(task="summarization", model="cnicu/t5-small-booksum")
text = """The Amazon rainforest, often referred to as the "lungs of the Earth," is one of the most biologically diverse regions in the world. Spanning over nine countries in South America, the majority of the forest lies in Brazil. It is home to an estimated 390 billion individual trees, divided into 16,000 different species. The rainforest plays a critical role in regulating the global climate by absorbing vast amounts of carbon dioxide and producing oxygen."""
result = summarizer(text)
print(result)
[{'summary_text': 'the Amazon rainforest is one of the most biologically diverse regions in the world. 
The majority of the forest lies in Brazil. The rainforest plays a critical role in regulating the 
global climate by absorbing vast amounts of carbon dioxide and producing oxygen.'}]
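The printed result is a list holding one dictionary with a `summary_text` key. As a minimal sketch of how that structure might be unpacked and compared against the input, the helper below (the name `summary_stats` and the `demo_*` values are illustrative assumptions, not part of the course code) pulls out the summary and counts words on both sides. The demo strings are hand-shortened stand-ins with the same shape as `text` and `result`, so the block runs without downloading a model.

```python
def summary_stats(original, result):
    """Return the summary text and how much shorter it is than the original."""
    # The summarization pipeline returns a list with one dict per input text.
    summary = result[0]["summary_text"]
    original_words = len(original.split())
    summary_words = len(summary.split())
    return {
        "summary": summary,
        "original_words": original_words,
        "summary_words": summary_words,
        # Fraction of the original word count that the summary keeps.
        "compression": summary_words / original_words,
    }


# Hypothetical stand-ins shaped like `text` and `result` (not real model output).
demo_text = (
    "The Amazon rainforest is one of the most biologically diverse regions "
    "in the world. The majority of the forest lies in Brazil."
)
demo_result = [{"summary_text": "The majority of the forest lies in Brazil."}]

stats = summary_stats(demo_text, demo_result)
print(stats["summary"])
print(f"{stats['summary_words']} of {stats['original_words']} words kept")
```

With the real pipeline, the same helper could be called as `summary_stats(text, result)`.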

Text translation

  • Converts text from one language to another
  • Essential for multilingual applications:
    • International websites
    • Customer support tools

Image of text being translated from English (Good morning) to French (Bonjour).

Text translation pipeline

# Reuses `pipeline` imported from transformers; opus-mt-en-fr translates English to French
translator = pipeline(task="translation", model="Helsinki-NLP/opus-mt-en-fr")

sentence = "The rainforest helps regulate the Earth's climate."
result = translator(sentence)
print(result)
[{'translation_text': 'La forêt tropicale aide à réguler le climat de la Terre.'}]
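The output above is again a list with one dictionary, this time keyed by `translation_text`. For a multilingual application that handles several sentences, one possible pattern is a small wrapper that calls the translator once per sentence and keeps only the translated strings. This is a sketch under assumptions: `translate_all` and `fake_translator` are hypothetical names, and the fake translator only mimics the output shape so the block runs without a model.

```python
def translate_all(translator, sentences):
    """Translate each sentence and return only the translated strings."""
    translations = []
    for sentence in sentences:
        # Expected shape per call: [{'translation_text': '...'}]
        result = translator(sentence)
        translations.append(result[0]["translation_text"])
    return translations


def fake_translator(sentence):
    """Hypothetical stand-in that mimics the pipeline's output shape."""
    return [{"translation_text": f"<fr> {sentence}"}]


print(translate_all(fake_translator, ["Good morning", "Thank you"]))
```

Passing the real `translator` object in place of `fake_translator` should work the same way, since only the output shape matters to the wrapper.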

Language modeling

  • Predicts the next word given a prompt
  • Foundation of many applications:
    • Autocomplete
    • Story generation
    • Chatbot responses

Image showing how language modeling works: a machine receives a prompt and generates text.
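To make "predict the next word" concrete, here is a toy bigram counter in plain Python. It is a deliberately simplified illustration, not how neural models such as GPT-2 work: it only remembers which word most often followed the previous one in a tiny made-up corpus. All names (`train_bigrams`, `predict_next`, `complete`) and the corpus are assumptions chosen for this sketch.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for every word, which words follow it in the corpus."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def complete(followers, prompt, max_new_words=5):
    """Greedily extend the prompt one predicted word at a time."""
    words = prompt.lower().split()
    for _ in range(max_new_words):
        nxt = predict_next(followers, words[-1])
        if nxt is None:
            break  # no known continuation for the last word
        words.append(nxt)
    return " ".join(words)


# Tiny made-up corpus, for illustration only.
corpus = [
    "the forest absorbs carbon dioxide",
    "the forest produces oxygen",
    "the forest absorbs carbon quickly",
]
model = train_bigrams(corpus)
print(predict_next(model, "forest"))
print(complete(model, "the", max_new_words=3))
```

Autocomplete, story generation, and chatbot replies all repeat this predict-and-append loop, with a far richer model choosing each next token.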

Language modeling pipeline

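# distilgpt2 is a distilled, smaller version of GPT-2
generator = pipeline(task="text-generation", model="distilgpt2")

prompt = "Once upon a time,"
# Request 3 continuations; max_length caps the total tokens, prompt included
result = generator(prompt, max_length=30, num_return_sequences=3)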
print(result)
[{'generated_text': "Once upon a time, my life wasn't so good, I kept my things tidy. The 
                     more time I spend with my children the more ..."}, 
 {'generated_text': 'Once upon a time, we began a process of finding the right answers to 
                     some big questions," said Jim Pelterer, a lecturer at the 
                     University...'}, 
 {'generated_text': 'Once upon a time, a man came along and took in the city, and found out 
                     that a strange woman had just walked in and was dancing about...'}]
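Because `num_return_sequences=3` yields a list of three dictionaries, it can help to loop over them rather than print the raw list. The sketch below shows one way to do that; `list_generations` is a hypothetical helper, and `demo_result` is a hand-shortened stand-in with the same shape as the output above, so no model is needed to run it.

```python
def list_generations(result):
    """Return one numbered line per generated sequence."""
    lines = []
    for index, item in enumerate(result, start=1):
        # Each item is a dict with a single 'generated_text' key.
        lines.append(f"{index}. {item['generated_text']}")
    return lines


# Hypothetical stand-in shaped like the pipeline output (texts shortened by hand).
demo_result = [
    {"generated_text": "Once upon a time, my life wasn't so good"},
    {"generated_text": "Once upon a time, we began a process"},
    {"generated_text": "Once upon a time, a man came along"},
]

for line in list_generations(demo_result):
    print(line)
```

Since the model samples its continuations, rerunning the real pipeline will generally produce different text each time.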

Let's practice!
