Introduction to RLHF

Reinforcement Learning from Human Feedback (RLHF)

Mina Parham

AI Engineer

Welcome to the course!

  • Instructor: Mina Parham

  • AI Engineer
  • Large Language Models (LLMs)
  • Reinforcement Learning from Human Feedback (RLHF)

  • Topic: Reinforcement Learning from Human Feedback (RLHF)

A diagram representing an AI model with an additional step involving humans, producing better results.

Overview of reinforcement learning

A diagram shows icons of an agent, actions, and a reward policy in a cycle, representing the reinforcement learning process.
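
The agent-action-reward cycle in the diagram can be reduced to a short loop of code. The sketch below is only illustrative: the two-action environment, the exploration rate, and the value-update rule are assumptions made for this example, not part of the course material.

import random

# Hypothetical environment with two actions: action 1 is rewarded, action 0 is not
def environment_step(action):
    return 1.0 if action == 1 else 0.0

# The agent's action-value estimates (the "policy" it gradually improves)
values = {0: 0.0, 1: 0.0}
learning_rate = 0.1
epsilon = 0.2  # how often the agent explores a random action

for step in range(1000):
    # Agent picks an action: explore occasionally, otherwise exploit the best estimate
    if random.random() < epsilon:
        action = random.choice([0, 1])
    else:
        action = max(values, key=values.get)

    # Environment returns a reward for the chosen action
    reward = environment_step(action)

    # Agent updates its estimate from the reward signal
    values[action] += learning_rate * (reward - values[action])

print(values)  # the rewarded action ends up with the higher estimated value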

From RL to RLHF

  • Train a reward model
  • Align with human preferences

A diagram shows icons of an LLM, text output, and a human evaluator, representing part of the reinforcement learning from human feedback cycle.
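
As a rough sketch of what "training a reward model" can look like: the model assigns a score to a human-preferred response and to a less-preferred one, and a pairwise loss pushes the preferred score higher. The random tensors below are stand-ins for real reward-model scores, so this illustrates the loss only, not the exact code used by RLHF libraries.

import torch
import torch.nn.functional as F

# Stand-in scores a reward model might assign to a batch of response pairs
# (in practice these come from a scoring head on top of a language model)
chosen_scores = torch.randn(8, requires_grad=True)    # scores of human-preferred responses
rejected_scores = torch.randn(8, requires_grad=True)  # scores of less-preferred responses

# Pairwise preference loss: rank the chosen response above the rejected one
loss = -F.logsigmoid(chosen_scores - rejected_scores).mean()

loss.backward()  # the gradients would update the reward model's parameters
print(f"preference loss: {loss.item():.4f}")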

Fine-tuning the LLM in RLHF

  • Train an initial LLM

Icon of an LLM being fine-tuned on an input dataset.
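
Training the initial LLM is usually a supervised fine-tuning step on a dataset of example texts. The few lines below run a single forward and backward pass on GPT-2 as a minimal sketch; the toy example sentence, the choice of GPT-2, and the optimizer settings are assumptions for illustration, not the course's actual setup.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a small base model and its tokenizer (GPT-2 used purely as an example)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A toy training example; a real dataset would contain many such sequences
batch = tokenizer("Who wrote Romeo and Juliet? William Shakespeare.", return_tensors="pt")

# One supervised step: the model learns to predict the next token at every position
outputs = model(**batch, labels=batch["input_ids"])
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
outputs.loss.backward()
optimizer.step()
print(f"language modeling loss: {outputs.loss.item():.4f}")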

The complete RLHF process

A prompt asking "Who wrote Romeo and Juliet" goes to the LLM, which answers "a 16th Century author"; an additional model, the policy model, receives the prompt, is trained using the reward model, and gives the answer "William Shakespeare", followed by a comparison between the two outputs.
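
The sketch below ties the pieces together: the policy model generates an answer to the prompt, a reward signal scores it, and the score is used to reinforce or discourage that answer. Production implementations (for example PPO in the trl library) also use a value model and a KL penalty against the original LLM; the simple REINFORCE-style update and the reward_from_human_feedback placeholder below are assumptions made to keep the example short.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
policy_model = AutoModelForCausalLM.from_pretrained("gpt2")  # the model being aligned

def reward_from_human_feedback(text):
    # Hypothetical placeholder: a trained reward model would score the answer here
    return 1.0 if "Shakespeare" in text else -1.0

prompt = "Who wrote Romeo and Juliet?"
inputs = tokenizer(prompt, return_tensors="pt")

# The policy model samples a candidate answer for the prompt
generated = policy_model.generate(
    **inputs, max_new_tokens=10, do_sample=True, pad_token_id=tokenizer.eos_token_id
)
answer = tokenizer.decode(generated[0], skip_special_tokens=True)
reward = reward_from_human_feedback(answer)

# REINFORCE-style update: outputs.loss is the negative log-likelihood of the sampled
# answer, so scaling it by the reward reinforces high-reward answers and suppresses others
outputs = policy_model(generated, labels=generated)
loss = reward * outputs.loss

optimizer = torch.optim.AdamW(policy_model.parameters(), lr=1e-5)
loss.backward()
optimizer.step()
print(answer, "-> reward:", reward)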

Interacting with an RLHF-tuned LLM

  • Pre-trained RLHF model available on Hugging Face 🤗

from transformers import pipeline

text_generator = pipeline('text-generation', model='lvwerra/gpt2-imdb-pos-v2')

# Provide a review prompt
review_prompt = "This is definitely a"

# Generate the continuation
output = text_generator(review_prompt, max_length=50)

# Print the generated text
print(output[0]['generated_text'])

This is definitely a crucial improvement.

Interacting with an RLHF-tuned LLM

from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

# Instantiate the pre-trained model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("lvwerra/distilbert-imdb")
tokenizer = AutoTokenizer.from_pretrained("lvwerra/distilbert-imdb")

# Use pipeline to create the sentiment analyzer
sentiment_analyzer = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

# Pass the text to the sentiment analyzer and print the result
sentiment = sentiment_analyzer("This is definitely a crucial improvement.")
print(f"Sentiment Analysis Result: {sentiment}")

positive

Let's practice!
