Moderation

Developing AI Systems with the OpenAI API

Francesca Donadoni

Curriculum Manager, DataCamp

Understanding moderation in the OpenAI API

  • Moderation: the process of analyzing input to determine whether it contains content that violates specific policies or guidelines

[Diagram: a user message is read by the OpenAI moderation API, which returns the probability that the message belongs to each harmful-content category]

Understanding moderation in the OpenAI API

[Diagram: a user message is read by the OpenAI moderation API, which returns a response listing the harmful-content categories it considers]
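
As a rough illustration of the step in the diagram, the sketch below turns per-category scores into a list of flagged categories. The `flagged_categories` helper, the 0.5 threshold, and the sample scores are made up for illustration and are not part of the API. `fetch_scores` assumes the `openai` Python package (version 1 or later) and an `OPENAI_API_KEY` environment variable.

```python
def flagged_categories(scores, threshold=0.5):
    """Return the category names whose score meets the threshold, highest first."""
    hits = [
        (name, score)
        for name, score in scores.items()
        if score is not None and score >= threshold
    ]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]


def fetch_scores(text):
    """Call the moderation endpoint and return its category scores as a dict."""
    # Imported here so flagged_categories works without the package installed.
    from openai import OpenAI

    client = OpenAI()
    response = client.moderations.create(input=text)
    return response.results[0].category_scores.model_dump()


# Made-up scores for illustration, not real API output.
sample_scores = {"violence": 0.83, "harassment": 0.12, "hate": 0.01}
print(flagged_categories(sample_scores))
```

The threshold is a design choice: a lower value blocks more borderline content, a higher value lets more through. The boolean `categories` field used in the slides reflects the API's own decision and needs no threshold.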


Moderating content

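from openai import OpenAI

# Assumes the OPENAI_API_KEY environment variable is set
client = OpenAI()

moderation_response = client.moderations.create(input="""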
...until someone draws an Exploding Kitten.
When that happens, that person explodes. They are now dead.
This process continues until...
""") 

print(moderation_response.results[0].categories.violence)
True
1 https://ek.explodingkittens.com/how-to-play/exploding-kittens

Moderation in context

moderation_response = client.moderations.create(input="""
In the deck of cards are some Exploding Kittens. You play the game by putting the deck face down and taking turns drawing cards until someone draws an Exploding Kitten.
When that happens, that person explodes. They are now dead.
This process continues until there’s only 1 player left, who wins the game.
The more cards you draw, the greater your chances of drawing an Exploding Kitten.
""") 

print(moderation_response.results[0].categories.violence)
False
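
To compare an excerpt with its full context side by side, both texts can be sent in a single request, since the moderation endpoint accepts a list of strings as `input`. The sketch below is one way to do that; `violence_flags` is a hypothetical helper name, and it assumes the results come back in the same order as the inputs.

```python
def violence_flags(client, texts):
    """Moderate several texts in one request and return each one's violence flag.

    `client` is expected to be an openai.OpenAI instance (or any object
    exposing the same moderations.create method).
    """
    response = client.moderations.create(input=list(texts))
    return [result.categories.violence for result in response.results]
```

With a real client, `violence_flags(OpenAI(), [excerpt, full_rules])` would return one boolean per text, which makes it easy to see whether the added context changes the classification.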

Prompt injection

[Image: a woman using a chatbot into which a malicious prompt has been injected]


Prompt injection

 

  • Limit the amount of text in the prompt
  • Limit the number of output tokens generated
  • Use pre-selected content as validated input and output
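
The three mitigations above can be sketched as follows. This is a minimal illustration, not a complete defense: the limits, the `ALLOWED_REPLIES` set, and the helper names are all assumptions chosen for the example, and the character cap is only a rough stand-in for a token limit. The request uses the `max_completion_tokens` parameter of the chat completions endpoint (older versions of the library use `max_tokens`).

```python
MAX_PROMPT_CHARS = 500    # assumed limit, chosen for illustration
MAX_OUTPUT_TOKENS = 100   # assumed limit, chosen for illustration
ALLOWED_REPLIES = {"allowed", "not allowed"}  # hypothetical set of valid outputs


def limit_prompt(text, max_chars=MAX_PROMPT_CHARS):
    """Trim whitespace and cut the prompt to at most max_chars characters."""
    return text.strip()[:max_chars]


def build_request(user_text):
    """Build keyword arguments for client.chat.completions.create."""
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": limit_prompt(user_text)}],
        "max_completion_tokens": MAX_OUTPUT_TOKENS,
    }


def validate_output(reply):
    """Accept the reply only if it is one of the expected outputs."""
    return reply.strip().lower() in ALLOWED_REPLIES


request = build_request("x" * 2000)
print(len(request["messages"][0]["content"]))
print(validate_output("Not allowed"))
```

The dictionary returned by `build_request` could be passed to the API as `client.chat.completions.create(**request)`.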

Adding guardrails

user_request = """
In the deck of cards are some Exploding Kittens. You play the game by putting the 
deck face down and taking turns drawing cards until  someone draws an Exploding 
Kitten. When that happens, that person explodes. They are now dead.
This process continues until there’s only 1 player left, who wins the game.
The more cards you draw, the greater your chances of drawing an Exploding Kitten.
"""

messages = [
    {
        "role": "system",
        "content": "Your role is to assess whether the user question is allowed "
                   "or not. The allowed topics are games of chess only. If the "
                   "topic is allowed, reply with an answer as normal, otherwise "
                   "say 'Apologies, but the topic is not allowed.'",
    },
    {"role": "user", "content": user_request},
]

Adding guardrails

response = client.chat.completions.create(
    model="gpt-4o-mini", 
    messages=messages
)

print(response.choices[0].message.content)
Apologies, but the topic is not allowed.
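
Moderation and the topic guardrail can also be chained, so that flagged input never reaches the chat model. The sketch below shows one possible arrangement. `guarded_reply` and the `REFUSAL` constant are hypothetical names introduced for this example, and detecting a blocked topic by searching for the refusal sentence is an assumption that only holds if the model repeats it verbatim.

```python
REFUSAL = "Apologies, but the topic is not allowed."

SYSTEM_PROMPT = (
    "Your role is to assess whether the user question is allowed or not. "
    "The allowed topics are games of chess only. If the topic is allowed, "
    "reply with an answer as normal, otherwise say "
    f"'{REFUSAL}'"
)


def guarded_reply(client, user_request, model="gpt-4o-mini"):
    """Run moderation first, then the topic guardrail.

    Returns a (reply, was_blocked) tuple. `client` is expected to be an
    openai.OpenAI instance.
    """
    moderation = client.moderations.create(input=user_request)
    if moderation.results[0].flagged:
        # Skip the chat call entirely when moderation flags the input.
        return REFUSAL, True

    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_request},
        ],
    )
    reply = response.choices[0].message.content or ""
    # Assumption: a blocked topic shows up as the refusal sentence in the reply.
    return reply, REFUSAL.lower() in reply.lower()
```

Calling `guarded_reply(client, user_request)` with the `client` and `user_request` from the slides returns the model's reply together with a flag that the application can use to decide what to show the user.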

Let's practice!
