Validation

Developing AI Systems with the OpenAI API

Francesca Donadoni

Curriculum Manager, DataCamp

Validation

A developer testing code on multiple screens

Developing AI Systems with the OpenAI API

Validation

 

Potential for model errors:

  • Misinterpreting context
  • Amplifying biases in its outputs if the training data is biased
  • Output of outdated information
  • Being manipulated to generate harmful or unethical content
  • Inadvertently revealing sensitive information
Developing AI Systems with the OpenAI API

Adversarial testing

A diagram with a programmer injecting adversarial input to the data and model, and the model inferring from the data

1 Adapted from https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
Developing AI Systems with the OpenAI API

Adversarial testing

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
{"role": "system",
 "content": "You are an AI assistant for the film industry. You should interpret 
    the user prompt, a movie review, and based on that extract whether its 
    sentiment is positive, negative, or neutral."},

{"role": "user", "content": "It was great to see some of my favorite stars of 30 years ago including John Ritter, Ben Gazarra and Audrey Hepburn. They looked quite wonderful. But that was it. They were not given any characters or good lines to work with. I neither understood or cared what the characters were doing."}])
1 https://huggingface.co/datasets/davanstrien/test1?row=10
Developing AI Systems with the OpenAI API

Adversarial testing

print(response.choices[0].message.content)
The sentiment of this movie review is negative.
Developing AI Systems with the OpenAI API

Adversarial testing

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
{"role": "system",
 "content": "You are an AI assistant for the film industry. You should interpret 
    the user prompt, a movie review, and based on that extract whether its sentiment 
    is positive, negative, or neutral."},

{"role": "user", "content": "If you read the book, your all set. If you didn't...your still all set."}]) print(response.choices[0].message.content)
The sentiment of this movie review is neutral.
Developing AI Systems with the OpenAI API

Evaluation libraries and datasets

A diagram showing an example evaluation library using a variety of datasets to test a model

1 https://github.com/openai/evals
Developing AI Systems with the OpenAI API

Let's practice!

Developing AI Systems with the OpenAI API

Preparing Video For Download...