Evaluating responses

Understanding Prompt Engineering

Alex Banks

Founder & Educator

Introduction to evaluating responses

Every tool has its limitations. ChatGPT has a knowledge cutoff. Solution: smart prompting.

1 Image source: DALL·E 3

The four cornerstones of evaluation

LARF

  • Logical consistency
  • Accuracy
  • Relevance
  • Factual correctness
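As a rough illustration, the four LARF checks could be recorded per response in a small Python structure. The `LarfReview` class and its field names are hypothetical and not part of any library; the True/False verdicts are assumed to come from a human reviewer reading the response.

```python
from dataclasses import dataclass


@dataclass
class LarfReview:
    """A reviewer's verdict on one response (hypothetical structure)."""

    logical_consistency: bool
    accuracy: bool
    relevance: bool
    factual_correctness: bool

    def failed_checks(self) -> list[str]:
        # Names of the criteria the reviewer marked as not met.
        return [name for name, ok in vars(self).items() if not ok]

    def passed(self) -> bool:
        # A response passes only when all four criteria are met.
        return not self.failed_checks()


# Example: a coherent, on-topic response that contains a wrong answer.
review = LarfReview(
    logical_consistency=True,
    accuracy=False,
    relevance=True,
    factual_correctness=True,
)
print(review.failed_checks())
```

A response that fails any one check is a candidate for a follow-up prompt or manual verification.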

Happy, smiling older person

Logical consistency - the coherence check

Person working on solar panel and the prompt "what are the benefits and drawbacks of solar energy"

List of benefits

List of drawbacks

Accuracy and the hallucination tendency

Hallucination -> the model confidently states an incorrect answer.

Incorrect answer to the question "who was the first person to walk on the moon"

Relevance - meeting the context

Relevance -> the response aligns with the context and intent of the prompt.

Prompt: "What are the top tourist attractions in Paris?"

Response to "What are the top tourist attractions in Paris?" with the incorrect answer highlighted

Factual correctness beyond the cutoff date

Are universal basic income trials successful in reducing poverty? Provide your answer by only referencing and citing reliable sources.

ChatGPT knowledge cutoff dates

1 Images source: ChatGPT
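One way to act on a knowledge cutoff is to flag questions about events that fall after it, so the answer gets checked against an external source. The sketch below is a minimal illustration: the function name `needs_external_check` is hypothetical, and the cutoff date is a placeholder, not the actual cutoff of any ChatGPT model.

```python
from datetime import date


def needs_external_check(event_date: date, knowledge_cutoff: date) -> bool:
    """Return True when the event falls after the model's knowledge cutoff."""
    return event_date > knowledge_cutoff


# Placeholder cutoff for illustration only; look up the real date for your model.
example_cutoff = date(2023, 1, 1)

print(needs_external_check(date(2024, 6, 1), example_cutoff))   # True
print(needs_external_check(date(2020, 3, 15), example_cutoff))  # False
```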

Let's practice!
