Building blocks to train LLMs

Concepts of Large Language Models (LLMs)

Vidhi Chugh

AI strategist and ethicist

Where are we?

Image showing learning progress, currently at the pre-training stage

Generative pre-training

 

  • Trained using generative pre-training

    • Input data of text tokens
    • Trained to predict tokens in the dataset

 

  • Types:
    • Next word prediction
    • Masked language modeling

Next word prediction

  • Supervised learning technique
    • Model trained on input-output pairs

 

  • Predicts the next word to generate coherent text
  • Captures the dependencies between words

 

  • Training Data
    • Pairs of input and output examples

Example: auto-suggestion in a search engine

Training data for next word prediction

Original sentence: "The quick brown fox jumps over the lazy dog."

Input                                     Output
The quick brown                           fox
The quick brown fox                       jumps
The quick brown fox jumps                 over
The quick brown fox jumps over            the
The quick brown fox jumps over the        lazy
The quick brown fox jumps over the lazy   dog
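The input-output pairs above can be generated programmatically. A minimal sketch, assuming simple whitespace tokenization (real LLMs use subword tokenizers) and a hypothetical helper `make_pairs`:

```python
# Build next-word-prediction training pairs from a sentence.
# make_pairs is a hypothetical helper; the whitespace split below is a
# simplification of the subword tokenization real LLMs use.
def make_pairs(sentence, min_context=3):
    tokens = sentence.rstrip(".").split()
    # each prefix of the sentence becomes an input; the next token is the output
    return [(" ".join(tokens[:i]), tokens[i])
            for i in range(min_context, len(tokens))]

for context, target in make_pairs("The quick brown fox jumps over the lazy dog."):
    print(f"{context} -> {target}")
```

Each pair teaches the model that, given the context on the left, the token on the right is a likely continuation.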

Which word relates more with pizza?

 

  • More examples = better prediction

 

  • For example:
    • I love to eat pizza with _ _ _ _ _ _

 

  • Cheese is more strongly associated with pizza than any other word

Probabilities of different words' association with the word "pizza"
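These association probabilities can be estimated from co-occurrence counts. A toy sketch using an assumed four-line corpus (not from the slides) to see which word most often follows "pizza with":

```python
from collections import Counter

# Assumed toy corpus for illustration.
corpus = [
    "i love to eat pizza with cheese",
    "we eat pizza with cheese and olives",
    "they eat pizza with mushrooms",
    "eat pizza with cheese please",
]

# Count which word follows the bigram "pizza with".
counts = Counter()
for line in corpus:
    words = line.split()
    for i in range(len(words) - 2):
        if words[i:i + 2] == ["pizza", "with"]:
            counts[words[i + 2]] += 1

# Normalize counts into probabilities.
total = sum(counts.values())
probs = {word: n / total for word, n in counts.items()}
print(probs)  # "cheese" gets the highest probability in this toy corpus
```

Seeing more examples sharpens these estimated probabilities, which is why more training data generally means better predictions.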

Masked language modeling

  • Hides a selected word in the input

  • The trained model predicts the masked word

 

  • Original Text: "The quick brown fox jumps over the lazy dog."

  • Masked Text: "The quick [MASK] fox jumps over the lazy dog."

 

  • Objective: predict the missing word

  • Based on patterns learned from the training data
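A toy sketch of the masked-prediction idea, using trigram counts over a small assumed corpus instead of a real neural masked language model:

```python
from collections import Counter

# Assumed toy training text (not from the slides).
corpus = ("the quick brown fox jumps over the lazy dog "
          "a quick brown fox is a fast animal "
          "the quick red fox jumps high")
tokens = corpus.split()

def fill_mask(left, right):
    # Score each candidate word by how often it appears between the
    # words immediately to the left and right of [MASK].
    scores = Counter()
    for i in range(1, len(tokens) - 1):
        if tokens[i - 1] == left and tokens[i + 1] == right:
            scores[tokens[i]] += 1
    return scores.most_common(1)[0][0] if scores else None

# "The quick [MASK] fox ..." -> the most frequent filler in the corpus
print(fill_mask("quick", "fox"))
```

Unlike next word prediction, the masked word is scored using context on both sides, which is the key difference between the two objectives.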

Let's practice!
