Introducing the transformer

Large Language Models (LLMs) Concepts

Vidhi Chugh

AI strategist and ethicist

Where are we?

Progress chart showing we are at transformer learning

What is a transformer?

  • "Attention Is All You Need"
    • Revolutionized language modeling

 

  • Transformer architecture
    • Relationship between words
    • Components: Pre-processing, Positional Encoding, Encoders, and Decoders

Snippet of the paper "Attention is all you need"

1 arXiv: Attention Is All You Need

Inside the transformer

 

  • Input: Jane, who lives in New York and works as a software

 

Internal components and data flow inside a transformer

 

  • Output: engineer, loves exploring new restaurants in the city.

Transformers are like an orchestra

Image of an orchestra

Text pre-processing and representation

  • Text pre-processing: tokenization, stop word removal, lemmatization
  • Text representation: word embedding
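
The steps listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not how a production pipeline works: the stop-word set, the lemma table, and the random 4-dimensional vectors are hypothetical stand-ins for real linguistic resources and learned word embeddings.

```python
import random
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOP_WORDS = {"the", "in", "and", "as", "a", "who"}
LEMMAS = {"lives": "live", "works": "work", "loves": "love",
          "exploring": "explore", "restaurants": "restaurant"}

def preprocess(text):
    """Tokenize, drop stop words, and lemmatize using the toy tables above."""
    tokens = re.findall(r"[a-z]+", text.lower())         # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [LEMMAS.get(t, t) for t in tokens]            # lemmatization

def build_vocab(tokens):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    for t in tokens:
        vocab.setdefault(t, len(vocab))
    return vocab

tokens = preprocess("Jane, who lives in New York and works as a software engineer")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]

# Text representation: one vector per word. Real embeddings are learned;
# these are random placeholders (an assumption made for this sketch).
rng = random.Random(0)
embeddings = {t: [rng.uniform(-1, 1) for _ in range(4)] for t in vocab}

print(tokens)
print(ids)
```

In a trained model, the embedding vectors are learned so that related words end up close together, which random placeholders cannot show.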

Highlight of the first component of a transformer and some individual notes

Positional encoding

  • Information on the position of each word
  • Helps the model relate words that are far apart
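
One common way to supply position information is the sinusoidal encoding from "Attention Is All You Need". The sketch below assumes that scheme; the function name `positional_encoding` and the small sizes are chosen here for illustration, and other models use learned position vectors instead.

```python
import math

def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position vectors.

    Even dimensions use sine and odd dimensions use cosine, with
    wavelengths that grow geometrically across the dimensions.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

pe = positional_encoding(seq_len=6, d_model=4)
for pos, row in enumerate(pe):
    print(pos, [round(v, 3) for v in row])
```

Each row is added to the embedding of the word at that position, so two occurrences of the same word at different positions get different inputs.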

Highlight of the second component of a transformer and a piece of music

Encoders

  • Attention mechanism: directs attention to specific words and relationships
  • Neural network: processes specific features
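
The attention step can be sketched as scaled dot-product self-attention. This is a simplified illustration under stated assumptions: the same toy vectors serve as queries, keys, and values, whereas a real encoder first applies learned projections and uses several attention heads.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with queries = keys = values = x."""
    d = len(x[0])
    weights = []
    for q in x:
        # Similarity of this word to every word in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    # Each output is a weighted average of all word vectors.
    outputs = [[sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
               for row in weights]
    return outputs, weights

# Toy 2-dimensional vectors for three words (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` says how much one word attends to every other word; the feed-forward neural network then processes each output vector separately.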

Encoder in the transformer flow

Decoders

  • Includes attention and neural networks
  • Generates the output
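
Decoders typically generate the output one token at a time, feeding each new token back in as context. The loop below sketches that idea only: `NEXT_WORD` is a hypothetical lookup table standing in for a trained decoder, which would instead score every word in its vocabulary using attention over the whole context.

```python
# Hypothetical stand-in for a trained decoder: last word -> next word.
NEXT_WORD = {
    "software": "engineer,",
    "engineer,": "loves",
    "loves": "exploring",
    "exploring": "new",
    "new": "restaurants",
    "restaurants": "in",
    "in": "the",
    "the": "city.",
}

def generate(prompt, max_new_tokens=20):
    """Append one predicted word at a time until no continuation is known."""
    tokens = prompt.split()
    generated = []
    for _ in range(max_new_tokens):
        next_word = NEXT_WORD.get(tokens[-1])
        if next_word is None:          # nothing left to predict: stop
            break
        tokens.append(next_word)       # the new word becomes part of the context
        generated.append(next_word)
    return " ".join(generated)

completion = generate("Jane, who lives in New York and works as a software")
print(completion)
```

With this toy table, the prompt from the earlier slide is completed with the same output shown there.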

Decoder component of a transformer

Transformers and long-range dependencies

 

  • Challenge for earlier models: long-range dependencies
  • Attention: lets the model focus on different parts of the input

 

  • Example: "Jane, who lives in New York and works as a software engineer, loves exploring new restaurants in the city."

  • "Jane" --- "loves exploring new restaurants"
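
To make the distance in this example concrete, the short sketch below counts how many words separate "Jane" from "loves". The comparison in the comments is a simplification: a sequential model has to carry information about "Jane" through every intermediate step, while attention compares the two words directly.

```python
sentence = ("Jane, who lives in New York and works as a software engineer, "
            "loves exploring new restaurants in the city.")

# Split on whitespace and strip punctuation (a simplistic tokenizer).
words = [w.strip(",.") for w in sentence.split()]

subject = words.index("Jane")
verb = words.index("loves")
gap = verb - subject

# A sequential model passes information about "Jane" through `gap` steps
# before it reaches "loves"; attention relates the two words in one step.
print(f"'Jane' is at position {subject}, 'loves' at position {verb}: gap of {gap} words")
```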

Processes multiple parts simultaneously

  • Limitation of traditional language models:
    • Sequential - one word at a time

 

  • Transformers:
    • Process multiple parts simultaneously
    • Faster processing

 

  • For example:
    • "The cat sat on the mat"
    • Processes "The," "cat," "sat," "on," "the," and "mat" at the same time
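
The difference can be sketched with two toy functions. Both are illustrative assumptions, not real models: `recurrent_pass` stands in for a sequential model whose every step needs the previous result, and `output_at` stands in for a transformer-style step whose result at each position depends only on the input, so all positions can be handed to a pool of workers at once.

```python
from concurrent.futures import ThreadPoolExecutor

words = "The cat sat on the mat".split()
# Toy numeric feature per word (word length), chosen only for illustration.
features = [float(len(w)) for w in words]

def recurrent_pass(xs):
    """Sequential toy model: step t cannot start before step t-1 finishes."""
    h, states = 0.0, []
    for x in xs:
        h = 0.5 * h + x
        states.append(h)
    return states

def output_at(position):
    """Toy position-wise step: reads the whole input, no other position's output."""
    mean = sum(features) / len(features)
    return features[position] - mean

sequential_outputs = recurrent_pass(features)

# Positions are independent, so they can be computed concurrently.
with ThreadPoolExecutor() as pool:
    parallel_outputs = list(pool.map(output_at, range(len(words))))

print(sequential_outputs)
print(parallel_outputs)
```

In practice the speed-up comes from running large matrix operations on GPUs rather than from threads; the thread pool here only illustrates that no position has to wait for another.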

Let's practice!
