Introducing the transformer

Large Language Models (LLMs) Concepts

Vidhi Chugh

AI strategist and ethicist

Where are we?

Progress chart showing that we have reached the transformer stage

What is a transformer?

  • "Attention Is All You Need"
    • Revolutionized language modeling

 

  • Transformer architecture
    • Models the relationships between words in a sentence
    • Components: Pre-processing, Positional Encoding, Encoders, and Decoders

Snippet of the paper "Attention Is All You Need"

1 arXiv: Attention Is All You Need

Inside the transformer

 

  • Input: Jane, who lives in New York and works as a software

 

Internal components and data flow inside a transformer

 

  • Output: engineer, loves exploring new restaurants in the city.

Transformers are like an orchestra

Image of an orchestra

Text pre-processing and representation

  • Text pre-processing: tokenization, stop word removal, lemmatization
  • Text representation: word embedding

Highlight of the first component of a transformer and some individual notes
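
The pre-processing and representation steps named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop word set, the lemma lookup, and the `embed` function are hypothetical stand-ins, and real models learn their embedding vectors during training rather than computing them from a formula.

```python
import re

# Hypothetical, tiny resources used only for this illustration.
STOP_WORDS = {"the", "on", "a", "an", "in", "and"}
LEMMAS = {"sat": "sit", "loves": "love", "exploring": "explore"}

def preprocess(text):
    """Tokenize, drop stop words, and map each token to a base form."""
    tokens = re.findall(r"[a-z]+", text.lower())          # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop word removal
    return [LEMMAS.get(t, t) for t in tokens]             # lemmatization by lookup

def embed(tokens, dim=4):
    """Map each token to a toy numeric vector (assumption: real embeddings are learned)."""
    vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
    return [[(vocab[t] + 1) * (d + 1) / 10 for d in range(dim)] for t in tokens]

tokens = preprocess("The cat sat on the mat")
vectors = embed(tokens)
print(tokens)   # ['cat', 'sit', 'mat']
```

Each token ends up as a list of numbers, which is the form the later components work with.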

Positional encoding

  • Adds information about the position of each word in the sequence
  • Helps the model relate words that are far apart

Highlight of the second component of a transformer and a piece of music
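
One way to attach position information is the sinusoidal scheme described in "Attention Is All You Need". The sketch below is a minimal version of that scheme; the function name `positional_encoding` and the sizes used are chosen here for illustration, and other models may encode position differently.

```python
import math

def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position vectors."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

pe = positional_encoding(seq_len=6, d_model=4)
print(pe[0])   # position 0 -> [0.0, 1.0, 0.0, 1.0]
```

Every position gets a distinct vector, which is added to the word embedding at that position so that word order is not lost.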

Encoders

  • Attention mechanism: directs attention to specific words and the relationships between them
  • Neural network: processes specific features

Encoder in the transformer flow
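
The attention mechanism can be sketched as scaled dot-product attention, the form used in the original paper. This is a simplified illustration: it assumes queries, keys, and values are all the input vectors themselves, and it leaves out the learned projection matrices and multiple heads that a real encoder would have.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product attention with queries = keys = values = vectors."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this position against every position, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # The output is a weighted mix of all value vectors.
        mixed = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(vectors)
print(weights[0])   # how strongly position 0 attends to each position
```

Each row of `weights` shows how much one word draws on every other word when its new representation is built.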

Decoders

  • Includes attention and neural networks
  • Generates the output

Decoder component of a transformer
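
How a decoder generates output can be pictured as a loop that appends one word at a time. The sketch below is a toy: the `NEXT_WORD` table is a hypothetical stand-in for the decoder's attention and neural network layers, and it looks only at the last word, whereas a real decoder takes all earlier words into account.

```python
# Hypothetical next-word table standing in for a trained decoder.
NEXT_WORD = {
    "software": "engineer",
    "engineer": "loves",
    "loves": "exploring",
    "exploring": "<end>",
}

def generate(prompt_tokens, max_new_tokens=10):
    """Append one predicted word at a time until an end marker or the limit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = NEXT_WORD.get(tokens[-1], "<end>")
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens

generated = generate(["works", "as", "a", "software"])
print(generated)
# ['works', 'as', 'a', 'software', 'engineer', 'loves', 'exploring']
```

Each newly generated word becomes part of the input for the next step, which is why the output grows word by word.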

Transformers and long-range dependencies

 

  • Challenge for earlier models: long-range dependencies
  • Attention: lets the model focus on different parts of the input

 

  • Example: "Jane, who lives in New York and works as a software engineer, loves exploring new restaurants in the city."

  • "Jane" is linked to "loves exploring new restaurants" despite the words in between
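
To make the distance in this example concrete, the short sketch below counts how many words separate "Jane" from "loves". The step counts are a simplification for illustration: a sequential model has to carry information through every word in between, while attention compares the two positions directly.

```python
import re

sentence = ("Jane, who lives in New York and works as a software engineer, "
            "loves exploring new restaurants in the city.")
tokens = re.findall(r"\w+", sentence)

subject = tokens.index("Jane")
verb = tokens.index("loves")

# A sequential model passes information through every intermediate word.
sequential_steps = verb - subject
# With attention, the verb's position scores the subject's position directly.
attention_steps = 1

print(sequential_steps, attention_steps)   # 12 1
```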

Processes multiple parts simultaneously

  • Limitation of traditional language models:
    • Sequential: one word at a time

 

  • Transformers:
    • Process multiple parts simultaneously
    • Faster processing

 

  • For example:
    • "The cat sat on the mat"
    • Processes "The," "cat," "sat," "on," "the," and "mat" at the same time
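
The difference between the two styles can be sketched with toy functions. Both `sequential_pass` and `position_output` are hypothetical and use made-up arithmetic; the point is only that the first needs the previous step's result, while the second can be computed for any position in any order. Plain Python still runs the second one step by step; the speed-up in practice comes from hardware that runs such independent computations together.

```python
tokens = ["The", "cat", "sat", "on", "the", "mat"]

def sequential_pass(tokens):
    """Sequential style: each step needs the state left by the previous step."""
    state, states = 0, []
    for tok in tokens:
        state = state * 31 + len(tok)   # toy update that depends on prior state
        states.append(state)
    return states

def position_output(tokens, i):
    """Transformer style: position i is computed from the whole input at once."""
    return len(tokens[i]) / sum(len(t) for t in tokens)

seq = sequential_pass(tokens)

# Positions are independent, so any order gives the same result.
in_order = [position_output(tokens, i) for i in range(len(tokens))]
any_order = {i: position_output(tokens, i) for i in reversed(range(len(tokens)))}
print(in_order == [any_order[i] for i in range(len(tokens))])   # True
```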

Let's practice!
