Understanding the transformer

Introduction to LLMs in Python

Jasmin Ludolf

Senior Data Science Content Developer, DataCamp

What is a transformer?

  • Deep learning architectures
  • Processing, understanding, and generating text
  • Used in most LLMs
  • Handle long text sequences in parallel

Illustration of three transformer architectures: encoder-only, decoder-only, and encoder-decoder

Transformer architectures

Illustration of three transformer architectures: encoder-only, decoder-only, and encoder-decoder

  • Find the architecture details in the Hugging Face model card

Encoder-only

Encoder-only illustration

  • Understanding the input text
  • No sequential output
  • Common tasks:
    • Text classification
    • Sentiment analysis
    • Extractive question-answering (extract or label)
  • BERT models
  • Example: "distilbert-base-uncased-distilled-squad"

Encoder-only

Encoder-only illustration

from transformers import pipeline

llm = pipeline(model="bert-base-uncased")
print(llm.model)
BertForMaskedLM(
  (bert): ...
    )
    (encoder): BertEncoder(
      ...
print(llm.model.config)
BertConfig {
...
  "architectures": [
    "BertForMaskedLM"
...

Encoder-only

Encoder-only illustration

print(llm.model.config.is_decoder)
False
  • Alternatively: llm.model.config.is_encoder_decoder

Decoder-only

Decoder-only illustration

  • Focus shifts to output
  • Common tasks:
    • Text generation
    • Generative question-answering (sentence(s) or paragraph(s))
  • GPT models
  • Example: "gpt-3.5-turbo"
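
To make "focus shifts to output" concrete, the sketch below mimics how a decoder produces text one token at a time. It is a toy illustration only: the `NEXT_TOKEN` table and the `generate` helper are hypothetical stand-ins, and a real decoder conditions on the whole preceding sequence rather than just the last token.

```python
# Toy next-token table standing in for a trained decoder (hypothetical data).
NEXT_TOKEN = {
    "<start>": "transformers",
    "transformers": "generate",
    "generate": "text",
    "text": "<end>",
}


def generate(prompt_token, max_new_tokens=10):
    """Greedy generation: pick the next token, append it, repeat."""
    output = []
    current = prompt_token
    for _ in range(max_new_tokens):
        # Unknown tokens end the sequence instead of raising an error.
        next_token = NEXT_TOKEN.get(current, "<end>")
        if next_token == "<end>":
            break
        output.append(next_token)
        current = next_token  # the new token becomes the next step's input
    return output


print(generate("<start>"))
```

Each loop iteration depends on the previous one, which is why generation is sequential even though the model processes its input in parallel.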

Decoder-only

Decoder-only illustration

llm = pipeline(model="gpt2")
print(llm.model.config)
GPT2Config {
...
  "architectures": [
    "GPT2LMHeadModel"
  ],
...
  "task_specific_params": {
    "text-generation": {
...
print(llm.model.config.is_decoder)
False
  • is_decoder is False here because the GPT-2 configuration leaves the flag at its default; the "architectures" entry (GPT2LMHeadModel) is what identifies the model as decoder-only

Encoder-decoder

Encoder-decoder illustration

  • Understand and process the input and output
  • Common tasks:
    • Translation
    • Summarization
  • T5, BART models
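
The task lists for the three architectures can be collected into a small lookup table. This is a minimal sketch: the keys are written as Hugging Face pipeline task names, while `TASK_TO_FAMILY` and `suggest_family` are hypothetical names, and the mapping is a rule of thumb rather than a strict constraint.

```python
# Rule-of-thumb mapping from task to architecture family, taken from the
# task lists on the slides. The helper names are illustrative only.
TASK_TO_FAMILY = {
    "text-classification": "encoder-only",
    "sentiment-analysis": "encoder-only",
    "question-answering": "encoder-only",  # extractive question-answering
    "text-generation": "decoder-only",
    "translation": "encoder-decoder",
    "summarization": "encoder-decoder",
}


def suggest_family(task):
    """Return the architecture family usually associated with a task."""
    if task not in TASK_TO_FAMILY:
        raise ValueError(f"Unknown task: {task!r}")
    return TASK_TO_FAMILY[task]


print(suggest_family("summarization"))
print(suggest_family("text-generation"))
```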

Encoder-decoder

Encoder-decoder illustration

llm = pipeline(model="Helsinki-NLP/opus-mt-es-en")
print(llm.model)
MarianMTModel(
...
    (encoder): MarianEncoder(
...
    (decoder): MarianDecoder(
...

Encoder-decoder

Encoder-decoder illustration

print(llm.model.config)
MarianConfig {
...
  "decoder_attention_heads": 8,
...
  "encoder_attention_heads": 8,
...
  "is_encoder_decoder": true,
...
print(llm.model.config.is_encoder_decoder)
True
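
The checks from the three architecture slides can be combined into one helper. The sketch below is a heuristic, not an official API: `architecture_family` is a hypothetical function, the `SimpleNamespace` objects are stand-ins that copy only the values printed on the slides, and the rule that a causal language-model head signals a decoder-only model is an assumption. In practice `llm.model.config` could be passed in place of the stand-ins.

```python
from types import SimpleNamespace


def architecture_family(config):
    """Guess the transformer family from config attributes (heuristic)."""
    if getattr(config, "is_encoder_decoder", False):
        return "encoder-decoder"
    names = getattr(config, "architectures", None) or []
    # Assumption: class names ending in these suffixes are causal LMs.
    causal_head = any(
        name.endswith(("LMHeadModel", "ForCausalLM")) for name in names
    )
    if getattr(config, "is_decoder", False) or causal_head:
        return "decoder-only"
    return "encoder-only"


# Stand-ins carrying only the values shown on the slides.
bert_like = SimpleNamespace(is_decoder=False, architectures=["BertForMaskedLM"])
gpt2_like = SimpleNamespace(is_decoder=False, architectures=["GPT2LMHeadModel"])
marian_like = SimpleNamespace(is_encoder_decoder=True)

for name, cfg in [("bert", bert_like), ("gpt2", gpt2_like), ("marian", marian_like)]:
    print(name, "->", architecture_family(cfg))
```

Checking the architecture name as well as the flags matters because is_decoder is False for the GPT-2 pipeline shown earlier.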

Let's practice!