Fine-tuning

Large Language Models (LLMs) Concepts

Vidhi Chugh

AI strategist and ethicist

Where are we?

Progress chart showing we have reached the fine-tuning stage


 

  • Pre-training

An image representing School children as pretraining analogy

          School education

 

  • Fine-tuning

An image representing University students as fine-tuning analogy

        University specialization
Image credit: Freepik

"Largeness" challenges

  • Building an LLM requires:
    • Powerful computers
    • Efficient model training methods
    • Large amounts of training data
  • Fine-tuning can help with these challenges

An image showing availability of data, training time, and compute power as the challenges of building LLMs


Computing power

  • Memory

  • Processing power

  • Infrastructure

  • Expensive

  • Training an LLM:
    • Hundreds of thousands of central processing units (CPUs)
    • Tens of thousands of graphics processing units (GPUs)
  • A personal computer: 4-8 CPU cores and 1-2 GPUs

Man working on a computer plugged into large server

Image credit: Freepik

Efficient model training

Illustration symbolizing a deep learning model

  • Training time is huge

 

  • May take weeks or even months

 

  • Efficient model training = shorter training time

 

  • Example: an estimated 355 years of processing time on a single GPU
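
To put the single-GPU figure in perspective, the sketch below spreads 355 GPU-years of work across more GPUs. It assumes perfectly linear scaling, which real clusters do not achieve because of communication overhead and hardware failures. The function name `wall_clock_days` is hypothetical and chosen only for this illustration.

```python
# Rough estimate only. Assumption: work divides evenly across GPUs
# (perfectly linear scaling), which real training clusters do not reach.
SINGLE_GPU_YEARS = 355  # figure quoted on the slide
DAYS_PER_YEAR = 365


def wall_clock_days(num_gpus: int, gpu_years: float = SINGLE_GPU_YEARS) -> float:
    """Return the idealized training time in days when using num_gpus GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return gpu_years * DAYS_PER_YEAR / num_gpus


for n in (1, 1_000, 10_000):
    print(f"{n:>6,} GPUs -> {wall_clock_days(n):,.1f} days")
```

Under this idealized assumption, 1,000 GPUs would need about 130 days and 10,000 GPUs about 13 days, which is consistent with training taking weeks or even months.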

Data availability

 

  • Need for high-quality data
    • To learn the complexities and subtleties of language
  • Massive amount of data
    • A few hundred gigabytes (GB) of text data
    • More than a million books

  Two stacks of overflowing folders to symbolize large data
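
The link between gigabytes of text and a number of books can be checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a measurement: the 500 GB corpus size stands in for "a few hundred gigabytes", and the 0.5 MB average size of a plain-text book is an assumption. The function name `estimate_books` is hypothetical.

```python
MB_PER_GB = 1000  # decimal units


def estimate_books(dataset_gb: float, avg_book_mb: float) -> float:
    """Estimate how many books of avg_book_mb megabytes fit in dataset_gb gigabytes."""
    return dataset_gb * MB_PER_GB / avg_book_mb


# Both inputs are assumptions made for illustration only.
books = estimate_books(dataset_gb=500, avg_book_mb=0.5)
print(f"Roughly {books:,.0f} books")
```

With these assumed inputs the estimate lands at about a million books. A larger corpus or shorter books would push the count higher.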


Overcoming the challenges

  • Fine-tuning
    • Addresses some of these challenges
    • Adapts a pre-trained model

 

  • Pre-trained model
    • Learned from general-purpose datasets
    • Not optimized for specific tasks
    • Can be fine-tuned for a specific problem

People working on an oversized laptop with tools and gears to symbolize fine tuning
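
The idea of adapting a pre-trained model can be illustrated with a deliberately tiny example. The sketch below is not an LLM: it is a one-parameter linear model trained with plain gradient descent, and all names and numbers (`train`, `mse`, the datasets, the learning rate) are assumptions made for illustration. It "pre-trains" on a large general dataset, then compares fine-tuning from the pre-trained weight against training from scratch on a small task-specific dataset with the same few steps.

```python
def train(w, data, lr=0.5, steps=10):
    """Gradient descent on mean squared error for the toy model y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def mse(w, data):
    """Mean squared error of the toy model y = w * x on data."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


# "Pre-training": many examples of a general relationship (y = 2x), many steps.
general_data = [(i / 50, 2.0 * i / 50) for i in range(-50, 51)]
w_pretrained = train(0.0, general_data, steps=200)

# "Fine-tuning": a handful of examples of a related, specific task (y = 2.5x).
task_data = [(-1.0, -2.5), (-0.5, -1.25), (0.5, 1.25), (1.0, 2.5)]
w_finetuned = train(w_pretrained, task_data, steps=3)

# Baseline: same small dataset and same few steps, but starting from scratch.
w_scratch = train(0.0, task_data, steps=3)

print(f"fine-tuned loss:   {mse(w_finetuned, task_data):.5f}")
print(f"from-scratch loss: {mse(w_scratch, task_data):.5f}")
```

Because the pre-trained weight already sits close to the task-specific solution, the fine-tuned model reaches a lower loss than the from-scratch model for the same small amount of data and training.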


Fine-tuning vs. Pre-training

  • Fine-tuning
    • Compute: 1-2 CPUs and GPUs
    • Training time: hours to days
    • Data: ~1 gigabyte

  • Pre-training
    • Compute: thousands of CPUs and GPUs
    • Training time: weeks to months
    • Data: hundreds of gigabytes

Let's practice!
