Fine-tuning

Large Language Models (LLMs) Concepts

Vidhi Chugh

AI strategist and ethicist

Where are we?

Progress chart showing we have reached the fine-tuning stage

Pre-training

An image representing school children as a pre-training analogy

School education

Fine-tuning

An image representing university students as a fine-tuning analogy

University specialization

¹ Freepik

"Largeness" challenges

Fine-tuning can help with the challenges of building LLMs, which need:
- Powerful computers
- Efficient model training methods
- Large amounts of training data

An image showing availability of data, training time, and compute power as the challenges of building LLMs

Computing power

Memory
Processing power
Infrastructure
Expensive

LLM:
- 100,000s of central processing units (CPUs)
- 10,000s of graphics processing units (GPUs)

A personal computer: 4-8 CPUs and 1-2 GPUs

Man working on a computer plugged into a large server

¹ Freepik

Efficient model training

Illustration symbolizing a deep learning model

Training time is huge

May take weeks or even months

Efficient model training = faster training time

Training a large model can take an estimated 355 years of processing time on a single GPU
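
To connect the 355-year figure with the "weeks or even months" range, a rough back-of-the-envelope calculation helps. The sketch below assumes perfect linear scaling across GPUs, which real clusters do not achieve because of communication overhead. The cluster sizes and the function name `wall_clock_days` are illustrative assumptions, not taken from any specific training run.

```python
# Back-of-the-envelope: wall-clock time if single-GPU work is split evenly.
# Assumption: perfect linear scaling, which real clusters do not achieve.

SINGLE_GPU_YEARS = 355  # figure quoted on the slide
DAYS_PER_YEAR = 365


def wall_clock_days(num_gpus: int, single_gpu_years: float = SINGLE_GPU_YEARS) -> float:
    """Return idealized training time in days when work is split across num_gpus."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return single_gpu_years * DAYS_PER_YEAR / num_gpus


if __name__ == "__main__":
    for gpus in (1, 1_000, 10_000):  # illustrative cluster sizes
        print(f"{gpus:>6} GPUs -> about {wall_clock_days(gpus):,.0f} days")
```

Under this idealized assumption, 1,000 GPUs bring the job down to roughly four months and 10,000 GPUs to about two weeks.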

Data availability

Need for high-quality data

To learn the complexities and subtleties of language

A few hundred gigabytes (GB) of text data
- Equivalent to more than a million books
Massive amount of data
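
The "more than a million books" comparison can be checked with simple arithmetic. The sketch below takes 500 GB as a stand-in for "a few hundred gigabytes" and assumes an average book holds about 400 KB of plain text. Both numbers, and the function name `books_equivalent`, are illustrative assumptions rather than figures from the slide.

```python
# Rough check of "a few hundred GB of text is more than a million books".
# Both inputs are illustrative assumptions, not measured values.

BYTES_PER_KB = 1_000
BYTES_PER_GB = 1_000_000_000


def books_equivalent(dataset_gb: float, avg_book_kb: float) -> float:
    """Return how many average-sized plain-text books fit in the dataset."""
    if avg_book_kb <= 0:
        raise ValueError("avg_book_kb must be positive")
    return dataset_gb * BYTES_PER_GB / (avg_book_kb * BYTES_PER_KB)


if __name__ == "__main__":
    # Assumed: 500 GB of text, about 400 KB of plain text per book.
    print(f"{books_equivalent(500, 400):,.0f} books")
```

Changing either assumption shifts the result, but for plausible book sizes the count stays in the range of a million books.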

Two stacks of overflowing folders to symbolize large data

Overcoming the challenges

Fine-tuning
- Addresses some of these challenges
- Adapts a pre-trained model

Pre-trained model
- Learned from general-purpose datasets
- Not optimized for specific tasks
- Can be fine-tuned for a specific problem
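
The idea of adapting a pre-trained model can be illustrated with a deliberately tiny example. The sketch below is not an LLM: it fits a one-parameter model `y = w * x` with plain gradient descent, first on a larger "general" dataset and then, starting from the learned weight, on a handful of task-specific points. The datasets, learning rate, step counts, and function names are all hypothetical choices made for illustration.

```python
# Toy illustration of pre-training followed by fine-tuning (not an LLM).
# The model is y = w * x, trained by full-batch gradient descent.


def train(data, w=0.0, lr=0.05, steps=50):
    """Minimize mean squared error starting from weight w; return the new weight."""
    n = len(data)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / n
        w -= lr * grad
    return w


def mse(data, w):
    """Mean squared error of the model y = w * x on data."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


# "General-purpose" data (assumed): 101 points following y = 2x.
general = [(k / 10, 2.0 * k / 10) for k in range(-50, 51)]
# "Task-specific" data (assumed): only three points following y = 2.5x.
task = [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5)]

# Pre-training: many steps on the large dataset, starting from scratch.
w_pre = train(general, steps=50)

# Fine-tuning: a few steps on the small dataset, starting from w_pre.
w_ft = train(task, w=w_pre, steps=5)

# Baseline: the same small budget on the small dataset, but from scratch.
w_scratch = train(task, steps=5)

print(f"pre-trained w = {w_pre:.3f}")
print(f"fine-tuned  w = {w_ft:.3f}, task MSE = {mse(task, w_ft):.4f}")
print(f"scratch     w = {w_scratch:.3f}, task MSE = {mse(task, w_scratch):.4f}")
```

With the same small training budget, the fine-tuned weight ends up closer to the task's target than the weight trained from scratch, simply because it starts closer. That is the intuition behind fine-tuning needing less compute, time, and data.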

People working on an oversized laptop with tools and gears to symbolize fine-tuning

Fine-tuning vs. Pre-training

Fine-tuning
Compute
- 1-2 CPUs and GPUs

Training time
- Hours to days

Data
- ~1 gigabyte

Pre-training
Compute
- Thousands of CPUs and GPUs

Training time
- Weeks to months

Data
- Hundreds of gigabytes

Let's practice!
