Model design and data collection

Generative AI Concepts

Daniel Tedesco

Data Lead, Google

Know how to fill the tank

A fuel gauge pointing to empty

A factory line worker installing the hood of a vehicle

1 GM Fairfax Assembly Plant

Developing a model

Model Development Steps

  1. Research and design
  2. Training data collection
  3. Model training
  4. Model evaluation
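
The four steps above can be sketched as a toy pipeline. This is only an illustration: the function names, the "constant-mean" model, and the hard-coded numbers are all hypothetical stand-ins, not how a generative model is actually built.

```python
from statistics import mean

def research_and_design():
    # Step 1: fix the purpose and a (toy) architecture. Hypothetical values.
    return {"purpose": "predict a number", "architecture": "constant-mean"}

def collect_training_data():
    # Step 2: hard-coded numbers standing in for a real dataset.
    # Returns (training data, held-out data).
    return [2.0, 4.0, 6.0], [3.0, 5.0]

def train(design, train_data):
    # Step 3: "training" here just stores the mean of the training data.
    return {"design": design, "mean": mean(train_data)}

def evaluate(model, held_out):
    # Step 4: mean absolute error of the model's prediction on held-out data.
    return mean(abs(x - model["mean"]) for x in held_out)

design = research_and_design()
train_data, held_out = collect_training_data()
model = train(design, train_data)
error = evaluate(model, held_out)
print(f"learned mean = {model['mean']}, held-out error = {error}")
```

Each step consumes the output of the one before it, which is why a weak design or poor data cannot be fixed later at evaluation time.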

Stable Diffusion's research and development

Example output from Stable Diffusion

User generated image from Stable Diffusion Beta

Stable Diffusion's R&D

  • Purpose: Decide on image generation
  • Architecture: Settle on diffusion model
  • Resources: 256 GPUs, 150k GPU-hours in total, roughly $600k
1 Stability AI, Emad Mostaque Twitter post
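
To put the resource figures in perspective, here is a back-of-the-envelope calculation. It assumes the 150k hours are total GPU-hours spread evenly across all 256 GPUs, and that the $600k covers exactly those hours; both are simplifying assumptions, and the variable names are illustrative.

```python
gpus = 256            # number of GPUs, from the slide
gpu_hours = 150_000   # assumed to be total GPU-hours across all GPUs
cost_usd = 600_000    # approximate cost, from the slide

# If every GPU ran in parallel for the whole job:
wall_clock_hours = gpu_hours / gpus
wall_clock_days = wall_clock_hours / 24

# Implied price of one GPU for one hour:
cost_per_gpu_hour = cost_usd / gpu_hours

print(f"about {wall_clock_days:.1f} days of wall-clock training")
print(f"about ${cost_per_gpu_hour:.2f} per GPU-hour")
```

Under these assumptions the run takes a little over three weeks of continuous training at about $4 per GPU-hour.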

Data collection: not your typical ML model

Training data preparation

  • Massive amounts required
  • Diverse, context-rich data
  • Requires preprocessing
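
As a rough idea of what preprocessing can involve, the sketch below cleans a small list of image-caption records. The record fields (`url`, `caption`, `width`, `height`), the `min_side` threshold, and the three filtering rules are hypothetical choices for illustration, not the pipeline used for any particular dataset.

```python
def preprocess(records, min_side=64):
    """Drop empty captions, tiny images, and duplicate URLs."""
    seen_urls = set()
    cleaned = []
    for rec in records:
        caption = rec["caption"].strip()
        if not caption:
            continue  # no text to pair with the image
        if min(rec["width"], rec["height"]) < min_side:
            continue  # image too small to be useful
        if rec["url"] in seen_urls:
            continue  # already kept a record with this URL
        seen_urls.add(rec["url"])
        cleaned.append({**rec, "caption": caption})
    return cleaned

# Hypothetical raw records, as they might come out of a web scrape.
raw = [
    {"url": "a.jpg", "caption": " A tabby cat ", "width": 512, "height": 512},
    {"url": "a.jpg", "caption": "A tabby cat", "width": 512, "height": 512},
    {"url": "b.jpg", "caption": "", "width": 512, "height": 512},
    {"url": "c.jpg", "caption": "Tiny thumbnail", "width": 32, "height": 32},
    {"url": "d.jpg", "caption": "A black cat on a sofa", "width": 640, "height": 480},
]

cleaned = preprocess(raw)
print([rec["url"] for rec in cleaned])
```

At real scale the same kinds of rules run over billions of records, which is why preprocessing is a major part of data collection.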

Series of cat images from a large image dataset.

1 Laion blog

Data collection: privacy and security are critical

 

Training data preparation

Personally Identifiable Information (PII)

  • Anonymize or aggregate
  • Store in secure location with controlled access
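
One simple way to approach the first bullet for text data is sketched below: email addresses are redacted and user IDs are replaced with salted hashes. The regular expression, the salt value, and the record layout are illustrative assumptions. Strictly speaking, hashing an ID is pseudonymization rather than full anonymization, so real projects usually need more than this.

```python
import hashlib
import re

# Simplified email pattern; it will not catch every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def redact_emails(text):
    # Replace anything that looks like an email address with a placeholder.
    return EMAIL_RE.sub("[EMAIL]", text)

def pseudonymize(user_id, salt):
    # Replace an identifier with a truncated salted SHA-256 digest.
    digest = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
    return digest[:12]

# Hypothetical record containing PII.
record = {
    "user_id": "alice42",
    "text": "Contact me at alice@example.com for the photos.",
}

safe = {
    "user_id": pseudonymize(record["user_id"], salt="hypothetical-secret-salt"),
    "text": redact_emails(record["text"]),
}
print(safe["text"])
```

The salt itself is sensitive and belongs in the same kind of access-controlled storage that the second bullet describes.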

 

Image with faces blurred


Let's practice!
