DVC features and use cases

Introduction to Data Versioning with DVC

Ravi Bhadauria

Machine Learning Engineer

Covered topics

  • Versioning data and models
  • DVC Pipelines
  • Metrics and plots tracking

Advanced topics (not covered)

  • Experiment tracking
  • CI/CD for machine learning
  • Data registry

Versioning data and models

Schematic of versioning data with model by tying to specific Git commits

1 https://dvc.org/doc/use-cases/versioning-data-and-models

Pipelines

Schematic of pipeline of model training step

  • Define pipeline in dvc.yaml
stages:
  train:
    cmd: python train.py
    deps:
      - code/train.py
      - data/input_data.csv
      - params/params.json
    outs:
      - model_output/model.pkl
  • Run with dvc repro.
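
DVC records hashes of each stage's dependencies in dvc.lock, and dvc repro reruns a stage only when one of them has changed. The sketch below is a simplified illustration of that idea, not DVC's own code; the helper names hash_deps and needs_rerun and the file contents are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_deps(deps):
    """Map each dependency path to the MD5 hex digest of its contents."""
    return {str(p): hashlib.md5(Path(p).read_bytes()).hexdigest() for p in deps}


def needs_rerun(deps, recorded):
    """A stage counts as stale when any dependency hash differs from the recorded one."""
    return hash_deps(deps) != recorded


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for the three deps listed under the train stage.
    deps = [Path(tmp) / name for name in ("train.py", "input_data.csv", "params.json")]
    for dep in deps:
        dep.write_text(f"contents of {dep.name}\n")

    # What a lock file would remember after a successful run.
    recorded = hash_deps(deps)
    unchanged = needs_rerun(deps, recorded)

    # Edit the params file, then check again.
    deps[2].write_text('{"learning_rate": 0.1}\n')
    changed = needs_rerun(deps, recorded)

print("stale before edit:", unchanged)
print("stale after edit:", changed)
```

Comparing content hashes rather than timestamps means that touching a file without changing it does not trigger a rerun.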

Tracking metrics and plots

$ dvc metrics diff
Path                  Metric    HEAD     workspace    Change
dvclive/metrics.json  AUC       0.78912  0.18114      -0.60798
dvclive/metrics.json  TP        215      768          553
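
The Change column is the workspace value minus the HEAD value. A minimal sketch of that arithmetic, using the numbers from the table above; metrics_diff is a hypothetical helper, not part of DVC.

```python
def metrics_diff(old, new):
    """Return (name, old, new, change) rows for metrics present in both dicts."""
    rows = []
    for name in sorted(old.keys() & new.keys()):
        # Round to 5 decimals to match the precision shown in the table.
        change = round(new[name] - old[name], 5)
        rows.append((name, old[name], new[name], change))
    return rows


# Values copied from the dvc metrics diff output above.
head = {"AUC": 0.78912, "TP": 215}
workspace = {"AUC": 0.18114, "TP": 768}

rows = metrics_diff(head, workspace)
for name, before, after, change in rows:
    print(f"{name:<8}{before:<10}{after:<12}{change}")
```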

Comparing AUC metrics across ML experiments using DVC plots

1 https://dvc.org/doc/command-reference/plots/diff

Experiment tracking

  • Run experiment and log metrics
    • dvc repro
    • dvc exp save
  • Alternatively, combine the two steps with dvc exp run
  • Experiments are custom Git references
    • Avoids bloating Git history with a commit per experiment
    • Explicit saves can be made with dvc exp save
  • Visualize using dvc exp show
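
For an experiment to show metrics, the training script has to write them to a file that DVC tracks. A minimal sketch of that logging step using only the standard library; log_metrics is a hypothetical helper (not the DVCLive API), and the values are the HEAD figures from the metrics table rather than freshly computed results.

```python
import json
import tempfile
from pathlib import Path


def log_metrics(metrics, path):
    """Write a flat dict of metric values to a JSON file, creating parent folders."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metrics, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp:
    # A real training script would compute these from model predictions.
    out = log_metrics({"AUC": 0.78912, "TP": 215}, Path(tmp) / "dvclive" / "metrics.json")
    logged = json.loads(out.read_text())

print(logged)
```

In a real project the file would live in the repository (for example at dvclive/metrics.json) instead of a temporary folder, so that each saved experiment captures its own copy.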

CI/CD for Machine Learning

Concept image of DVC and CML used for CI/CD

1 Picture credits: https://dvc.org/doc/use-cases/ci-cd-for-machine-learning

Data registry

Image of DVC used as a data registry

1 Picture credits: https://dvc.org/doc/use-cases/data-registry

Let's practice!
