CI/CD for Machine Learning
Ravi Bhadauria
Machine Learning Engineer

A sequence of stages that defines the ML workflow and its dependencies
Defined in the dvc.yaml file
Each stage specifies its inputs (deps), command (cmd), outputs (outs), and optionally metrics and plots
Similar to a GitHub Actions workflow
dvc stage add
dvc stage add \
-n preprocess \
-d raw_data.csv -d preprocess.py \
-o processed_data.csv \
python preprocess.py
dvc.yaml
stages:
  preprocess:
    cmd: python preprocess.py
    deps:
    - preprocess.py
    - raw_data.csv
    outs:
    - processed_data.csv
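The mapping from the `dvc stage add` flags to the dvc.yaml keys can be sketched in Python. This is only an illustration: `build_stage` is a hypothetical helper, not part of DVC, and the alphabetical sorting simply mirrors the order seen in the dvc.yaml above.

```python
def build_stage(name, cmd, deps, outs):
    """Return a dvc.yaml-style entry for one stage (illustrative only)."""
    # -n gives the stage name, -d the deps, -o the outs; the trailing
    # command becomes cmd. Sorting imitates the order shown in dvc.yaml;
    # that DVC always sorts is an assumption.
    return {name: {"cmd": cmd, "deps": sorted(deps), "outs": sorted(outs)}}


stage = build_stage(
    name="preprocess",
    cmd="python preprocess.py",
    deps=["raw_data.csv", "preprocess.py"],
    outs=["processed_data.csv"],
)
print(stage)
```

The printed dictionary has the same shape as the `preprocess` entry under `stages:`.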
dvc stage add \
-n train \
-d train.py -d processed_data.csv \
-o plots.png -o metrics.txt \
python train.py
stages:
  preprocess:
    cmd: python preprocess.py
    deps:
    - preprocess.py
    - raw_data.csv
    outs:
    - processed_data.csv
  train:
    cmd: python train.py
    deps:
    - processed_data.csv
    - train.py
    outs:
    - metrics.txt
    - plots.png
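The two stages are linked because `train` lists `processed_data.csv`, an output of `preprocess`, as a dependency. A minimal sketch of how an execution order could be derived from that information, using the standard-library `graphlib` module (Python 3.9+); `stage_graph` is a hypothetical helper and this is not DVC's actual implementation:

```python
from graphlib import TopologicalSorter

# Stage definitions taken from the two `dvc stage add` commands.
stages = {
    "preprocess": {
        "deps": ["preprocess.py", "raw_data.csv"],
        "outs": ["processed_data.csv"],
    },
    "train": {
        "deps": ["processed_data.csv", "train.py"],
        "outs": ["plots.png", "metrics.txt"],
    },
}


def stage_graph(stages):
    """Map each stage to the set of stages that produce its deps."""
    producer = {out: name for name, s in stages.items() for out in s["outs"]}
    return {
        name: {producer[d] for d in s["deps"] if d in producer}
        for name, s in stages.items()
    }


graph = stage_graph(stages)
order = list(TopologicalSorter(graph).static_order())
print(order)  # preprocess comes before train
```

Dependencies that no stage produces, such as `raw_data.csv` or the scripts, are plain input files and add no edge to the graph.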
dvc repro
-> dvc repro
Running stage 'preprocess':
> python preprocess.py
Running stage 'train':
> python train.py
Updating lock file 'dvc.lock'
dvc.lock is created; like a .dvc file, it stores MD5 hashes
git add dvc.lock && git commit -m "first pipeline repro"
-> dvc repro
Stage 'preprocess' didn't change, skipping
Running stage 'train' with command: ...
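Why a stage is skipped can be illustrated with a simplified model: compare the current MD5 hash of each dependency with the hash recorded at the last run. The sketch below assumes a plain dictionary as the "lock" record and hypothetical helpers `md5_of` and `stage_changed`; the real dvc.lock format and DVC's checks (which also cover the command and the outputs) are more involved.

```python
import hashlib
import os
import tempfile


def md5_of(path):
    """Return the MD5 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def stage_changed(deps, lock):
    """True if any dependency's current hash differs from the recorded one."""
    return any(lock.get(dep) != md5_of(dep) for dep in deps)


with tempfile.TemporaryDirectory() as tmp:
    dep = os.path.join(tmp, "raw_data.csv")
    with open(dep, "w") as f:
        f.write("a,b\n1,2\n")

    lock = {dep: md5_of(dep)}                 # hashes recorded after a run
    unchanged = stage_changed([dep], lock)    # nothing edited: can be skipped

    with open(dep, "w") as f:
        f.write("a,b\n1,3\n")                 # edit the dependency
    changed = stage_changed([dep], lock)      # hash differs: must rerun

print(unchanged, changed)
```

In this model a stage reruns only when one of its recorded hashes no longer matches, which is the behaviour the `dvc repro` output above reports for `preprocess` and `train`.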
-> dvc dag
+------------+
| preprocess |
+------------+
       *
       *
       *
  +-------+
  | train |
  +-------+
dvc.yaml and dvc.lock
dvc stage add
dvc repro
dvc dag