The metadata store

Fully Automated MLOps

Arturo Opsetmoen Amador

Senior Consultant - Machine Learning

What is metadata in MLOps?

  • Metadata is the information about the artifacts created during the execution of the different components of an ML pipeline

A machine learning pipeline is presented. The pipeline has the following steps: data extraction, data validation, model training, model evaluation, and model validation.

Metadata examples:

  • Data versioning: Different versions of the same data are kept
  • Metadata about training artifacts such as hyperparameters
  • Pipeline execution logs
1 https://datacentricai.org/
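
The three examples above can be captured in a single record per pipeline run. This is a minimal sketch, not a real metadata library: the names `RunMetadata` and `version_of`, the toy datasets, and the hyperparameter values are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


def version_of(raw: bytes) -> str:
    """Derive a short version id from the data content (hypothetical scheme)."""
    return hashlib.sha256(raw).hexdigest()[:12]


@dataclass
class RunMetadata:
    """Metadata describing one execution of a pipeline step."""
    step: str
    data_version: str        # which version of the data was used
    hyperparameters: dict    # training artifact metadata
    logs: list = field(default_factory=list)  # pipeline execution logs


# Two toy versions of the same dataset (invented for illustration).
dataset_v1 = b"age,income\n34,52000\n"
dataset_v2 = b"age,income\n34,52000\n41,61000\n"

run = RunMetadata(
    step="model_training",
    data_version=version_of(dataset_v2),
    hyperparameters={"learning_rate": 0.01, "max_depth": 4},
)
run.logs.append("model_training started")
run.logs.append("model_training finished")

print(json.dumps(asdict(run), indent=2))
```

Hashing the data content means that any change to the data yields a different version id, so each run can be tied to the exact data it saw.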

Important aspects of metadata in ML


  • Data lineage

  • Reproducibility

  • Monitoring

  • Regulatory compliance


The importance of metadata - Data lineage

  • Data lineage metadata tracks the information about data:

    • from its creation point
    • to the points of consumption

An illustration of the data lineage concept. It shows how the data can be transformed by several processes between its origin and its consumption.
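
Data lineage can be sketched as a mapping from each artifact to the process that produced it and the inputs it was built from. The artifact names (`raw_data`, `validated_data`, `model`) and the helper `trace_to_origin` are hypothetical; a real metadata store would persist these links rather than keep them in a dictionary.

```python
# Each artifact records the process that produced it and its input artifacts.
lineage = {
    "raw_data": {"produced_by": "data_extraction", "inputs": []},
    "validated_data": {"produced_by": "data_validation", "inputs": ["raw_data"]},
    "model": {"produced_by": "model_training", "inputs": ["validated_data"]},
}


def trace_to_origin(artifact, lineage):
    """Walk from an artifact back to its creation point, depth-first."""
    path = [artifact]
    for parent in lineage[artifact]["inputs"]:
        path.extend(trace_to_origin(parent, lineage))
    return path


# From the point of consumption (the model) back to the creation point.
print(trace_to_origin("model", lineage))
# ['model', 'validated_data', 'raw_data']
```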


The importance of metadata - Reproducibility

  • The metadata about our machine learning experiments:
    • Allows others to reproduce our results
    • Increases trust in our ML systems
    • Introduces scientific rigor to our ML process

A figure to illustrate the concept of reproducibility. It shows a series of scientists looking at each other working through a telescope. They do this because they wish to reproduce each other's work.
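
A small sketch of why recorded metadata makes results reproducible: if everything the experiment depends on is stored, anyone can rerun it and get the same number. The function `run_experiment` is a toy stand-in for a training run, and the recorded values are invented for illustration.

```python
import random


def run_experiment(metadata):
    """Toy 'experiment' whose result depends only on the recorded metadata."""
    rng = random.Random(metadata["seed"])  # seeded, so the draws are repeatable
    noise = [rng.gauss(0, 1) for _ in range(metadata["n_samples"])]
    return sum(noise) / len(noise) * metadata["learning_rate"]


# Metadata recorded by the original author of the experiment.
recorded = {"seed": 42, "n_samples": 100, "learning_rate": 0.01}

original = run_experiment(recorded)
reproduced = run_experiment(dict(recorded))  # someone else reruns from metadata

print(original == reproduced)  # True
```

If the seed or any other setting were missing from the metadata, the second run could not be guaranteed to match the first.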


The importance of metadata - Monitoring

  • Metadata allows machine learning engineers to:
    • Follow the execution of the different parts of the MLOps pipeline
    • Check the status of the ML system at any time
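
As a sketch of the two points above, the status of every pipeline step can be derived from the execution logs. The log format, the status values, and the helper `pipeline_status` are assumptions for illustration; the step names follow the pipeline shown earlier.

```python
PIPELINE_STEPS = [
    "data_extraction",
    "data_validation",
    "model_training",
    "model_evaluation",
    "model_validation",
]

# Hypothetical execution log entries written by the pipeline as it runs.
execution_log = [
    {"step": "data_extraction", "status": "running"},
    {"step": "data_extraction", "status": "succeeded"},
    {"step": "data_validation", "status": "succeeded"},
    {"step": "model_training", "status": "running"},
]


def pipeline_status(log):
    """Return the latest known status of every step; unseen steps are pending."""
    latest = {}
    for entry in log:  # later entries overwrite earlier ones
        latest[entry["step"]] = entry["status"]
    return {step: latest.get(step, "pending") for step in PIPELINE_STEPS}


for step, status in pipeline_status(execution_log).items():
    print(f"{step}: {status}")
```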

Example monitoring tool

An example of a monitoring dashboard showing different metrics associated with a running MLOps system.

1 https://cloud.google.com/stackdriver/docs/solutions/gke/observing

The metadata store

  • Centralized place to manage all the MLOps metadata about:

    • experiments (logs)
    • artifacts
    • models
    • pipelines
  • It has a user interface that allows us to:

    • read and write all model-related metadata

An illustration of the metadata store. It shows a centralized store where several actors and processes access metadata.

1 https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
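
The idea of one central place to read and write metadata can be sketched with Python's built-in `sqlite3` module. The class `MetadataStore`, its table layout, and the example records are hypothetical; production metadata stores offer far more (a user interface, access control, lineage queries).

```python
import json
import sqlite3


class MetadataStore:
    """A toy centralized store for experiment, artifact, model and pipeline metadata."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS metadata (kind TEXT, name TEXT, payload TEXT)"
        )

    def write(self, kind, name, payload):
        """Store one metadata record; the payload is serialized as JSON."""
        self.conn.execute(
            "INSERT INTO metadata VALUES (?, ?, ?)",
            (kind, name, json.dumps(payload)),
        )
        self.conn.commit()

    def read(self, kind):
        """Return all (name, payload) records of a given kind, oldest first."""
        rows = self.conn.execute(
            "SELECT name, payload FROM metadata WHERE kind = ? ORDER BY rowid",
            (kind,),
        )
        return [(name, json.loads(payload)) for name, payload in rows]


store = MetadataStore()
store.write("experiments", "run-1", {"learning_rate": 0.01, "accuracy": 0.91})
store.write("models", "model-v1", {"trained_in": "run-1"})
store.write("experiments", "run-2", {"learning_rate": 0.05, "accuracy": 0.88})

print(store.read("experiments"))
```

Because every actor and process goes through the same `write` and `read` calls, the store becomes the single source of truth for the pipeline.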

The metadata store in an MLOps architecture

A figure of the fully automated MLOps architecture. It shows the central role of the metadata store in the MLOps process.


Metadata store in fully automated MLOps

The metadata store enables automatic monitoring of the fully automated MLOps pipeline

  • This facilitates automatic incident response, for example:

    • Automatic model re-training
    • Automatic rollbacks

A plot representing model retraining triggered by drift in the ML system. It shows a figure where the x-axis is time and the y-axis is performance. Three curves are presented. Each curve has a decay representing drift. The drift point is circled and tagged as drift detected. At this point a new performance curve starts for a new model.
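
The behavior in the plot can be sketched as a simple threshold check on performance values read from the metadata store. The performance series, the threshold of 0.80, and the helper `retraining_triggers` are invented for illustration; real drift detection usually relies on statistical tests rather than a fixed cut-off.

```python
def retraining_triggers(performance, threshold):
    """Return the time steps at which performance drops below the threshold."""
    return [t for t, score in enumerate(performance) if score < threshold]


# Hypothetical performance over time: each decay ends with a retrained model
# that starts a new, higher curve (as in the plot described above).
performance = [0.92, 0.90, 0.86, 0.79, 0.93, 0.91, 0.85, 0.78, 0.94]

for t in retraining_triggers(performance, threshold=0.80):
    print(f"t={t}: drift detected, triggering automatic re-training")
```

With these numbers, drift is detected at time steps 3 and 7, and the value that follows each trigger represents the freshly retrained model.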


Let's practice!

