The ideal monitoring workflow

Monitoring Machine Learning Concepts

Hakim Elakhrass

Co-founder and CEO of NannyML

Monitoring workflows

Traditional monitoring workflow

  • Calculate technical performance

  • Alert based on drifts in the input data

  • Results in many false alerts

Ideal monitoring workflow

  • Technical performance monitoring

    • Calculate and estimate performance
  • Root cause analysis

    • Links drifts in the data to drops in performance

 

The ideal monitoring workflow starts with performance monitoring. If performance degrades, root cause analysis is conducted, and finally the issue is resolved.
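The three steps of the ideal workflow can be sketched as a single function. This is only an illustration: `monitoring_cycle`, `find_root_cause`, and `resolve` are hypothetical names, and real root cause analysis and resolution are far more involved than the callbacks used here.

```python
def monitoring_cycle(performance, threshold, find_root_cause, resolve):
    """Run one pass: performance first, then root cause, then resolution."""
    # Step 1: performance monitoring. No degradation means no further work,
    # which avoids alerting on drifts that do not hurt the model.
    if performance >= threshold:
        return "ok: no action needed"
    # Step 2: root cause analysis runs only after a performance drop.
    cause = find_root_cause()
    # Step 3: issue resolution, chosen based on the identified cause.
    action = resolve(cause)
    return f"degraded: cause={cause}, action={action}"


# Hypothetical callbacks standing in for real analysis and resolution logic.
healthy = monitoring_cycle(0.92, 0.80, lambda: "covariate shift", lambda c: "retrain")
degraded = monitoring_cycle(0.71, 0.80, lambda: "covariate shift", lambda c: "retrain")
```

The point of the ordering is that drift is investigated only when performance has actually dropped, rather than alerting on every drift in the input data.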


Monitoring performance

Involves:

  • Calculating performance - for technical metrics like accuracy

  • Estimating performance - if ground truth is not available

  • Measuring business impact - monitor key performance indicators

The plot shows accuracy over time: accuracy decreases and eventually falls below the given threshold.

The KPIs versus time plot shows that the KPIs remain within the specified thresholds over time.
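As a minimal sketch of calculated performance monitoring, the snippet below computes accuracy per time period and flags periods that fall below a lower threshold. The function names, the toy labels, and the 0.7 threshold are assumptions made for illustration. It covers only the case where ground truth is available, not performance estimation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_alerts(chunks, lower_threshold):
    """Return (accuracy, alert) per chunk; alert is True below the threshold.

    chunks: list of (y_true, y_pred) pairs, one pair per time period.
    """
    results = []
    for y_true, y_pred in chunks:
        acc = accuracy(y_true, y_pred)
        results.append((acc, acc < lower_threshold))
    return results


# Toy data: three consecutive periods with steadily worse predictions.
chunks = [
    ([1, 0, 1, 1], [1, 0, 1, 1]),  # 4 of 4 correct
    ([1, 0, 1, 1], [1, 0, 0, 1]),  # 3 of 4 correct
    ([1, 0, 1, 1], [0, 1, 0, 1]),  # 1 of 4 correct
]
results = accuracy_alerts(chunks, lower_threshold=0.7)
# results == [(1.0, False), (0.75, False), (0.25, True)]
```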


Root Cause Analysis

The goal is to investigate:

 

  • Covariate shift - shifts in the input data distribution

  • Concept drift - changes in the relationship between features and targets

The plot shows the decision boundary between blue circles and red circles for a working model.

The plot demonstrates the shift of blue circles crossing the decision boundary, indicating a covariate shift.

The plot illustrates the change in the decision boundary, which indicates concept drift.
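One common way to check a single feature for covariate shift is to compare its distribution in a reference period against a later period. The sketch below computes the two-sample Kolmogorov-Smirnov statistic by hand; the function name and sample values are assumptions for illustration, and this is one possible drift measure rather than the method used in the course. It only looks at inputs, so it cannot reveal concept drift, which is a change in the relationship between features and targets.

```python
from bisect import bisect_right


def ks_statistic(reference, analysis):
    """Largest gap between the empirical CDFs of two samples (0.0 to 1.0)."""
    ref = sorted(reference)
    ana = sorted(analysis)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the largest gap.
    for value in sorted(set(ref) | set(ana)):
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_ana = bisect_right(ana, value) / len(ana)
        max_gap = max(max_gap, abs(cdf_ref - cdf_ana))
    return max_gap


reference = [1.0, 2.0, 3.0, 4.0, 5.0]
same = ks_statistic(reference, [1.0, 2.0, 3.0, 4.0, 5.0])         # identical samples
shifted = ks_statistic(reference, [11.0, 12.0, 13.0, 14.0, 15.0])  # no overlap
```

A value near 0 suggests the two periods look alike for that feature, while a value near 1 suggests a strong shift. Whether a given shift matters still depends on whether performance dropped.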


Issue resolution

Possible solutions:

  • Retraining - requires additional data and compute

  • Refactoring the use case - take a step back and rethink the methods used

  • Changing the downstream processes - modify the processes around the prediction

The image shows a digital brain, representing a machine learning model, and a plus sign connected to a database, indicating the process of retraining the model with more data.

The image shows a drawing board with questions related to feature engineering, the features used, and linear regression.

The image shows a worker manually verifying something on a computer.


Let's practice!
