Model reliability

Developing Machine Learning Models for Production

Sinan Ozdemir

Data Scientist, Entrepreneur, and Author

Aligning ML models with business impact metrics

  • Business impact metrics measure the impact of ML on the business
    • e.g. revenue, cost savings, customer satisfaction score (CSAT)
  • Should be aligned with model metrics
    • e.g. churn predictor --> revenue
    • e.g. manufacturing maintenance predictor --> cost savings
    • e.g. accuracy of chatbot intent detection --> CSAT
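
To make the churn predictor --> revenue link concrete, here is a minimal sketch that turns predictions into a rough revenue estimate. The function name and every number in it (revenue per customer, retention success rate, offer cost) are hypothetical values chosen for illustration, not figures from the course.

```python
def estimated_retained_revenue(y_true, y_pred, revenue_per_customer,
                               retention_success_rate, offer_cost):
    """Rough business-impact estimate for a churn predictor (illustrative only).

    Assumes every customer flagged as a churner receives a retention offer,
    and that a fixed share of correctly flagged churners is retained.
    """
    # Churners the model caught: actual churn (1) and predicted churn (1)
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # Every flagged customer costs one offer, whether or not they would churn
    flagged = sum(1 for p in y_pred if p == 1)
    retained = true_positives * retention_success_rate * revenue_per_customer
    return retained - flagged * offer_cost


# Hypothetical labels and predictions (1 = churn)
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1]
impact = estimated_retained_revenue(y_true, y_pred,
                                    revenue_per_customer=1000,
                                    retention_success_rate=0.5,
                                    offer_cost=50)
```

Tracking a figure like this next to accuracy or recall shows whether a gain in the model metric actually moves the business metric.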



Testing routines in ML pipelines

  • Unit tests test individual components
    • e.g. test that a PCA instance returns the expected number of features
  • Integration tests consider the entire pipeline
    • e.g. test that the input data is correctly preprocessed, the model makes accurate predictions, and the output is correctly post-processed
  • Smoke tests are quick tests that give you confidence that the system is working
    • e.g. test that the model can correctly classify a small set of sample images
  • Test early and test often
    • e.g. test the model on new data as soon as it becomes available
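
The PCA example above could be written as a unit test along these lines. This is a minimal sketch: the random input data, its shape, and the test name are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA


def test_pca_returns_expected_number_of_features():
    # Mock input: 20 samples with 5 features (seeded so the test is repeatable)
    rng = np.random.default_rng(seed=0)
    X = rng.normal(size=(20, 5))

    # Reduce to 2 components and check only the shape of the output
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X)

    assert X_reduced.shape == (20, 2), "PCA returned an unexpected shape."
```

The test checks a single component in isolation and says nothing about model quality, which is what separates it from an integration test.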

Example unit test

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline

# DataPreprocessor is not defined on this slide; it is assumed to be a
# custom transformer that is imported or defined elsewhere in the project.

def test_pipeline():
    # Generate mock training data
    X_train = pd.DataFrame({'age': [25, 30, 35, 40], 'income': [50000, 60000, 70000, 80000]})
    y_train = pd.Series([0, 0, 1, 1])

    pipeline = Pipeline([('preprocessing', DataPreprocessor()),  # Set up pipeline
                         ('model', LogisticRegression())])
    pipeline.fit(X_train, y_train)  # Fit pipeline on training data

    # Generate mock test data
    X_test = pd.DataFrame({'age': [30, 35, 40, 45], 'income': [55000, 65000, 75000, 85000]})
    y_test = pd.Series([0, 0, 1, 1])
    y_pred = pipeline.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)  # Evaluate pipeline on test data

    assert accuracy > 0.8, "Error: pipeline accuracy is too low."

Monitoring model staleness

  • Model staleness: a model's performance degrades over time
    • caused by changes in the data or the environment
  • Detect it with continuous monitoring



Identifying and addressing model staleness

Identifying

  • Monitor model performance
  • Monitor changes in data and environment
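
Both checks can be sketched with the standard library alone. The function names, the 0.05 accuracy tolerance, and the drift threshold of one standard deviation below are assumptions picked for illustration; a production system would more likely rely on a proper statistical test or a monitoring tool.

```python
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    # Share of predictions that match the true labels
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def performance_dropped(baseline_accuracy, y_true, y_pred, tolerance=0.05):
    # Flag the model when recent accuracy falls below baseline minus tolerance
    return accuracy(y_true, y_pred) < baseline_accuracy - tolerance


def mean_shift(reference, recent):
    # Distance between the recent and reference means,
    # measured in reference standard deviations
    return abs(mean(recent) - mean(reference)) / stdev(reference)


def feature_drifted(reference, recent, threshold=1.0):
    # Flag a feature whose mean has moved more than `threshold` deviations
    return mean_shift(reference, recent) > threshold


# Hypothetical income values: training-time data vs. recent production data
reference_income = [50000, 60000, 70000, 80000]
recent_income = [90000, 95000, 100000, 105000]
income_has_drifted = feature_drifted(reference_income, recent_income)
```

Running checks like these on a schedule, per feature and per metric, is one simple form of continuous monitoring.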

Addressing

  • Retrain the model on new data
    • e.g. a new feature needs to be included in the model
  • Update the data pipeline to account for environment changes
    • e.g. changes in analytics platforms confuse your pipeline
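
Retraining on new data might look like the sketch below. The helper name `retrain_if_stale`, the tolerance, and the toy data are hypothetical. The sketch also scores and refits on the same recent batch to stay short, whereas a real setup would hold out separate data for evaluation.

```python
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score


def retrain_if_stale(model, X_recent, y_recent, baseline_accuracy, tolerance=0.05):
    """Return (model, retrained_flag); refit only when accuracy has dropped."""
    recent_accuracy = accuracy_score(y_recent, model.predict(X_recent))
    if recent_accuracy >= baseline_accuracy - tolerance:
        return model, False  # Still performing close to baseline
    # Fit a fresh copy with the same hyperparameters on the recent data
    refreshed = clone(model).fit(X_recent, y_recent)
    return refreshed, True


# Toy data: the relationship between the feature and the label flips over time
X = [[-2], [-1], [1], [2]]
y_old = [0, 0, 1, 1]
y_new = [1, 1, 0, 0]

model = LogisticRegression().fit(X, y_old)
refreshed, retrained = retrain_if_stale(model, X, y_new, baseline_accuracy=1.0)
```

Returning a new fitted copy rather than refitting in place leaves the old model available for comparison or rollback.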

Let's practice!
