When labels are available

Monitoring Machine Learning in Python

Maciej Balawejder

Data Scientist

Estimated vs realized performance

Estimated performance:

measures how well the model is expected to perform
determined using estimators such as CBPE and DLE
estimated when ground truth is not available

Realized performance:

represents the performance actually measured
determined using the performance calculator
calculated when ground truth is available
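The difference between the two can be sketched for accuracy on a handful of made-up predictions. This is only a simplified illustration of the confidence-based idea, assuming the predicted probabilities are well calibrated; it is not NannyML's CBPE implementation, and the function names and data are hypothetical.

```python
# Simplified illustration only; function names and data are made up.

def realized_accuracy(y_true, y_pred):
    """Accuracy measured against known labels (needs ground truth)."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def estimated_accuracy(y_pred, y_pred_proba):
    """Expected accuracy from predicted probabilities alone.

    Assumes calibrated probabilities: a prediction of 1 is correct with
    probability p, a prediction of 0 is correct with probability 1 - p.
    """
    expected_correct = sum(
        p if pred == 1 else 1 - p for pred, p in zip(y_pred, y_pred_proba)
    )
    return expected_correct / len(y_pred)


y_pred_proba = [0.9, 0.8, 0.3, 0.1, 0.6]
y_pred = [1 if p >= 0.5 else 0 for p in y_pred_proba]
y_true = [1, 1, 0, 0, 0]  # made-up labels, only known once ground truth arrives

estimated = estimated_accuracy(y_pred, y_pred_proba)  # no labels needed
realized = realized_accuracy(y_true, y_pred)          # labels required

print(f"estimated accuracy: {estimated:.2f}")
print(f"realized accuracy:  {realized:.2f}")
```

The estimate uses only model outputs, so it can be computed immediately; the realized value has to wait for the labels.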

Delayed ground truth

The image displays a time axis with three points for each week. Every Monday, the model's realized performance is evaluated, while the CBPE estimator is used to estimate performance between Mondays.
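As a rough sketch of that schedule, the snippet below marks each day as "realized" when labels are assumed to arrive (Mondays) and "estimated" otherwise. The function name and the `labels_arrive_on` setting are hypothetical and only mirror the weekly pattern described above.

```python
from datetime import date, timedelta


def monitoring_mode(day, labels_arrive_on=0):
    """Return which kind of performance is available on a given day.

    labels_arrive_on is a weekday index (0 = Monday); it is a hypothetical
    setting used for illustration.
    """
    return "realized" if day.weekday() == labels_arrive_on else "estimated"


start = date(2019, 4, 1)  # a Monday
days = [start + timedelta(days=i) for i in range(8)]
schedule = {day: monitoring_mode(day) for day in days}

for day, mode in schedule.items():
    print(day.isoformat(), mode)
```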

Performance calculator

import nannyml

# Initialize the calculator
calc = nannyml.PerformanceCalculator(
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='arrived',
    timestamp_column_name='timestamp',
    problem_type='classification_binary',
    chunk_period='D',  # daily chunks
    metrics=['roc_auc', 'accuracy'],
)

# Fit the calculator on the reference set
calc.fit(reference)

# Calculate realized performance on the analysis set
realized_results = calc.calculate(analysis)
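Conceptually, the calculator splits the data into chunks and computes each metric per chunk against the true labels. A minimal plain-Python sketch of that idea for accuracy is shown below; the function name and records are made up, and this is not how NannyML is implemented internally.

```python
from collections import defaultdict


def accuracy_per_chunk(records):
    """Compute accuracy per chunk.

    records: iterable of (chunk_key, y_true, y_pred) tuples, where
    chunk_key identifies the chunk (here, a day as an ISO date string).
    """
    totals = defaultdict(lambda: [0, 0])  # chunk -> [correct, count]
    for chunk, y_true, y_pred in records:
        totals[chunk][0] += int(y_true == y_pred)
        totals[chunk][1] += 1
    return {
        chunk: correct / count
        for chunk, (correct, count) in sorted(totals.items())
    }


# Made-up records: (day, true label, predicted label)
records = [
    ("2019-04-01", 1, 1),
    ("2019-04-01", 0, 0),
    ("2019-04-01", 1, 0),
    ("2019-04-01", 0, 0),
    ("2019-04-02", 1, 1),
    ("2019-04-02", 0, 1),
]

daily_accuracy = accuracy_per_chunk(records)
print(daily_accuracy)
```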

Plot the results

# Show realized performance plot
realized_results.plot().show()

The image shows the realized ROC AUC plot, with a performance drop from April 2019 to August 2019.

Realized and estimated performance

# Estimate and calculate results
# (estimator is a previously fitted estimator, such as CBPE)
estimated_results = estimator.estimate(analysis)
realized_results = calc.calculate(analysis)

# Show comparison plot
realized_results.compare(estimated_results).plot().show()

Realized and estimated performance

The plot compares the realized and estimated performance for the ROC AUC metric.
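Beyond reading the plot, the two series can also be compared numerically. The sketch below assumes the per-chunk ROC AUC values have already been extracted into plain dictionaries; the values and the helper name are made up for illustration.

```python
def estimation_gap(realized, estimated):
    """Absolute difference per chunk, for chunks present in both series."""
    shared = sorted(realized.keys() & estimated.keys())
    return {chunk: abs(realized[chunk] - estimated[chunk]) for chunk in shared}


# Made-up per-chunk ROC AUC values
realized_auc = {"w1": 0.90, "w2": 0.85, "w3": 0.70}
estimated_auc = {"w1": 0.91, "w2": 0.86, "w3": 0.80, "w4": 0.79}  # w4 has no labels yet

gaps = estimation_gap(realized_auc, estimated_auc)
mean_abs_error = sum(gaps.values()) / len(gaps)
worst_chunk = max(gaps, key=gaps.get)

print(f"mean absolute error: {mean_abs_error:.3f}")
print(f"largest gap in chunk: {worst_chunk}")
```

A large gap in a chunk suggests the estimate should be treated with caution for that period, for example because the relationship between features and target has changed.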

Let's practice!
