Output drift in Watson OpenScale drift v2 metrics

Last updated: Jul 27, 2023
Watson OpenScale calculates output drift by measuring the change in the model confidence distribution.

How it works

Watson OpenScale measures how much your model output has changed since you trained the model. For regression models, Watson OpenScale calculates output drift by measuring the change in the distribution of predictions between the training and payload data. For classification models, Watson OpenScale calculates output drift for each class probability by measuring the change in the distribution of that class probability between the training and payload data. For multiclass classification models, Watson OpenScale also aggregates the per-class output drift by computing a weighted average.
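The multiclass aggregation can be sketched as a weighted average of per-class drift scores. The source does not state which weights Watson OpenScale uses, so the weights, class labels, drift values, and the function name `aggregate_output_drift` below are all hypothetical and serve only to illustrate the arithmetic.

```python
def aggregate_output_drift(class_drift, class_weights):
    """Return the weighted average of per-class drift scores.

    Both arguments are dicts keyed by class label. The choice of weights
    is an assumption; the documentation does not specify it.
    """
    total_weight = sum(class_weights[label] for label in class_drift)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(class_drift[label] * class_weights[label] for label in class_drift)
    return weighted_sum / total_weight


# Hypothetical per-class drift scores and weights for a three-class model.
per_class_drift = {"approved": 0.10, "rejected": 0.30, "review": 0.20}
weights = {"approved": 0.5, "rejected": 0.3, "review": 0.2}

aggregated = aggregate_output_drift(per_class_drift, weights)
print(f"aggregated output drift: {aggregated:.3f}")
```

Dividing by the sum of the weights keeps the result on the same scale as the per-class scores even when the weights are not normalized.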

Do the math

Watson OpenScale uses the following formulas to calculate output drift:

Total variation distance

Total variation distance measures the maximum difference between the probabilities that two probability distributions, baseline (B) and production (P), assign to the same transaction as shown in the following formula:

$$\text{TVD}(B, P) = \sup_{x} \left| B(x) - P(x) \right|$$

If the two distributions are equal, the total variation distance between them becomes 0.

Watson OpenScale uses the following formula to calculate total variation distance:

$$\text{TVD} = \frac{\sum_{x} \left| \hat{f}_{P}(x) - \hat{f}_{B}(x) \right| d(x)}{\sum_{x} \hat{f}_{P}(x)\, d(x) + \sum_{x} \hat{f}_{B}(x)\, d(x)}$$

  • $x$ is a series of equidistant samples that span the domain of $\hat{f}$, ranging from the combined minimum of the baseline and production data to the combined maximum of the baseline and production data.

  • $d(x)$ is the difference between two consecutive $x$ samples.

  • $\hat{f}_{P}(x)$ is the value of the density function for production data at an $x$ sample.

  • $\hat{f}_{B}(x)$ is the value of the density function for baseline data at an $x$ sample.

The denominator represents the total area under the density function plots for production and baseline data. These summations approximate the integrals over the domain, so each of the two terms should be approximately 1 and their total approximately 2.
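The summation can be sketched numerically as follows. This is a minimal illustration, not the Watson OpenScale implementation: the source does not say how the density functions are estimated, so the sketch assumes a plain histogram on equidistant bins, and the function names, bin count, and sample data are hypothetical.

```python
def histogram_density(data, lo, hi, num_bins):
    """Estimate density values on equidistant bins covering [lo, hi].

    A histogram is an assumed estimator; the source does not name one.
    Returns the per-bin density values and the bin width d(x).
    """
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for value in data:
        # Clamp the index so that value == hi falls into the last bin.
        index = min(int((value - lo) / width), num_bins - 1)
        counts[index] += 1
    return [count / (len(data) * width) for count in counts], width


def total_variation_distance(baseline, production, num_bins=20):
    """Sum |f_P(x) - f_B(x)| d(x), divided by the combined area under both densities."""
    lo = min(min(baseline), min(production))
    hi = max(max(baseline), max(production))
    if lo == hi:
        # Every value is identical, so the two distributions coincide.
        return 0.0
    f_b, dx = histogram_density(baseline, lo, hi, num_bins)
    f_p, _ = histogram_density(production, lo, hi, num_bins)
    numerator = sum(abs(p - b) * dx for p, b in zip(f_p, f_b))
    denominator = sum(p * dx for p in f_p) + sum(b * dx for b in f_b)
    return numerator / denominator


# Hypothetical model outputs at training time and in production.
baseline_scores = [0.10, 0.12, 0.15, 0.18, 0.20]
production_scores = [0.80, 0.82, 0.85, 0.88, 0.90]

tvd_same = total_variation_distance(baseline_scores, baseline_scores)
tvd_shifted = total_variation_distance(baseline_scores, production_scores)
print(tvd_same, tvd_shifted)
```

With a histogram, each area term in the denominator sums to 1 up to rounding, so the result ranges from 0 for identical distributions to about 1 for distributions that do not overlap at all.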

Overlap coefficient

Watson OpenScale calculates the overlap coefficient by measuring the total area of the intersection between two probability distributions. To measure dissimilarity between distributions, the intersection or the overlap area is subtracted from 1 to calculate the amount of drift. Watson OpenScale uses the following formula to calculate the overlap coefficient:

$$\text{Overlap coefficient} = 1 - \sum_{x} \min\left( \hat{f}_{P}(x),\ \hat{f}_{B}(x) \right) d(x)$$

  • $x$ is a series of equidistant samples that span the domain of $\hat{f}$, ranging from the combined minimum of the baseline and production data to the combined maximum of the baseline and production data.

  • $d(x)$ is the difference between two consecutive $x$ samples.

  • $\hat{f}_{P}(x)$ is the value of the density function for production data at an $x$ sample.

  • $\hat{f}_{B}(x)$ is the value of the density function for baseline data at an $x$ sample.
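The overlap calculation can be sketched in the same way: take the smaller of the two density values at each sample, sum the resulting area, and subtract it from 1. As before, this is only an illustration under stated assumptions. The histogram density estimate, the function names, the bin count, and the sample data are hypothetical and are not taken from Watson OpenScale.

```python
def binned_density(data, lo, hi, num_bins):
    """Histogram-based density on equidistant bins over [lo, hi] (an assumed estimator)."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for value in data:
        # Clamp the index so that value == hi falls into the last bin.
        index = min(int((value - lo) / width), num_bins - 1)
        counts[index] += 1
    return [count / (len(data) * width) for count in counts], width


def overlap_drift(baseline, production, num_bins=20):
    """Return 1 minus the area shared by the baseline and production densities."""
    lo = min(min(baseline), min(production))
    hi = max(max(baseline), max(production))
    if lo == hi:
        # Every value is identical, so the densities overlap completely.
        return 0.0
    f_b, dx = binned_density(baseline, lo, hi, num_bins)
    f_p, _ = binned_density(production, lo, hi, num_bins)
    overlap_area = sum(min(p, b) * dx for p, b in zip(f_p, f_b))
    return 1.0 - overlap_area


# Hypothetical class probabilities at training time and in production.
baseline_probs = [0.1, 0.2, 0.3, 0.4, 0.5]
production_probs = [0.3, 0.4, 0.5, 0.6, 0.7]

drift_same = overlap_drift(baseline_probs, baseline_probs)
drift_partial = overlap_drift(baseline_probs, production_probs)
drift_disjoint = overlap_drift([0.1, 0.15, 0.2], [0.8, 0.85, 0.9])
print(drift_same, drift_partial, drift_disjoint)
```

Identical distributions share their whole area, so the drift is close to 0, while distributions with no common bins share no area and yield a drift of 1.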

Learn more

Reviewing drift v2 results
