The output drift metric measures the change in model confidence distribution.
Metric details
Output drift is a drift v2 evaluation metric that evaluates data distribution changes.
Scope
The output drift metric evaluates machine learning models and generative AI assets.
Types of AI assets:
- Machine learning models
- Prompt templates
Scores and values
The output drift metric score indicates how much your model output has changed since you trained the model.
- Best possible score: 0.0
- Ratios:
  - At 0: No change in model output
  - Over 0: Increasing change in model output
Evaluation process
For regression models, output drift is calculated by measuring the change in the distribution of predictions between the training and payload data. For classification models, output drift is calculated for each class by measuring the change in the distribution of that class's probabilities between the training and payload data. For multiclass classification models, the per-class drift values are aggregated by taking a weighted average.
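The multiclass aggregation step can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name `aggregate_class_drift` and the weights are hypothetical, because the text does not state how the weights are chosen.

```python
def aggregate_class_drift(per_class_drift, class_weights):
    """Weighted average of per-class drift scores.

    Both arguments are dicts keyed by class label. The weights are an
    assumption for illustration; the source does not say how they are chosen.
    """
    total_weight = sum(class_weights[label] for label in per_class_drift)
    if total_weight == 0:
        raise ValueError("class weights must not sum to zero")
    weighted_sum = sum(
        per_class_drift[label] * class_weights[label] for label in per_class_drift
    )
    return weighted_sum / total_weight


# Illustrative values only: one drift score per class and made-up weights.
drift = {"cat": 0.10, "dog": 0.30, "bird": 0.20}
weights = {"cat": 0.5, "dog": 0.3, "bird": 0.2}
overall = aggregate_class_drift(drift, weights)
```

Because the weights are normalized by their sum, they can be supplied as proportions or as raw counts.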
Do the math
The following formulas are used to calculate the output drift metric:
Total variation distance measures the maximum difference between the probabilities that two probability distributions, baseline (B) and production (P), assign to the same event, as shown in the following formula, where the supremum is taken over all events $A$:

$$\mathrm{TVD}(B, P) = \sup_{A} \left| B(A) - P(A) \right|$$

If the two distributions are equal, the total variation distance between them becomes 0.
The following formula is used to calculate total variation distance:

$$\mathrm{TVD} = \frac{\sum_{x} \left| f_P(x) - f_B(x) \right| \, d(x)}{\sum_{x} f_P(x) \, d(x) + \sum_{x} f_B(x) \, d(x)}$$

- $x$ is a series of equidistant samples that span the domain of the data, ranging from the combined minimum of the baseline and production data to the combined maximum of the baseline and production data.
- $d(x)$ is the difference between two consecutive $x$ samples.
- $f_P(x)$ is the value of the density function for production data at an $x$ sample.
- $f_B(x)$ is the value of the density function for baseline data at an $x$ sample.
The denominator represents the total area under the density function plots for production and baseline data. These summations approximate the integrals of the density functions over the domain space, so each term should be approximately 1 and the total approximately 2.
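The discretized calculation can be sketched in plain Python as follows. This sketch assumes that the numerator is the summed absolute difference between the two density functions (the standard total variation form) and that densities are estimated with a simple Gaussian kernel. The source does not say how the density functions are estimated, so `make_gaussian_density`, `bandwidth`, and `num_points` are illustrative assumptions.

```python
import math


def make_gaussian_density(data, bandwidth):
    """Return a density function built from a simple Gaussian kernel estimate.

    The kernel choice and bandwidth are assumptions for this sketch.
    """
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in data
        )

    return density


def total_variation_distance(baseline, production, bandwidth=0.05, num_points=200):
    """Approximate total variation distance on an equidistant grid."""
    # Grid spans the combined minimum to the combined maximum of both datasets.
    lo = min(min(baseline), min(production))
    hi = max(max(baseline), max(production))
    dx = (hi - lo) / (num_points - 1)
    if dx == 0:
        return 0.0  # every value is identical, so there is nothing to compare
    xs = [lo + i * dx for i in range(num_points)]

    f_b = make_gaussian_density(baseline, bandwidth)
    f_p = make_gaussian_density(production, bandwidth)

    # Numerator: area between the two density curves.
    numerator = sum(abs(f_p(x) - f_b(x)) * dx for x in xs)
    # Denominator: total area under both curves (close to 2 in the ideal case).
    denominator = sum(f_p(x) * dx for x in xs) + sum(f_b(x) * dx for x in xs)
    return numerator / denominator


# Illustrative prediction values only.
same = total_variation_distance([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
far = total_variation_distance(
    [0.10, 0.12, 0.11, 0.09], [0.90, 0.88, 0.91, 0.92], bandwidth=0.02
)
```

Dividing by the summed areas keeps the score between 0 and 1 even when the grid truncates part of each density curve at the edges of the domain.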
The overlap coefficient is calculated by measuring the total area of the intersection between two probability distributions. To measure dissimilarity between distributions, the intersection or overlap area is subtracted from 1 to calculate the amount of drift. The following formula is used to calculate the overlap coefficient:

$$\text{Overlap coefficient} = 1 - \sum_{x} \min\left( f_P(x), f_B(x) \right) d(x)$$

- $x$ is a series of equidistant samples that span the domain of the data, ranging from the combined minimum of the baseline and production data to the combined maximum of the baseline and production data.
- $d(x)$ is the difference between two consecutive $x$ samples.
- $f_P(x)$ is the value of the density function for production data at an $x$ sample.
- $f_B(x)$ is the value of the density function for baseline data at an $x$ sample.
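The overlap calculation can be sketched as follows, assuming the drift value is 1 minus the summed pointwise minimum of the two density functions on an equidistant grid. To keep the example short, the densities here are closed-form normal curves rather than estimates from data. The helper names `normal_pdf` and `overlap_drift` and the grid size are hypothetical.

```python
import math


def normal_pdf(mean, std):
    """Return the density function of a normal distribution (illustrative input)."""

    def density(x):
        z = (x - mean) / std
        return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

    return density


def overlap_drift(f_b, f_p, lo, hi, num_points=1000):
    """Return 1 minus the approximate overlap area of two density functions."""
    dx = (hi - lo) / (num_points - 1)
    xs = [lo + i * dx for i in range(num_points)]
    # Overlap area: at each sample, take the lower of the two curves.
    overlap = sum(min(f_p(x), f_b(x)) * dx for x in xs)
    return 1.0 - overlap


baseline = normal_pdf(0.0, 1.0)
identical = overlap_drift(baseline, normal_pdf(0.0, 1.0), lo=-6.0, hi=6.0)
shifted = overlap_drift(baseline, normal_pdf(1.0, 1.0), lo=-6.0, hi=7.0)
disjoint = overlap_drift(baseline, normal_pdf(10.0, 1.0), lo=-6.0, hi=16.0)
```

Under these assumptions, identical distributions give a value near 0, a one-standard-deviation shift gives roughly 0.38, and distributions that barely overlap give a value near 1.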