Configuring drift v2 evaluations

Last updated: Mar 14, 2025

You can configure drift v2 evaluations to measure changes in your data over time to ensure consistent outcomes for your model. Use drift v2 evaluations to identify changes in your model output, the accuracy of your predictions, and the distribution of your input data.
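
If you configure evaluations programmatically, the following is a minimal sketch of creating a drift v2 monitor instance with the ibm-watson-openscale Python SDK. The credentials, IDs, the "drift_v2" monitor definition string, and the parameter keys shown are assumptions for illustration only; check the SDK reference for your release for the exact values.

```python
# Minimal sketch, assuming the ibm-watson-openscale Python SDK is installed.
# The IDs, the "drift_v2" monitor definition string, and the parameter keys
# are placeholders and assumptions, not verified values.
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson_openscale import APIClient
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import Target

client = APIClient(authenticator=IAMAuthenticator(apikey="<api-key>"))

# Evaluate an existing model subscription
target = Target(target_type="subscription", target_id="<subscription-id>")

response = client.monitor_instances.create(
    data_mart_id="<data-mart-id>",
    monitor_definition_id="drift_v2",   # assumed ID for the drift v2 monitor
    target=target,
    parameters={                        # illustrative sample-size settings
        "min_samples": 100,
        "max_samples": 10000,
    },
    background_mode=False,
)
monitor_instance = response.result
```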

If you log payload data when you prepare for model evaluations, you can configure drift v2 evaluations to help you understand how changes in your data affect model outcomes. The following sections describe how to configure drift v2 evaluations.

Compute the drift archive

You must choose the method that is used to analyze your training data to determine the data distributions of your model features. If you connect training data and the size of your data is less than 500 MB, you can choose to compute the drift v2 archive.

If you don't connect your training data, or if the size of your data is larger than 500 MB, you must choose to compute the drift v2 archive in a notebook. You must also compute the drift v2 archive in a notebook if you want to evaluate image or text models.

You can limit the size of your training data by setting maximum sample sizes for the amount of training data that is used for scoring and for computing the drift v2 archive. For non-watsonx.ai Runtime deployments, computing the drift v2 archive incurs a cost because the training data is scored against your model's scoring endpoint.
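
The choice between the two approaches can be summarized with a small helper. This is a hypothetical illustration of the rules above, not part of any IBM SDK.

```python
import os
from typing import Optional

TRAINING_DATA_LIMIT_MB = 500  # documented limit for computing the archive outside a notebook

def requires_notebook_archive(training_data_path: Optional[str], model_type: str) -> bool:
    """Return True when the drift v2 archive must be computed in a notebook.

    A notebook is required when no training data is connected, when the
    connected training data is larger than 500 MB, or when the model
    works on image or text data.
    """
    if model_type in ("image", "text"):
        return True
    if training_data_path is None:
        return True
    size_mb = os.path.getsize(training_data_path) / (1024 * 1024)
    return size_mb > TRAINING_DATA_LIMIT_MB
```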

Set drift thresholds

You must set threshold values for each metric to identify issues with your evaluation results. The values that you set trigger alerts on the Insights dashboard when metric scores violate your thresholds. You must set values in the range of 0 to 1. The metric scores must be lower than the threshold values to avoid violations.
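
For example, a threshold configuration and violation check might look like the following sketch. The metric names are illustrative assumptions, not the monitor's exact identifiers.

```python
# Illustrative drift v2 threshold settings; metric names are examples only.
drift_v2_thresholds = {
    "output_drift": 0.05,
    "model_quality_drift": 0.05,
    "feature_drift": 0.05,
}

def threshold_violations(scores: dict, thresholds: dict) -> dict:
    """A metric violates its threshold when its score is not lower than the threshold."""
    return {metric: scores[metric] >= limit for metric, limit in thresholds.items()}

violations = threshold_violations(
    {"output_drift": 0.08, "model_quality_drift": 0.02, "feature_drift": 0.04},
    drift_v2_thresholds,
)
# {'output_drift': True, 'model_quality_drift': False, 'feature_drift': False}
```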

Select important features

For tabular models only, feature importance is calculated to determine the impact of feature drift on your model. To calculate feature importance, you can select the important and most important features of your model, which are the features that have the biggest impact on your model outcomes.

When you configure SHAP explanations, the important features are automatically detected by using global explanations.

You can also upload a list of important features by uploading a JSON file. Sample snippets are provided that you can use to upload a JSON file. For more information, see Feature importance snippets.
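
For example, a file with a structure like the following could list the two groups of features. The key names here are illustrative assumptions; match them to the documented sample snippets in Feature importance snippets before you upload.

```python
import json

# Illustrative feature-importance file; the key names are assumptions.
# Use the documented sample snippets for the exact schema.
feature_importance = {
    "most_important_features": ["credit_amount", "age"],
    "important_features": ["loan_duration", "employment"],
}

with open("feature_importance.json", "w") as f:
    json.dump(feature_importance, f, indent=2)
```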

Set sample size

Sample sizes determine the number of transactions that are processed during evaluations. You must set a minimum sample size to indicate the lowest number of transactions that you want to evaluate. You can also set a maximum sample size to indicate the highest number of transactions that you want to evaluate.
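
As a rough sketch of how the two limits interact, consider the following. The names are illustrative, not the monitor's exact configuration keys.

```python
# Illustrative sample-size settings; the names are examples only.
MIN_SAMPLE_SIZE = 100     # fewest transactions required before an evaluation runs
MAX_SAMPLE_SIZE = 10_000  # most transactions processed in a single evaluation

def transactions_to_evaluate(available: int) -> int:
    """Return how many transactions an evaluation would process, or 0 if too few."""
    if available < MIN_SAMPLE_SIZE:
        return 0
    return min(available, MAX_SAMPLE_SIZE)
```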

Parent topic: Evaluating AI models