Configuring drift evaluations in Watson OpenScale

Watson OpenScale drift evaluations detect drops in accuracy and data consistency in a model. Model accuracy drops when there is an increase in transactions that resemble those that the model did not evaluate correctly in the training data.

Drift evaluation examples

When you configure drift in Watson OpenScale, you must specify the tolerable accuracy drift magnitude. Drift is measured as the drop in accuracy compared with the model accuracy at training time. For example, if the model accuracy at training time was 90% and the estimated accuracy of the model at runtime is 80%, the model is said to have drifted by 10%. Depending on the use case, model owners are willing to tolerate different amounts of drift. For this reason, Watson OpenScale lets you specify the accuracy drift magnitude (called the Drift alert threshold) for each model that it monitors. If the drift for a model exceeds the specified threshold, an alert is generated.
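
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration, not Watson OpenScale code: the function names `drift_magnitude` and `should_alert` are hypothetical, accuracies are written as fractions, and the choice to alert only when drift is strictly greater than the threshold is an assumption.

```python
def drift_magnitude(training_accuracy: float, estimated_runtime_accuracy: float) -> float:
    """Return the drop in accuracy relative to training time (hypothetical helper)."""
    return training_accuracy - estimated_runtime_accuracy


def should_alert(drift: float, alert_threshold: float) -> bool:
    """Assumption: an alert fires when drift is strictly greater than the threshold."""
    return drift > alert_threshold


# Values from the example: 90% accuracy at training time, 80% estimated at runtime.
drift = drift_magnitude(0.90, 0.80)
print(f"drift: {drift:.0%}")
print("alert at a 5% threshold:", should_alert(drift, 0.05))
print("alert at a 15% threshold:", should_alert(drift, 0.15))
```

With a 5% threshold the 10% drift would raise an alert; with a 15% threshold it would not.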

If you use IBM Watson Machine Learning and your data does not exceed 500 MB, you can train your model online by using Watson OpenScale. Otherwise, you must use a notebook to train the model.

Before you begin

You must configure drift detection before it can analyze your model. You can train your drift detection model online by using the user interface or by running code inside a notebook. Drift configuration is supported for structured data only. Classification models support both data drift and accuracy drift; regression models support only data drift.

These are the requirements for configuring the Drift monitor:

  • The Machine Learning Provider must be Watson Machine Learning.
  • The training data size must be less than 500 MB.
  • The training data must be hosted in IBM Cloud Object Storage or Db2.
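
The three requirements above amount to a simple eligibility check, sketched here in Python. Everything in the sketch is hypothetical: the `TrainingSetup` fields, the string labels for the provider and data locations, and the assumption that 1 MB means 1,000,000 bytes (the requirements do not define it).

```python
from dataclasses import dataclass

# Assumption: 1 MB is treated as 1,000,000 bytes; the requirements do not say.
MAX_ONLINE_TRAINING_BYTES = 500 * 1_000_000
# Hypothetical labels for IBM Cloud Object Storage and Db2.
SUPPORTED_LOCATIONS = {"cos", "db2"}


@dataclass
class TrainingSetup:
    provider: str             # hypothetical label, e.g. "watson_machine_learning"
    training_data_bytes: int  # size of the training data
    data_location: str        # where the training data is hosted


def can_train_online(setup: TrainingSetup) -> bool:
    """Return True only when all three listed requirements are met."""
    return (
        setup.provider == "watson_machine_learning"
        and setup.training_data_bytes < MAX_ONLINE_TRAINING_BYTES
        and setup.data_location in SUPPORTED_LOCATIONS
    )


setup = TrainingSetup("watson_machine_learning", 200_000_000, "cos")
print("train online" if can_train_online(setup) else "use a notebook")
```

When the check fails, the notebook route described later on this page is the alternative.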

To upload the training data and set the Model details for drift detection:

  • Click Upload training data and upload a file with the labeled data.

For details, see Providing model details.

Throughout this process, IBM Watson OpenScale analyzes your model and makes recommendations based on the most logical outcome. For drift detection to work properly, the data type of your prediction column in the training data must match the data type of the same column in the payload data. Assign matching string or numeric types to the prediction and label columns. To confirm data types, click Model details > Model output details > Edit. These selections ensure that you have accurate information for the following configuration steps. If for some reason you must change data types, you must redeploy the evaluation to effect the changes.
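
Before you open the user interface, you might compare the prediction column in a sample of training data with the same column in a sample of payload data. The following Python sketch shows one way to do that on plain lists of values. The helper names `column_kind` and `prediction_types_match` are hypothetical and are not part of Watson OpenScale.

```python
from numbers import Number


def column_kind(values):
    """Classify a column as 'numeric', 'string', 'empty', or 'mixed'."""
    kinds = set()
    for value in values:
        # bool is a Number subclass in Python; treat it separately to avoid surprises.
        if isinstance(value, Number) and not isinstance(value, bool):
            kinds.add("numeric")
        elif isinstance(value, str):
            kinds.add("string")
        else:
            kinds.add("other")
    if not kinds:
        return "empty"
    if len(kinds) == 1:
        return kinds.pop()
    return "mixed"


def prediction_types_match(training_values, payload_values) -> bool:
    """True when both columns are consistently numeric or consistently string."""
    training_kind = column_kind(training_values)
    return training_kind in {"numeric", "string"} and training_kind == column_kind(payload_values)


# A numeric training column against a string payload column is a mismatch.
print(prediction_types_match([0, 1, 1], ["0", "1"]))
print(prediction_types_match(["Risk", "No Risk"], ["Risk"]))
```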

On the successive pages of the Drift tab, you must provide the following information:

Alert threshold

Required only for classification type models: IBM Watson OpenScale tracks the degree of change in model accuracy when compared to accuracy at training time. The alert threshold, which must be at least 5%, indicates the degree of tolerance for change over time.

Sample size

By setting a minimum sample size, you prevent drift from being measured until a minimum number of records is available in the evaluation data set. This setting ensures that the sample size is not so small that it skews results. Every time drift checking runs, it uses the minimum sample size to decide the number of records on which it does the computation.
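
The two settings can be pictured as a small validated configuration, sketched here in Python. The class and function names are hypothetical, the threshold is written as a fraction, and the rule that the minimum sample size must be positive is an assumption. The sketch also ignores that the alert threshold applies only to classification models.

```python
from dataclasses import dataclass


@dataclass
class DriftSettings:
    alert_threshold: float  # fraction, for example 0.10 for 10%
    min_sample_size: int    # records required before drift is measured

    def __post_init__(self):
        # The alert threshold must be at least 5%.
        if self.alert_threshold < 0.05:
            raise ValueError("alert threshold must be at least 5%")
        # Assumption: a sample size below 1 is never meaningful.
        if self.min_sample_size < 1:
            raise ValueError("minimum sample size must be positive")


def ready_for_drift_check(record_count: int, settings: DriftSettings) -> bool:
    """Drift is not measured until enough records are available."""
    return record_count >= settings.min_sample_size


settings = DriftSettings(alert_threshold=0.10, min_sample_size=100)
print(ready_for_drift_check(40, settings))
print(ready_for_drift_check(250, settings))
```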

Steps to configure drift evaluation

If you use IBM Watson Machine Learning, you have the option of using the Watson OpenScale user interface to configure drift detection.

To start the configuration process, from the Drift tab, in the Drift model box, click the Edit icon. Use the Train in Watson OpenScale option.

Follow the prompts and enter required information. When you finish, a summary of your selections is presented for review. If you want to change anything, click the Edit icon for that section. Otherwise, save your work.

Steps to configure drift without retraining

You can reconfigure the drift evaluation without retraining the drift model. By updating only the minimum sample size and the alert threshold, you get new results from the currently trained model without incurring more processing costs. This approach avoids intensive CPU usage when the underlying data is stable and you want to view drift magnitude with different thresholds.

Note: Your drift model requires retraining only when the training data or the schema changes.

To start the configuration process, from the Drift tab, in the Drift threshold box or Sample size box, click the Edit icon. Update the current setting and save it.

Steps to configure drift by using a notebook

Use a notebook to configure drift in the following circumstances:

  • You do not want to share the training data with Watson OpenScale.
  • You do not have a means to share the training data on Db2 or IBM Cloud Object Storage, which are the only two training data locations that are supported by Watson OpenScale.

In the notebook, you read the training data into a dataframe. The specialized notebook that you can download from Watson OpenScale then generates output that you can upload to Watson OpenScale.

To generate the drift detection model, run the cell that installs the ibm-wos-utils>=4.5.0 package and scikit-learn version 1.0.2. Scikit-learn version 1.0.2 is required to build the model.
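
After the install cell runs, a quick check like the following can confirm what is present in the notebook environment. This is a sketch that uses only the Python standard library. The helper names are hypothetical, and the check compares only the scikit-learn version string with 1.0.2.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def scikit_learn_ok(version) -> bool:
    """True only for the scikit-learn version that this page requires."""
    return version == "1.0.2"


sklearn_version = installed_version("scikit-learn")
print("ibm-wos-utils:", installed_version("ibm-wos-utils"))
print("scikit-learn:", sklearn_version, "(ok)" if scikit_learn_ok(sklearn_version) else "(not 1.0.2)")
```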

Create a notebook to generate the drift detection model. Use the sample notebook that is available in Watson OpenScale. The drift detection model is converted into a .tar.gz file for you.
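
Before you upload the archive, you might confirm that it is a readable gzip-compressed tar file. The following Python sketch uses the standard `tarfile` module. The function name and the example file name are hypothetical, and the sketch makes no claim about which files the sample notebook places in the archive.

```python
import tarfile


def list_archive_members(path: str):
    """Return the member names of a .tar.gz archive; raises if it cannot be read."""
    with tarfile.open(path, mode="r:gz") as archive:
        return archive.getnames()


# Example usage with a hypothetical file name:
# print(list_archive_members("drift_detection_model.tar.gz"))
```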

To start the configuration process, from the Drift tab, in the Drift model box, click the Edit icon. Use the Train in a data science notebook option. You can drag your compressed drift detection model to the drop zone.

Follow the prompts and enter required information. When you finish, a summary of your selections is presented for review. If you want to change anything, click Edit for that section. Otherwise, save your work.

Learn more

Drift metrics

Parent topic: Configuring model evaluations
