Drop in accuracy evaluation metric
Last updated: Feb 12, 2025

The drop in accuracy metric estimates the drop in accuracy of your model at run time when compared to the training data.

Metric details

Drop in accuracy is a drift evaluation metric that can help determine how well your model predicts outcomes over time.

Scope

The drop in accuracy metric evaluates machine learning models only.

  • Types of AI assets: Machine learning models
  • Machine learning problem type:
    • Binary classification
    • Multiclass classification

Scores and values

The drop in accuracy metric score indicates whether there is an increase in transactions that are similar to the transactions that the model predicted incorrectly on the training data.

Range of values: 0.0-1.0

Evaluation process

The drift monitor works differently in pre-production and production environments.

In pre-production environments, when you upload labeled test data, the data is added to the feedback and payload tables. The labeled data is added as an annotation in the payload table. Accuracy is calculated with the labeled data column and the prediction column from the payload table.
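As a minimal sketch of that pre-production accuracy calculation (the column names `label` and `prediction` and the sample values are assumptions for illustration, not the actual payload table schema):

```python
import pandas as pd

# Hypothetical payload table: the annotated label column that was added
# from the uploaded labeled test data, plus the model's prediction column.
payload = pd.DataFrame({
    "label":      ["yes", "no", "yes", "no", "yes"],
    "prediction": ["yes", "no", "no",  "no", "yes"],
})

# Accuracy is the fraction of rows where the annotated label
# matches the model's prediction.
accuracy = (payload["label"] == payload["prediction"]).mean()
print(accuracy)  # 0.8
```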

In production environments, a drift detection model is created by analyzing the data that was used to train and test the model. For example, if the model has an accuracy of 90% on the test data, it provides incorrect predictions on 10% of the test data. A binary classification model is built that accepts a data point and predicts whether that data point is similar to the data that the model predicted incorrectly (10%) or accurately (90%).
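The actual drift detection model is not specified here; as an illustrative sketch under invented toy data, the idea can be approximated with a simple nearest-centroid binary classifier that separates records the client model got right from records it got wrong:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the client model's test set: rows the model predicted
# correctly cluster around 0, rows it predicted incorrectly around 3.
# (Feature values and cluster layout are invented for illustration.)
correct_rows = rng.normal(loc=0.0, size=(90, 4))
error_rows = rng.normal(loc=3.0, size=(10, 4))

# Minimal "drift detection model": a record is flagged as error-like when
# it lies closer to the centroid of the incorrectly predicted rows than
# to the centroid of the correctly predicted rows.
correct_centroid = correct_rows.mean(axis=0)
error_centroid = error_rows.mean(axis=0)

def looks_like_error(record):
    """Return True if the record resembles the rows the model got wrong."""
    return (np.linalg.norm(record - error_centroid)
            < np.linalg.norm(record - correct_centroid))

# Records near the error cluster are flagged as likely mispredictions.
print(looks_like_error(np.full(4, 3.0)))  # True
print(looks_like_error(np.zeros(4)))      # False
```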

After the drift detection model is created, at run time, this model is scored by using all of the data that the client model receives. For example, if the client model received 1000 records in the past 3 hours, the drift detection model runs on those same 1000 data points. It calculates how many of the records are similar to the 10% of records on which the model made an error when training. If 200 of these records are similar to the 10%, then it implies that the model accuracy is likely to be 80%. Because the model accuracy at training time was 90%, it means that there is an accuracy drift of 10% in the model.
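The arithmetic in that example can be written out directly (the record counts are taken from the example; nothing else is assumed):

```python
# 1000 run-time records, 200 of which the drift detection model flags as
# similar to the records the client model mispredicted at training time.
total_records = 1000
error_like = 200
base_accuracy = 0.90  # accuracy on the test data

# Estimated run-time accuracy: the fraction NOT flagged as error-like.
estimated_accuracy = 1 - error_like / total_records  # 0.8

# Accuracy drift: base accuracy minus estimated accuracy.
accuracy_drift = base_accuracy - estimated_accuracy  # ~0.10
```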

To calculate the drop in accuracy metric, each transaction is analyzed to estimate whether the model prediction is accurate. If the model prediction is inaccurate, the transaction is marked as drifted. The estimated accuracy is then calculated as the fraction of nondrifted transactions to the total number of transactions that are analyzed. The base accuracy is the accuracy of the model on the test data. The extent of the drift in accuracy is calculated as the difference between the base accuracy and the estimated accuracy. The drifted transactions are then grouped into clusters based on the similarity of each feature's contribution to the drift in accuracy. In each cluster, the important features that contributed to the drift in accuracy are estimated, and their feature impact is classified as large, some, or small.
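The per-transaction part of that calculation can be sketched as follows (the drift flags are hypothetical values for illustration):

```python
# Hypothetical per-transaction flags: True means the transaction was
# marked as drifted, that is, its prediction was estimated inaccurate.
drifted = [False, False, True, False, True,
           False, False, False, True, False]

base_accuracy = 0.90  # model accuracy on the test data

# Estimated accuracy: fraction of nondrifted transactions.
estimated_accuracy = drifted.count(False) / len(drifted)  # 7/10 = 0.7

# Drop in accuracy: base accuracy minus estimated accuracy.
drop_in_accuracy = base_accuracy - estimated_accuracy  # ~0.2
```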

Next steps

To mitigate drift after it is detected, you must build a new version of the model that fixes the problem. A good place to start is with the data points that are highlighted as reasons for the drift. Introduce the new data to the predictive model after you manually label the drifted transactions and use them to retrain the model.

Note: The actions that you take to improve or validate performance are not prescribed and depend on your model use case and the goals that you want to achieve. The effectiveness of each approach might vary based on your implementation and requirements.

Parent topic: Evaluation metrics