Accuracy in Watson OpenScale quality metrics

Last updated: Jul 27, 2023

In Watson OpenScale, accuracy measures the proportion of predictions that your model gets correct.

Accuracy at a glance

Description: The proportion of correct predictions
Default thresholds: Lower limit = 80%
Default recommendation:
- Upward trend: An upward trend indicates that the metric is improving. This means that model retraining is effective.
- Downward trend: A downward trend indicates that the metric is deteriorating. This means that the feedback data is becoming significantly different from the training data.
- Erratic or irregular variation: An erratic or irregular variation indicates that the feedback data is not consistent between evaluations. Increase the minimum sample size for the Quality monitor.
Problem types: Binary classification and multiclass classification
Chart values: Last value in the timeframe
Metrics details available: Confusion matrix
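As an illustration of how the default threshold and the chart value fit together, the following minimal sketch checks the last accuracy value in a timeframe against the 80% lower limit. The function names and the sample values are hypothetical and are not part of any Watson OpenScale API. The sketch also assumes that a value exactly at the limit does not count as a violation.

```python
# Hypothetical helpers for illustration only; not a Watson OpenScale API.
DEFAULT_LOWER_LIMIT = 0.80  # default lower limit of 80%, expressed as a fraction


def chart_value(evaluations):
    """Return the last accuracy value in the timeframe (oldest value first)."""
    if not evaluations:
        raise ValueError("no evaluations in the timeframe")
    return evaluations[-1]


def violates_lower_limit(accuracy, lower_limit=DEFAULT_LOWER_LIMIT):
    """Return True when accuracy is below the lower limit.

    Assumption: a value exactly equal to the limit is not a violation.
    """
    return accuracy < lower_limit


timeframe = [0.86, 0.83, 0.78]  # made-up accuracy values, oldest first
latest = chart_value(timeframe)
print(latest, violates_lower_limit(latest))
```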

Understanding Accuracy

Accuracy can mean different things depending on the type of the algorithm:

Multi-class classification: Accuracy measures the number of times any class was predicted correctly, normalized by the number of data points. For more details, see Multi-class classification in the Apache Spark documentation.
Binary classification: For a binary classification algorithm, accuracy is measured as the area under an ROC curve. See Binary classification in the Apache Spark documentation for more details.
Regression: Regression algorithms are measured by using the coefficient of determination, or R². For more details, see Regression model evaluation in the Apache Spark documentation.
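The three measures that are described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Watson OpenScale or Apache Spark implementation: the function names are hypothetical, binary labels are assumed to be 0 and 1, and the area under the ROC curve is computed with the rank-based formulation (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as half).

```python
def multiclass_accuracy(labels, predictions):
    """Number of correctly predicted data points, normalized by the total."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def area_under_roc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Uses the pairwise rank formulation; ties between a positive and a
    negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


print(multiclass_accuracy(["a", "b", "c", "a"], ["a", "b", "a", "a"]))
print(area_under_roc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

The pairwise loop is quadratic in the number of data points, which is acceptable for a small illustration but not for large feedback data sets.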

Do the math

Accuracy is defined as the number of true positives and true negatives divided by the total number of predictions, that is, the sum of the true positives, true negatives, false positives, and false negatives.

$$
\text{Accuracy} = \frac{\text{number of true positives} + \text{number of true negatives}}{\text{number of true positives} + \text{number of true negatives} + \text{number of false positives} + \text{number of false negatives}}
$$
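The definition above translates directly into code. The following minimal sketch computes accuracy from the four confusion matrix counts. The function name and the sample counts are hypothetical and are used only for illustration.

```python
def accuracy_from_counts(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("the confusion matrix contains no predictions")
    return (tp + tn) / total


# Made-up counts: 40 true positives, 45 true negatives,
# 5 false positives, and 10 false negatives out of 100 predictions.
print(accuracy_from_counts(tp=40, tn=45, fp=5, fn=10))  # 85 / 100
```

With the default lower limit of 80%, the made-up counts in this example would yield an accuracy that is above the threshold.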

How it works

To evaluate accuracy, you must add manually labeled feedback data. You can add it through the Watson OpenScale UI, by using the Python client, or by using the REST API.

Debiased accuracy

When the data supports it, accuracy is computed for both the original model and the debiased model. IBM Watson OpenScale computes the accuracy for the debiased output and stores it in the payload logging table as an extra column.
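The idea of computing accuracy twice, once for the original output and once for the debiased output, can be sketched as follows. The record layout and the column names label, prediction, and debiased_prediction are assumptions made for this illustration. They do not reflect the actual schema of the payload logging table.

```python
def accuracy(labels, predictions):
    """Proportion of predictions that match the labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Made-up records; the column names are hypothetical.
records = [
    {"label": 1, "prediction": 1, "debiased_prediction": 1},
    {"label": 0, "prediction": 0, "debiased_prediction": 0},
    {"label": 1, "prediction": 0, "debiased_prediction": 1},
    {"label": 0, "prediction": 0, "debiased_prediction": 0},
    {"label": 1, "prediction": 0, "debiased_prediction": 0},
]

labels = [r["label"] for r in records]
original_accuracy = accuracy(labels, [r["prediction"] for r in records])
debiased_accuracy = accuracy(labels, [r["debiased_prediction"] for r in records])

print("original:", original_accuracy)
print("debiased:", debiased_accuracy)
```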

Learn more

Reviewing quality results
