Watson OpenScale quality metrics
When you enable quality evaluations in Watson OpenScale, you can generate metrics that help you determine how well your model predicts outcomes.
You can view the results of your quality evaluations on the Insights dashboard in Watson OpenScale. To view results, you can select a model deployment tile and click the arrow in the Quality evaluation section to display a summary of quality metrics from your last evaluation. For more information, see Reviewing quality results.
Quality metrics are calculated with manually labeled feedback data and monitored deployment responses. For more information, see Managing feedback data.
Supported quality metrics
The following quality metrics are supported by Watson OpenScale:
Binary classification problems
For binary classification models, Watson OpenScale tracks when the quality of the model falls below an acceptable level by checking the Area under ROC score, which measures the model's ability to distinguish between two classes. For example, models with higher Area under ROC scores are better at identifying class A as class A and class B as class B. The following metrics measure binary classification problems:
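To make the Area under ROC score concrete, the following minimal sketch computes it as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name `area_under_roc` and the sample labels and scores are illustrative assumptions; this is not Watson OpenScale's own implementation.

```python
def area_under_roc(labels, scores):
    """Estimate Area under ROC by pairwise comparison.

    labels: 1 for the positive class, 0 for the negative class.
    scores: the model's score for the positive class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical feedback labels and model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.6, 0.5, 0.2]
auc = area_under_roc(labels, scores)
print(f"Area under ROC: {auc:.3f}")
```

A score of 1.0 means every positive example outranks every negative one, and 0.5 is what random scoring would achieve on average.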
Regression problems
For regression models, Watson OpenScale tracks when the quality of the model falls below an acceptable level and checks the R-squared score. The R-squared score measures how closely predicted values match actual values, expressed as the proportion of the variance in the actual values that the predictions explain. For example, models with higher R-squared scores fit the actual values better. The following metrics measure regression problems:
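As an illustration of the R-squared score, the sketch below uses the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are assumptions made for this example, not Watson OpenScale code.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error left over after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared deviation of the actual values from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R-squared is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical labeled values and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.5]
r2 = r_squared(actual, predicted)
print(f"R-squared: {r2:.4f}")
```

A value close to 1 indicates that the predictions track the actual values closely. A value near 0 means the model does no better than always predicting the mean.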
Multiclass classification problems
For multiclass classification models, Watson OpenScale tracks when the quality of the model falls below an acceptable level and checks the Accuracy score, which is the percentage of accurate predictions. The following metrics measure multiclass classification problems:
- Accuracy
- Weighted True Positive Rate (wTPR)
- Weighted False Positive Rate (wFPR)
- Weighted recall
- Weighted precision
- Weighted F1-Measure
- Logarithmic loss
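Several of the metrics listed above can be sketched in a few lines. The example below computes accuracy together with weighted precision, recall, and F1, assuming that "weighted" means each class's score is weighted by its share of the labeled examples (its support). That weighting, the function name `multiclass_quality`, and the sample data are assumptions for illustration; Watson OpenScale's exact calculation may differ.

```python
from collections import Counter


def multiclass_quality(actual, predicted):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(actual)
    support = Counter(actual)  # number of labeled examples per class
    pairs = list(zip(actual, predicted))

    accuracy = sum(a == p for a, p in pairs) / n
    w_precision = w_recall = w_f1 = 0.0

    for label, count in support.items():
        true_pos = sum(a == label and p == label for a, p in pairs)
        predicted_as = sum(p == label for p in predicted)
        precision = true_pos / predicted_as if predicted_as else 0.0
        recall = true_pos / count
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        weight = count / n  # class share of the labeled data
        w_precision += weight * precision
        w_recall += weight * recall
        w_f1 += weight * f1

    return {
        "accuracy": accuracy,
        "weighted_precision": w_precision,
        "weighted_recall": w_recall,
        "weighted_f1": w_f1,
    }


# Hypothetical feedback labels and model predictions.
actual = ["a", "a", "a", "b", "b", "c"]
predicted = ["a", "a", "b", "b", "c", "c"]
result = multiclass_quality(actual, predicted)
print(result)
```

With support weighting, weighted recall works out to the same value as accuracy, because each class contributes its count of correct predictions divided by the total number of examples.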
After Watson OpenScale detects problems with quality, such as accuracy threshold violations, you must build a new version of the model that fixes the problem. To do so, retrain the model on the manually labeled data in the feedback table together with the original training data.
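The idea of a threshold violation followed by retraining can be sketched as follows. This is a simplified illustration only: the function `find_violations`, the metric names, the threshold values, and the row data are all hypothetical and do not represent the Watson OpenScale API. It also assumes metrics where a higher value is better, so a violation means falling below a lower threshold.

```python
def find_violations(metrics, lower_thresholds):
    """Return metrics whose value fell below their lower threshold."""
    return {
        name: (value, lower_thresholds[name])
        for name, value in metrics.items()
        if name in lower_thresholds and value < lower_thresholds[name]
    }


# Hypothetical results from the latest quality evaluation.
metrics = {"accuracy": 0.72, "area_under_roc": 0.85}
lower_thresholds = {"accuracy": 0.80, "area_under_roc": 0.80}

violations = find_violations(metrics, lower_thresholds)

# Hypothetical rows: (feature vector, label).
original_training_rows = [([0.1, 0.2], 0), ([0.8, 0.9], 1)]
feedback_rows = [([0.4, 0.7], 1)]

if violations:
    # Retrain on the original training data plus the labeled feedback data.
    retraining_rows = original_training_rows + feedback_rows
    print(f"Violations: {violations}; retraining on {len(retraining_rows)} rows")
```

A metric such as logarithmic loss, where lower is better, would need an upper threshold instead.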