Watson OpenScale model health monitor evaluation metrics
Watson OpenScale enables model health monitor evaluations by default for production model deployments to help you understand your model behavior and performance. You can use model health metrics to determine how efficiently your model deployment processes your transactions.
You can view the results of your model health evaluations on the Insights dashboard in Watson OpenScale. To view results, you can select a model deployment tile and click the arrow in the Model health evaluation section to display a summary of model health metrics from your last evaluation. For more information, see Reviewing evaluation results.
When model health evaluations are enabled, Watson OpenScale creates a model health data set in the Watson OpenScale data mart. The model health data set stores details about your scoring requests that Watson OpenScale uses to calculate model health metrics.
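If you want to inspect that data set programmatically, you can look it up with the Watson OpenScale Python SDK. The following is a minimal sketch, assuming an IAM API key and an existing subscription; the placeholder values are yours to supply, and the printed attribute names may vary by SDK version:

```python
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson_openscale import APIClient

# <api_key> and <subscription_id> are placeholders that you supply.
authenticator = IAMAuthenticator(apikey="<api_key>")
client = APIClient(authenticator=authenticator)

# List the data sets that Watson OpenScale created for your subscription.
# The model health data set appears alongside others, such as payload logging.
response = client.data_sets.list(
    target_target_id="<subscription_id>",
    target_target_type="subscription",
).result

for data_set in response.data_sets:
    print(data_set.metadata.id, data_set.entity.type)
```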
Watson OpenScale does not support model health evaluations for batch deployments.
Supported model health metrics
Watson OpenScale supports the following metric categories for model health evaluations. Each category contains metrics that provide details about your model performance:
- Payload size: Watson OpenScale calculates the total, average, minimum, maximum, and median payload size, in kilobytes (KB), of the transaction records that your model deployment processes across scoring requests. Watson OpenScale does not support payload size metrics for image models.
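As an illustration of what a payload-size measurement involves (this is a sketch, not the Watson OpenScale implementation), the size of a scoring request and response can be approximated from their serialized JSON; the dictionaries below are made-up placeholder payloads:

```python
import json

# Illustrative sketch: approximate the size in kilobytes (KB) of a scoring
# request and response from their serialized JSON. The dictionaries are
# made-up placeholder payloads, not a real scoring schema.
openscale_input = {"fields": ["age", "income"], "values": [[34, 52000]]}
openscale_output = {"predictions": [{"fields": ["prediction"], "values": [[0]]}]}

payload_size_kb = (
    len(json.dumps(openscale_input).encode("utf-8"))
    + len(json.dumps(openscale_output).encode("utf-8"))
) / 1024
print(f"Approximate payload size: {payload_size_kb:.3f} KB")
```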
- Records: Watson OpenScale calculates the total, average, minimum, maximum, and median number of transaction records that are processed across scoring requests during model health evaluations.
- Scoring requests: Watson OpenScale calculates the number of scoring requests that your model deployment receives during model health evaluations.
- Throughput and latency: Watson OpenScale calculates latency by tracking the time that it takes to process scoring requests and transaction records, in milliseconds (ms). Watson OpenScale calculates throughput by tracking the number of scoring requests and transaction records that are processed per second.

To calculate throughput and latency, Watson OpenScale uses the `response_time` value from your scoring requests to track the time that your model deployment takes to process them. For Watson Machine Learning deployments, Watson OpenScale automatically detects the `response_time` value when you configure evaluations. For external and custom deployments, you must specify the `response_time` value when you send scoring requests, as shown in the following example from the Watson OpenScale Python SDK:

```python
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

client.data_sets.store_records(
    data_set_id=payload_data_set_id,
    request_body=[
        PayloadRecord(
            scoring_id=<uuid>,
            request=openscale_input,
            response=openscale_output,
            response_time=<response_time>,
            user_id=<user_id>
        )
    ]
)
```
Watson OpenScale calculates the following metrics to measure throughput and latency during evaluations:
- API latency: Time taken (in ms) to process a scoring request by your model deployment.
- API throughput: Number of scoring requests that your model deployment processes per second.
- Record latency: Time taken (in ms) to process a transaction record by your model deployment.
- Record throughput: Number of transaction records that your model deployment processes per second.
Watson OpenScale calculates the average, maximum, median, and minimum throughput and latency for scoring requests and transaction records.
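To make the relationship between these metrics concrete, the following sketch computes them from made-up response times; it assumes requests are processed sequentially and is an illustration, not the Watson OpenScale implementation:

```python
# Illustrative sketch of how the four throughput and latency metrics relate.
# Each tuple is (response_time_ms, record_count) for one scoring request;
# the values are made up, and requests are assumed to run sequentially.
requests = [(120, 10), (80, 5), (200, 20)]

api_latency_ms = [t for t, _ in requests]                       # ms per request
record_latency_ms = [t / n for t, n in requests]                # ms per record
total_time_s = sum(t for t, _ in requests) / 1000.0

api_throughput = len(requests) / total_time_s                   # requests per second
record_throughput = sum(n for _, n in requests) / total_time_s  # records per second

print(f"Average API latency: {sum(api_latency_ms) / len(api_latency_ms):.1f} ms")
print(f"API throughput: {api_throughput:.1f} requests/sec")
print(f"Record throughput: {record_throughput:.1f} records/sec")
```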
- Users: Watson OpenScale calculates the number of users that send scoring requests to your model deployments.

To calculate the number of users, Watson OpenScale uses the `user_id` value from scoring requests to identify the users that send the scoring requests that your model receives. For Watson Machine Learning deployments, Watson OpenScale automatically detects the `user_id` value when you configure evaluations. For external and custom deployments, you must specify the `user_id` value when you send scoring requests, as shown in the following example from the Watson OpenScale Python SDK:

```python
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

client.data_sets.store_records(
    data_set_id=payload_data_set_id,
    request_body=[
        PayloadRecord(
            scoring_id=<uuid>,
            request=openscale_input,
            response=openscale_output,
            response_time=<response_time>,
            user_id=<user_id>  # value to be supplied by user
        )
    ]
)
```
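Conceptually, the Users metric reduces to a distinct count of the `user_id` values observed during an evaluation window, as this illustrative sketch shows:

```python
# Illustrative only: the Users metric is conceptually a distinct count of
# the user_id values seen in scoring requests during an evaluation window.
user_ids = ["alice", "bob", "alice", "carol"]  # made-up values
print(f"Total users: {len(set(user_ids))}")
```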
When you view the Users metric in Watson OpenScale, use the real-time view to see the total number of users and the aggregated views to see the average number of users. For more information, see Reviewing model health results.