To calculate the record latency metric, response_time value from your scoring requests is used to track the time that your model deployment takes to process scoring requests.
For watsonx.ai Runtime deployments, the response_time value is automatically detected when you configure evaluations.
For external and custom deployments, you must specify the response_time value when you send scoring requests to calculate throughput and latency as shown in the following example from the Python ソフトウェア開発キット: