Last updated: Nov 21, 2024
To create custom evaluations, select a set of custom metrics to quantitatively track your model deployment and business application. You can define these custom metrics and use them alongside metrics that are generated by other types of evaluations.
You can use one of the following methods to manage custom evaluations and metrics:
Managing custom metrics with the Python SDK
To manage custom metrics with the Python SDK, you must register a custom monitor with a metrics definition, enable the monitor for your subscription, and store metric values, as described in the following steps. The advanced tutorial Working with IBM watsonx.ai Runtime shows how to do this.
You can disable and re-enable custom monitoring at any time, and you can remove a custom monitor if you no longer need it.
For more information, see the Python SDK documentation.
Step 1: Register custom monitor with metrics definition.
Before you can start using custom metrics, you must register the custom monitor, which is the processor that tracks the metrics. You must also define the metrics themselves.
- Use the get_definition(monitor_name) method to import the Metric and Tag objects.
- Use the metrics method to define the metrics, which require name, thresholds, and type values.
- Use the tags method to define metadata.
The following code is from the working sample notebook that was previously mentioned:
def get_definition(monitor_name):
    monitor_definitions = wos_client.monitor_definitions.list().result.monitor_definitions
    for definition in monitor_definitions:
        if monitor_name == definition.entity.name:
            return definition
    return None

monitor_name = 'my model performance'

metrics = [MonitorMetricRequest(name='sensitivity',
                                thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.8)]),
           MonitorMetricRequest(name='specificity',
                                thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.75)])]

tags = [MonitorTagRequest(name='region', description='customer geographical region')]

existing_definition = get_definition(monitor_name)
if existing_definition is None:
    custom_monitor_details = wos_client.monitor_definitions.add(name=monitor_name, metrics=metrics, tags=tags, background_mode=False).result
else:
    custom_monitor_details = existing_definition
To check how you're doing, run the client.data_mart.monitors.list() command to see whether your newly created monitor and metrics are configured properly.
You can also get the monitor ID by running the following command:
custom_monitor_id = custom_monitor_details.metadata.id
print(custom_monitor_id)
For a more detailed look, run the following command:
custom_monitor_details = wos_client.monitor_definitions.get(monitor_definition_id=custom_monitor_id).result
print('Monitor definition details:', custom_monitor_details)
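The lookup in get_definition makes registration idempotent: the monitor is created only when no definition with that name already exists, so the notebook can be re-run safely. A minimal standalone sketch of that pattern follows; the FakeDefinitions class is a hypothetical in-memory stand-in for wos_client.monitor_definitions, used only so the example runs without a service instance, and is not part of the SDK.

```python
# Standalone sketch of the register-or-reuse pattern from Step 1.
# FakeDefinitions is a hypothetical stand-in for the SDK client,
# NOT part of ibm-watson-openscale.
class FakeDefinitions:
    def __init__(self):
        self._defs = {}  # name -> definition dict

    def list(self):
        return list(self._defs.values())

    def add(self, name, metrics, tags):
        definition = {"name": name, "metrics": metrics, "tags": tags}
        self._defs[name] = definition
        return definition

def get_or_create(defs, name, metrics, tags):
    # Look the monitor up by name and create it only if it is absent,
    # so re-running the notebook never registers a duplicate.
    for definition in defs.list():
        if definition["name"] == name:
            return definition
    return defs.add(name, metrics, tags)

defs = FakeDefinitions()
first = get_or_create(defs, "my model performance", ["sensitivity"], ["region"])
second = get_or_create(defs, "my model performance", ["sensitivity"], ["region"])
assert first is second        # second call reuses the existing definition
assert len(defs.list()) == 1  # no duplicate registered
```

The same shape appears in the notebook code above: get_definition returning None is the signal to call monitor_definitions.add.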
Step 2: Enable custom monitor.
Next, you must enable the custom monitor for the subscription. This activates the monitor and sets the thresholds.
- Use the target method to import the Threshold object.
- Use the thresholds method to set the metric lower_limit value. Supply the metric_id value as one of the parameters. If you don't remember it, you can always use the custom_monitor_details command to get the details as shown in the previous example.
The following code is from the working sample notebook that was previously mentioned:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)

thresholds = [MetricThresholdOverride(metric_id='sensitivity', type=MetricThresholdTypes.LOWER_LIMIT, value=0.9)]

custom_monitor_instance_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=custom_monitor_id,
    target=target,
    thresholds=thresholds
).result
To check on your configuration details, use the subscription.monitoring.get_details(monitor_uid=monitor_uid) command.
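A LOWER_LIMIT threshold means the metric breaches when its value falls below the limit; the MetricThresholdOverride above tightens sensitivity from the registered default of 0.8 to 0.9 for this subscription. A small plain-Python sketch of that semantics (an illustration, not SDK code):

```python
# Sketch of lower-limit threshold semantics (plain Python, not SDK code).
# A metric breaches a LOWER_LIMIT threshold when its value drops below the limit.
def breached(value, lower_limit):
    return value < lower_limit

# Defaults registered in Step 1, with the sensitivity override from Step 2.
thresholds = {"sensitivity": 0.8, "specificity": 0.75}
thresholds["sensitivity"] = 0.9  # per-subscription override

measured = {"sensitivity": 0.67, "specificity": 0.78}
alerts = {m: breached(v, thresholds[m]) for m, v in measured.items()}
print(alerts)  # sensitivity breaches the 0.9 limit; specificity does not
```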
Step 3: Store metric values.
You must store, or save, your custom metrics to the region where your service instance exists.
- Use the metrics method to set which metrics you are storing.
- Use the subscription.monitoring.store_metrics method to commit the metrics.
The following code is from the working sample notebook that was previously mentioned:
from datetime import datetime, timezone, timedelta
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import MonitorMeasurementRequest

custom_monitoring_run_id = "11122223333111abc"
measurement_request = [MonitorMeasurementRequest(timestamp=datetime.now(timezone.utc),
                                                 metrics=[{"specificity": 0.78, "sensitivity": 0.67, "region": "us-south"}],
                                                 run_id=custom_monitoring_run_id)]
print(measurement_request[0])

published_measurement_response = wos_client.monitor_instances.measurements.add(
    monitor_instance_id=custom_monitor_instance_id,
    monitor_measurement_request=measurement_request).result
published_measurement_id = published_measurement_response[0]["measurement_id"]
print(published_measurement_response)
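The sensitivity and specificity values stored above would normally come from scoring your deployment against labeled data. As an illustration using the standard definitions (sensitivity is the true positive rate, specificity the true negative rate; this is not SDK code):

```python
# Compute sensitivity (true positive rate) and specificity (true negative
# rate) from labeled predictions; standard definitions, not SDK code.
def sensitivity_specificity(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

# The resulting dict has the same shape as the metrics entry stored above.
metrics = {"sensitivity": sens, "specificity": spec, "region": "us-south"}
```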
To retrieve the measurement that you stored, run the following command:
published_measurement = wos_client.monitor_instances.measurements.get(monitor_instance_id=custom_monitor_instance_id, measurement_id=published_measurement_id).result
print(published_measurement)
Managing custom metrics with watsonx.governance
Step 1: Add metric groups
- On the Configure tab, click Add metric group.
- If you want to configure a metric group manually, click Configure new group.
  a. Specify a name and a description for the metric group. The length of the name that you specify must be less than or equal to 48 characters.
  b. Click the Edit icon on the Input parameters tile and specify the details for your input parameters. The parameter name that you specify must match the parameter name that is specified in the metric API.
  c. If the parameter is required to configure your custom monitor, select the Required parameter checkbox.
  d. Click Add. After you add the input parameters, click Next.
  e. Select the model types that your evaluation supports and click Next.
  f. If you don't want to specify an evaluation schedule, click Save.
  g. If you want to specify an evaluation schedule, click the toggle, specify the interval for the evaluation schedule, and click Save.
  h. Click Add metric, specify the metric details, and click Save.
- If you want to configure a metric group by using a JSON file, click Import from file. Upload a JSON file and click Import.
Step 2: Add metric endpoints
- In the Metric endpoints section, click Add metric endpoint.
- Specify a name and a description for the metric endpoint.
- Click the Edit icon on the Connection tile and specify the connection details. Click Next.
- Select the metric groups that you want to associate with the metric endpoint and click Save.
Step 3: Configure custom monitors
- On the Insights Dashboard page, select Configure monitors on a model deployment tile.
- In the Evaluations section, select the name of the metric group that you added.
- Select the Edit icon on the Metric endpoint tile.
- Select a metric endpoint and click Next. If you don't want to use a metric endpoint, select None.
- Use the toggles to specify the metrics that you want to use to evaluate the model and provide threshold values. Click Next.
- Specify values for the input parameters. If you selected JSON as the data type for the metric group, add the JSON data. Click Next.
You can now evaluate models with a custom monitor.
Accessing and visualizing custom metrics
To access and visualize custom metrics, you can use a programmatic interface. The following advanced tutorial shows how to do this:
- Working with IBM watsonx.ai Runtime
For more information, see the Python SDK documentation.
Visualization of your custom metrics appears on the Insights Dashboard.
Learn more
Parent topic: Configuring model evaluations