Testing AI service deployments
Last updated: Nov 08, 2024

AI services provide online inferencing capabilities, which you can use for real-time inferencing on new data. For use cases that require batch scoring, you can run a batch deployment job to test your AI service deployment.

Methods for testing AI service deployments

AI services provide online inferencing capabilities, which you can use for real-time inferencing on new data. This inferencing capability is useful for applications that require real-time predictions, such as chatbots, recommendation systems, and fraud detection systems.

To test AI services for online inferencing, you can use the watsonx.ai user interface, Python client library, or REST API. For more information, see Inferencing AI service deployments for online scoring.

To test AI services for inferencing streaming applications, you can use the watsonx.ai Python client library or REST API. For more information, see Inferencing AI service deployments for streaming applications.

For use cases that require batch scoring, create and run a batch deployment job. You can provide any valid JSON input as a payload to the REST API endpoint for inferencing. For more information, see Testing batch deployments for AI services.

Testing AI service deployments for online scoring

You can inference your AI service online deployment by using the watsonx.ai user interface, REST API, or watsonx.ai Python client library.

Inferencing with the user interface

You can use the user interface for inferencing your AI services by providing any valid JSON input payload that aligns with your model schema. Follow these steps for inferencing your AI service deployment:

  1. In your deployment space or your project, open the Deployments tab and click the deployment name.

  2. Click the Test tab to input prompt text and get a response from the deployed asset.

  3. Enter test data in JSON format to generate output in JSON format. For a sample payload, see the example after these steps.

    Restriction:

    You cannot enter input data in text format to test your AI service deployment. You must use JSON format for testing.

    Figure: Inferencing an online deployment for AI services from the user interface
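For example, if your AI service's generate() function expects a query field (as in the Python client example later in this topic), the test input that you paste into the Test tab might look like this hypothetical payload; the exact fields depend on the input schema that your AI service expects:

{
  "query": "What can this AI service tell me about my data?"
}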

Inferencing with Python client library

Use the run_ai_service function to run an AI service by providing a scoring payload:

# The payload can be any valid JSON that your AI service's generate() function accepts
ai_service_payload = {"query": "testing ai_service generate function"}

client.deployments.run_ai_service(deployment_id, ai_service_payload)

For more information, see watsonx.ai Python client library documentation for running an AI service asset.
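The snippet above assumes that you already have an authenticated client and a deployment_id. The following is a minimal end-to-end sketch, assuming the ibm_watsonx_ai Python client on IBM Cloud; the endpoint URL, API key, space ID, and deployment ID are placeholders that you replace with your own values:

from ibm_watsonx_ai import APIClient, Credentials

# Placeholder credentials: replace with your own endpoint and IBM Cloud API key
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="<your IBM Cloud API key>",
)
client = APIClient(credentials)

# Work in the deployment space that contains your AI service deployment
client.set.default_space("<your deployment space ID>")

deployment_id = "<your AI service deployment ID>"
ai_service_payload = {"query": "testing ai_service generate function"}

response = client.deployments.run_ai_service(deployment_id, ai_service_payload)
print(response)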

Inferencing with REST API

Use the /ml/v4/deployments/{id_or_name}/ai_service REST endpoint for inferencing AI services for online scoring (AI services that include the generate() function):

import json
from json.decoder import JSONDecodeError

import requests

# Any valid JSON is accepted in the POST request to /ml/v4/deployments/{id_or_name}/ai_service
payload = {
    "mode": "json-custom-header",
    't1': 12345,
    't2': ["hello", "wml"],
    't3': {
        "abc": "def",
        "ghi": "jkl"
    }
}

response = requests.post(
    f'{HOST}/ml/v4/deployments/{dep_id}/ai_service?version={VERSION}',
    headers=headers,
    verify=False,
    json=payload
)

print(f'POST {HOST}/ml/v4/deployments/{dep_id}/ai_service?version={VERSION}', response.status_code)

try:
    print(json.dumps(response.json(), indent=2))
except JSONDecodeError:
    print(f"response text >>\n{response.text}")

print(f"\nThe user customized content-type >> '{response.headers['Content-Type']}'")

Testing AI service deployments for streaming applications

You can inference your AI service online deployment for streaming applications by using the watsonx.ai REST API or watsonx.ai Python client library.

Inferencing with Python client library

Use the run_ai_service_stream function to run an AI service for streaming applications by providing a scoring payload.

ai_service_payload = {"sse": ["Hello", "WatsonX", "!"]}

for data in client.deployments.run_ai_service_stream(deployment_id, ai_service_payload):
    print(data)

For more information, see watsonx.ai Python client library documentation for running an AI service asset for streaming.
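If you need the complete response rather than the individual chunks, you can accumulate the streamed pieces as they arrive. The following is a small sketch, assuming that each chunk is a string, as in the example payload above:

# Collect the streamed chunks into a single string
chunks = []
for data in client.deployments.run_ai_service_stream(deployment_id, ai_service_payload):
    chunks.append(data)

full_response = "".join(chunks)
print(full_response)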

Inferencing with REST API

Use the /ml/v4/deployments/{id_or_name}/ai_service_stream REST endpoint for inferencing AI services for streaming applications (AI services that include the generate_stream() function):

# Run each of the following blocks in a separate notebook cell.

# Get a bearer token from the authenticated Python client
TOKEN = f'{client._get_icptoken()}'

%%writefile payload.json
{
    "sse": ["Hello", "WatsonX", "!"]
}

!curl -N \
    -H "Content-Type: application/json" -H "Authorization: Bearer {TOKEN}" \
    {HOST}'/ml/v4/deployments/'{dep_id}'/ai_service_stream?version='{VERSION} \
    --data @payload.json

Output

id: 1
event: message
data: Hello

id: 2
event: message
data: WatsonX

id: 3
event: message
data: !

id: 4
event: eos
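If you prefer to call the streaming endpoint from Python rather than curl, the following is a minimal sketch with the requests library. It reuses the HOST, dep_id, VERSION, and headers variables from the earlier REST example and prints the raw server-sent event lines:

import requests

stream_payload = {"sse": ["Hello", "WatsonX", "!"]}

with requests.post(
    f"{HOST}/ml/v4/deployments/{dep_id}/ai_service_stream?version={VERSION}",
    headers=headers,
    json=stream_payload,
    stream=True,  # keep the connection open and read the response incrementally
) as response:
    for line in response.iter_lines(decode_unicode=True):
        if line:  # skip the blank lines that separate server-sent events
            print(line)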

Testing batch deployments for AI services

You can use the watsonx.ai user interface, Python client library, or REST API to create and run a batch deployment job if you created a batch deployment for your AI service (an AI service asset that contains the generate_batch() function).

Prerequisites

You must set up your task credentials to run a batch deployment job for your AI service. For more information, see Managing task credentials.

Testing with user interface

To learn more about creating and running a batch deployment job for your AI service from the deployment space user interface, see Creating jobs in deployment spaces.

Testing with Python client library

Follow these steps to test your batch deployment by creating and running a job:

  1. Create a job to test your AI service batch deployment:

    # Input data comes from a connection asset (for example, a Cloud Object Storage bucket);
    # the output is written to a data asset in the deployment space
    batch_reference_payload = {
        "input_data_references": [
            {
                "type": "connection_asset",
                "connection": {"id": "2d07a6b4-8fa9-43ab-91c8-befcd9dab8d2"},
                "location": {
                    "bucket": "wml-v4-fvt-batch-pytorch-connection-input",
                    "file_name": "testing-123",
                },
            }
        ],
        "output_data_reference": {
            "type": "data_asset",
            "location": {"name": "nb-pytorch_output.zip"},
        },
    }

    job_details = client.deployments.create_job(deployment_id, batch_reference_payload)
    job_uid = client.deployments.get_job_uid(job_details)
    print("The job id:", job_uid)
    
  2. Check the job status. To wait until the job finishes, see the polling sketch after these steps:

    # Get the job status
    client.deployments.get_job_status(job_uid)
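To wait for the batch job to reach a terminal state instead of checking its status manually, you can poll it in a loop. The following is a minimal sketch, assuming that the dictionary returned by get_job_status contains a state field with values such as completed, failed, or canceled:

import time

# Poll the job status until it reaches a terminal state
while True:
    status = client.deployments.get_job_status(job_uid)
    state = status.get("state")
    print("Current state:", state)
    if state in ("completed", "failed", "canceled"):
        break
    time.sleep(10)  # wait before checking again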
    


Parent topic: Deploying AI services
