Deploying AI services with tools

Last updated: Jan 30, 2025

When you use visual tools to build a generative AI application for a complex use case such as Retrieval Augmented Generation (RAG) or agentic AI, your application is deployed as an AI service. You can deploy your application as an AI service directly by using visual tools. Alternatively, you can save your work in a deployment notebook when you use the Prompt Lab or Agent Lab to create your solution.

Process overview

When you use visual tools to build a generative AI solution for a complex use case, you deploy your solution as an AI service. When you deploy AI services from tools, you create an online deployment without needing to write any code. In contrast, to create a batch deployment, you must save your work as a notebook, write code for a generation function, and then create a deployment programmatically. For more information, see Deploying AI service assets.

The AI service captures the logic for your use case and contains the generation function, which is a deployable unit of code. The deployment is exposed as a REST API endpoint that can be accessed by other applications. You can send a request to the REST API endpoint for using the deployed AI service for inferencing. The deployed AI service processes the request and returns a response. For example, you can create the following types of AI solutions with the tools and then deploy them as AI services:

Retrieval-augmented generation (RAG) solution with the Prompt Lab or AutoAI
AI agent with the Agent Lab

When you deploy an AI service from the Prompt Lab or Agent Lab, you have these options:

Deploy directly: Use this option if your solution is complete and you don’t want to make further changes.
Deployment notebook: Use this option if you want to customize your solution by adding or altering the code.

When you deploy an AI service from AutoAI, you deploy it directly.

Tasks for deploying AI services with visual tools

Here are the steps that you must follow to create, deploy, and manage AI services with visual tools:

Deploy AI service: Depending on the tool that you used to build your generative AI use case, you can deploy your solution directly from the visual tool if you used the Agent Lab, Prompt Lab, or AutoAI. If you use the Agent Lab or Prompt Lab to build your solution, you can also save your solution in a deployment notebook that contains the AI service. Choose a method that is best suited for your use case.
Testing AI service deployment: Test your deployed AI service for online inferencing or batch scoring.
Manage AI services: Access and update deployment details. Scale or delete the deployment from the user interface or programmatically.

Deploying AI services with Agent Lab

When you use the Agent Lab to build an agent and deploy your work as an AI service, the logic for your agentic AI application is automatically captured in an AI service asset and an online deployment is created automatically for the asset.

Before you begin

You must have an existing target deployment space or create a new one where you want to deploy your AI service asset.
Build an agentic AI solution in the Agent Lab from your project. For more information, see Agent Lab (beta).
You must set up your task credentials by generating an API key. For more information, see Managing task credentials.

Deploying AI services directly

Follow these steps to create an online deployment for an AI service from the Agent Lab tool:

To deploy your work from the Agent Lab as an AI service, click Deploy.
Enter your deployment details, choose your deployment space, and click Create.

This procedure automatically creates an online deployment for your AI service asset in your project or deployment space. To create a batch deployment for your AI service asset, you must follow the process to manually create a batch deployment from your deployment space. For more information, see Deploying AI service assets.

Deploying AI services with a deployment notebook in Agent Lab

To customize the programming logic of your agentic AI application, you can use the Agent Lab to save your work in a deployment notebook. When you save your work in a deployment notebook, watsonx.ai automatically generates a deployment notebook which captures the logic of your agentic AI application in an AI service.

The deployment notebook contains auto-generated code to promote your AI service asset to a deployment space and create a deployment for the asset. You can edit the deployment notebook for customization, such as creating a batch deployment to deploy an AI service asset instead of an online deployment for your use case.

To save your work in a deployment notebook that contains an AI service from the watsonx.ai Prompt lab, follow these steps:

Work with the Agent Lab to build an agentic AI solution with Agents.
Click the Save icon and select Save as from the dropdrown menu.
In the Save you work dialog box, select Deployment notebook.
Note: The deployment notebook contains the code to test, promote, and deploy an AI service. To deploy your application, you must save your work in a deployment notebook. You cannot use a standard notebook to deploy an AI service asset.
In the Define details section, enter a name and an optional description for your deployment notebook.
Click Save.

When you save your work in a deployment notebook, watsonx.ai automatically generates a notebook which contains the code to test, promote, and deploy an AI service. To create an online deployment for your AI service, run the cells in the deployment notebook.

You can customize the auto-generated deployment notebook for your agentic AI applications by clicking the Edit icon.

Inferencing AI services deployed from Agent Lab

The AI service deployed by the notebook can be consumed using a REST API. The following is an example cURL request to call your deployment:

curl --location '${PUBLIC_ENDPOINT}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ${IAM_TOKEN}' \
--data '{ \
  "messages": [{$MESSAGES}]
}'

where

PUBLIC_ENDPOINT is the public endpoint of your deployment.
IAM_TOKEN is the authentication token to access IBM Cloud services. The access token that you use must be associated with the same account as the project that is referenced in the notebook.

MESSAGES is an array of the chat history text entries with the following schema:

{
 "role": type, // "user" or "assistant"
 "content": content // The text content of the message
}

Deploy an AI service with Prompt Lab

When you use the Prompt Lab to deploy your work as an AI service, the logic for your RAG application is automatically captured in an AI service asset and an online deployment is created automatically for the asset.

Before you begin

You must have an existing target deployment space or create a new one where you want to deploy your AI service asset.
You must create a vector index (in-memory vector store or vector database) to chat with documents. For more information, see Chatting with documents and images.
Build a generative AI solution that uses RAG from your project.
You must set up your task credentials by generating an API key. For more information, see Managing task credentials.

Deploying an AI service directly

Follow these steps to create an online deployment for an AI service from the watsonx.ai Prompt lab:

To deploy your work from the Prompt Lab as an AI service, click Deploy.
Enter your deployment details, choose your deployment space, and click Create.

Deploying AI services with a deployment notebook in Prompt Lab

To customize the programming logic of your generative AI application, you can use the Prompt Lab to save your work in a deployment notebook. When you save your work in a deployment notebook, watsonx.ai automatically generates a deployment notebook which captures the logic of your generative AI application in an AI service.

To save your work in a deployment notebook that contains an AI service from the watsonx.ai Prompt lab, follow these steps:

Work with the Prompt Lab to create a generative AI solution.
Click the Save icon and select Save as from the dropdrown menu.
In the Save you work dialog box, select Deployment notebook.
Note: The deployment notebook contains the code to test, promote, and deploy an AI service. For deploying your application, you must save your work in a deployment notebook. You cannot use a standard notebook to deploy an AI service asset.
In the Define details section, enter a name and an optional description for your deployment notebook.
Click Save.

You can customize the auto-generated deployment notebook for your generative AI applications by clicking the Edit icon .

Inferencing AI services deployed from Prompt Lab

The AI service deployed by the notebook can be consumed using a REST API. The following is an example cURL request to call your deployment:

curl --location '${PUBLIC_ENDPOINT}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ${IAM_TOKEN}' \
--data '{ \
  "input_data": [{
      "fields": ["Search", "Access token"],
      "values": [
        [${MESSAGES}],
        [${IAM_TOKEN}]]
    }]
}'

where

PUBLIC_ENDPOINT is the public endpoint of your deployment.
IAM_TOKEN is the authentication token to access IBM Cloud services. The access token that you use must be associated with the same account as the project that is referenced in the notebook.

MESSAGES is an array of the chat history text entries with the following schema:

{
 "role": type, // "user" or "assistant"
 "content": content // The text content of the message
}

Deploying AI services with AutoAI

When you use AutoAI to create an experiment that uses RAG and deploy your work as an AI service, the logic for your RAG application is automatically captured in an AI service asset and an online deployment is created automatically for the asset.

Before you begin

You must have an existing target deployment space or create a new one where you want to deploy your AI service asset.
You must create a vector index (in-memory vector store or vector database) to chat with documents. For more information, see Chatting with documents and images.
Build a generative AI solution that uses RAG with the AutoAI experiment builder from your project. For more information, see Building an AutoAI experiment that uses RAG.
You must set up your task credentials by generating an API key. For more information, see Managing task credentials.

Deploying AI services directly

Follow these steps to create an online deployment for an AI service from the AutoAI experiment builder tool:

To deploy your work from the AutoAI experiment builder, choose the best performing pipeline for deployment and click Save as.
Choose Retrieval and generation as the objective, and select the AI service asset type.
Enable the option to promote and deploy the AI service to a deployment space.
Choose your deployment space and click Create and save.

Next steps

Parent topic: Deploying AI services