Deploying AI services from visual tools and projects
When you use visual tools to build a generative or agentic AI application for a complex use case such as Retrieval Augmented Generation (RAG) or agentic AI, your application is deployed as an AI service. An AI service is a deployable unit of code that captures the logic of your generative AI application. For example, an AI service for a prompt that chats with grounding documents can manage the logic to retrieve content from the vectorized document index as well as the inferencing with a foundation model to generate a response. An online deployment provides an endpoint for real-time inferencing. After you deploy an AI service as an online deployment, you can test it from the testing interface provided in the space or access the endpoint to put the deployment into production.
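To make the idea of "a deployable unit of code that captures the logic" more concrete, the following is a minimal sketch of a RAG-style AI service written as an outer function that returns a generation function. Everything in it is an illustrative assumption: the `DOCUMENTS` dictionary stands in for a vectorized document index, `fake_model` stands in for a foundation model call, and the request and response shapes are hypothetical rather than the format that watsonx.ai requires.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for a vectorized document index.
DOCUMENTS: Dict[str, str] = {
    "deployment": "An online deployment provides an endpoint for real-time inferencing.",
    "notebook": "A deployment notebook contains code to test, promote, and deploy an AI service.",
}


def retrieve(question: str) -> List[str]:
    """Toy retrieval: return the documents whose key appears in the question."""
    lowered = question.lower()
    return [text for key, text in DOCUMENTS.items() if key in lowered]


def fake_model(prompt: str) -> str:
    """Placeholder for a foundation model call; it only echoes the grounding text."""
    return f"Answer based on: {prompt}"


def ai_service() -> Callable[[dict], dict]:
    """Outer function: prepare shared resources once, then return the generation function."""

    def generate(request: dict) -> dict:
        # Step 1: retrieve grounding content for the question.
        passages = retrieve(request["question"])
        # Step 2: run inference with the retrieved content as context.
        prompt = " ".join(passages) if passages else "no grounding documents found"
        return {"body": {"answer": fake_model(prompt), "sources": len(passages)}}

    return generate


generate = ai_service()
response = generate({"question": "What is an online deployment?"})
```

Keeping retrieval and inference inside one generation function is what lets the whole flow be deployed and called as a single unit.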
Visual tools that deploy AI services
When you build an application with any of the following tools, your application is deployed as an AI service:
- Agent Lab: You can use the Agent Lab to build and deploy agentic AI solutions in watsonx.ai. Agentic AI solutions that you build in the Agent Lab are deployed as AI services.
- Prompt Lab: You can use the Prompt Lab to build and deploy a generative AI solution for a complex use case, such as Retrieval Augmented Generation (RAG). Generative AI solutions for complex use cases that you build in the Prompt Lab are deployed as AI services.
- AutoAI (for RAG): You can use AutoAI to build RAG-based generative AI experiments and deploy the pipeline that performs the best as an AI service.
Deployment methods
Depending on your requirements, you can deploy your application as an AI service from a supported visual tool by using one of the following approaches:
- Deploying directly (fast path): Use this option if your solution is complete and you don’t want to make further changes. If you deploy by using the fast path, an online deployment is created automatically.
- Deployment notebook: Use this option if you want to customize your solution by adding or altering the code, for example, to create a batch deployment.
In addition to using visual tools, you can manually deploy your application as an AI service directly from a project if you created an AI service asset programmatically.
The AI service captures the logic for your use case and contains the generation function, which is a deployable unit of code. The deployment is exposed as a REST API endpoint that other applications can access. To use the deployed AI service for inferencing, send a request to the REST API endpoint. The deployed AI service processes the request and returns a response.
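As a rough illustration of calling a deployed AI service over REST, the sketch below builds (but does not send) a POST request with the Python standard library. The URL, the bearer-token header, and the payload fields are placeholders chosen for this example; copy the actual endpoint and request format from your deployment's details in the deployment space.

```python
import json
import urllib.request


def build_inference_request(endpoint_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a deployed AI service endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            # Assumption: the endpoint accepts a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values for illustration only.
request = build_inference_request(
    "https://example.com/deployments/my-deployment/ai_service",
    token="<your-access-token>",
    payload={"question": "What is an online deployment?"},
)

# To send the request, pass it to urllib.request.urlopen(request)
# and parse the JSON body of the response.
```

Separating request construction from sending makes it easy to inspect the headers and body before you point the code at a production endpoint.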
Deploying from tools
Use the fast path to deploy directly from a visual tool or save your work in a deployment notebook to deploy your solution as an AI service.
Deploying with fast path
If you used a supported visual tool to build your application in watsonx.ai, you can deploy your solution directly from the tool. Deploying directly from the tool offers a fast path to deploying your solution as an AI service and creates an online deployment automatically. Because the deployment cannot be altered after it is created, make sure that the solution is fully built before you deploy it.
To deploy a complex solution such as RAG from Prompt Lab or to deploy an agent from the Agent Lab, follow these steps:
- Click Deploy from the workspace.
- Enter your deployment details, choose or create your deployment space, and click Create.
To deploy a RAG pattern from AutoAI, follow these steps:
- From the AutoAI experiment builder, choose the best performing pipeline and click Save as.
- Choose Retrieval and generation as the objective, and select the AI service asset type.
- Enable the option to promote and deploy the AI service to a deployment space.
- Choose an existing deployment space or create a new one and click Create and deploy.
This procedure automatically creates an online deployment. The deployment is created and opens in the target space so that you can test the deployment or access the endpoint for inferencing.
Deploying from auto-generated notebook
After building an application in a visual tool, if you want to customize the logic of your application before deployment, you can save your work in a deployment notebook. For example, you can edit and run an AutoAI RAG notebook if you want to add new documents to the vectorized database, then apply the optimized RAG pattern you discovered with the AutoAI tool to the updated index.
The deployment notebook contains the code to test, promote, and deploy an AI service. To deploy your application, you must save your work in a deployment notebook. You cannot use a standard notebook to deploy an AI service asset.
To save your solution in a deployment notebook:
- Click the Save icon and select Save as from the dropdown menu.
- In the Save your work dialog box, select Deployment notebook.
- In the Define details section, enter a name and an optional description for your deployment notebook.
- Click Save.
When you save your work in a deployment notebook, watsonx.ai automatically generates a notebook that contains the code to test, promote, and deploy an AI service. To create an online deployment for your AI service, run the cells in the deployment notebook.
Deploying from a project
Create an online or a batch deployment to deploy your application as an AI service. Online deployments are suitable for applications requiring high availability and real-time updates, while batch deployments are ideal for complex updates or when scheduled downtime is acceptable.
Creating online deployments
If you have saved an AI asset to a project as a deployable AI service asset, follow these steps to promote the AI service and create an online deployment:
- From the Assets tab of your project or deployment space, select Deploy for the AI service.
- Choose or create a deployment space.
- Select Online as the deployment type.
- Enter a name for your deployment and optionally enter a serving name, description, and tags.
- Click Create.
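If you script deployments rather than use the UI, the same details can be collected in one structure before submission. The sketch below is only an illustration of that idea: the function name, the field names, and the nesting are hypothetical and are not the confirmed request schema of the watsonx.ai API.

```python
from typing import Dict, List, Optional


def build_online_deployment_details(
    name: str,
    space_id: str,
    asset_id: str,
    serving_name: Optional[str] = None,
    description: Optional[str] = None,
    tags: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Collect the details from the steps above; only the name is mandatory in the form."""
    if not name.strip():
        raise ValueError("A deployment name is required.")

    # Hypothetical layout: required fields first, optional fields added only when set.
    details: Dict[str, object] = {
        "name": name,
        "space_id": space_id,
        "asset_id": asset_id,
        "deployment_type": "online",
    }
    if serving_name:
        details["serving_name"] = serving_name
    if description:
        details["description"] = description
    if tags:
        details["tags"] = list(tags)
    return details


details = build_online_deployment_details(
    name="rag-chat-service",
    space_id="<space-id>",
    asset_id="<ai-service-asset-id>",
    serving_name="rag_chat",
    tags=["rag", "demo"],
)
```

Leaving unset optional fields out of the structure, instead of sending empty values, keeps the intent of each deployment explicit.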
Creating batch deployments
If you have saved an AI asset to a project as a deployable AI service asset, follow these steps to promote the AI service and create a batch deployment:
- From your deployment space, go to the Assets tab.
- For your AI service asset in the assets list, click the Menu icon, and select Deploy.
- Select Batch as the deployment type.
- Enter a name for your deployment and optionally enter a serving name, description, and tags.
- Select a hardware specification:
  - Extra small: 1 CPU and 4 GB RAM
  - Small: 2 CPU and 8 GB RAM
  - Medium: 4 CPU and 16 GB RAM
  - Large: 8 CPU and 32 GB RAM
  - Extra large: 16 CPU and 64 GB RAM
- Click Create.
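When choosing a hardware specification for a batch deployment, it can help to encode the sizes listed above and pick the smallest one that covers your workload. The following sketch does that with the CPU and RAM values from the list. The `HardwareSpec` type and the `smallest_spec_for` helper are hypothetical names introduced for this example and are not part of any watsonx.ai library.

```python
from typing import List, NamedTuple, Optional


class HardwareSpec(NamedTuple):
    label: str
    cpus: int
    ram_gb: int


# Sizes as listed in the batch deployment steps, ordered from smallest to largest.
HARDWARE_SPECS: List[HardwareSpec] = [
    HardwareSpec("Extra small", 1, 4),
    HardwareSpec("Small", 2, 8),
    HardwareSpec("Medium", 4, 16),
    HardwareSpec("Large", 8, 32),
    HardwareSpec("Extra large", 16, 64),
]


def smallest_spec_for(cpus_needed: int, ram_gb_needed: int) -> Optional[HardwareSpec]:
    """Return the smallest listed specification that meets both requirements, or None."""
    for spec in HARDWARE_SPECS:
        if spec.cpus >= cpus_needed and spec.ram_gb >= ram_gb_needed:
            return spec
    return None


choice = smallest_spec_for(cpus_needed=3, ram_gb_needed=10)
```

In this example, a job that needs 3 CPUs and 10 GB of RAM does not fit Small (2 CPU, 8 GB), so the helper selects Medium.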