When you use the Prompt Lab to create a generative AI application that uses Retrieval Augmented Generation (RAG), you can deploy your application as an AI service by using a fast path or a deployment notebook.
Process overview
The following graphic illustrates two methods to deploy an AI service by using the Prompt Lab:
- By using a fast path to directly promote and deploy.
- By using a deployment notebook.
You can create a RAG application in the Prompt Lab by adding a connection to a vector index. To deploy the AI service, you can use the fast path to directly promote the AI service to a deployment space and create an online deployment.
Alternatively, you can save your work in a deployment notebook, which you can use to customize the code for your use case. The deployment notebook contains auto-generated code to create and deploy an AI service. The AI service captures the logic for running a similarity search to retrieve documents that match the query and for inferencing the model with the query result. The AI service also contains the generation function, which is a deployable unit of code. The generation function is promoted to the deployment space, where a deployment is created.
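The retrieve-then-generate logic that an AI service captures can be illustrated with a minimal, self-contained sketch. This is not the code that the deployment notebook generates: the toy keyword embedding, the in-memory document list, the stub model, and every name used here (rag_ai_service, generate, toy_embed, toy_model) are hypothetical stand-ins chosen for illustration. The sketch only shows the general shape, an outer function that prepares the index once and returns an inner generation function that runs a similarity search and passes the result to a model.

```python
import math
import string

# Hypothetical stand-ins: a real service would query a vector index
# and call a foundation model instead of using these toy helpers.
KEYWORDS = ("notebook", "endpoint", "batch")

DOCUMENTS = [
    "Saving from Prompt Lab produces a deployment notebook.",
    "An online deployment is exposed as a REST API endpoint.",
    "A batch deployment scores many inputs in one job.",
]


def toy_embed(text):
    """Toy embedding: one dimension per keyword, counting occurrences."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    words = cleaned.split()
    return [float(words.count(keyword)) for keyword in KEYWORDS]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_model(prompt):
    """Stub model: echoes the grounded prompt instead of generating text."""
    return "[stub answer based on] " + prompt


def rag_ai_service(top_k=1):
    """Outer function: builds the index once and returns the generation function."""
    index = [(doc, toy_embed(doc)) for doc in DOCUMENTS]

    def generate(request):
        """Generation function: similarity search, then model inference."""
        question = request["question"]
        query_vec = toy_embed(question)
        ranked = sorted(
            index, key=lambda item: cosine(query_vec, item[1]), reverse=True
        )
        passages = [doc for doc, _ in ranked[:top_k]]
        prompt = "Context: " + " ".join(passages) + " Question: " + question
        return {"answer": toy_model(prompt), "sources": passages}

    return generate


generate = rag_ai_service()
result = generate({"question": "What is a batch deployment?"})
```

Returning the inner function from the outer one keeps one-time setup (here, embedding the documents) separate from per-request work, which is the part that would run each time the deployed service receives a query.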
The deployment is exposed as a REST API endpoint that other applications can access. To use the deployed AI service for inferencing, send a request to the REST API endpoint. The deployed AI service processes the request and returns a response.
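As a rough illustration of calling such an endpoint, the following sketch builds a JSON POST request with the Python standard library. Several details are assumptions rather than facts from this topic: the endpoint URL is a placeholder that you would replace with the value shown in your deployment details, bearer-token authentication is assumed, and the payload shape ({"question": ...}) is hypothetical because the actual request schema depends on how the AI service is written. The helper name build_inference_request is also hypothetical.

```python
import json
import urllib.request


def build_inference_request(endpoint_url, bearer_token, payload):
    """Build (but do not send) a JSON POST request for a deployed service."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            # Assumption: the endpoint accepts a bearer token.
            "Authorization": "Bearer " + bearer_token,
        },
        method="POST",
    )


# Placeholder values; replace them with your own endpoint, token, and payload.
request = build_inference_request(
    "https://example.com/deployments/DEPLOYMENT_ID/ai_service",
    "YOUR_ACCESS_TOKEN",
    {"question": "What is a batch deployment?"},
)

# Sending is left commented out because the placeholder URL is not a real service.
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the method, headers, and body before any network call is made.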
Tasks for deploying AI services from Prompt Lab
Here are the steps that you must follow to create, deploy, and manage AI services:
- Choose a deployment method: You can create and deploy AI services from the Prompt Lab by using a fast path or a deployment notebook. Choose a method that is best suited for your use case.
- Test the AI service deployment: Test your deployed AI service for online inferencing or batch scoring.
- Manage AI services: Access and update deployment details. Scale or delete the deployment from the user interface or programmatically.
Deploying an AI service with fast path
You can use the Prompt Lab to build a RAG application by chatting with documents and providing a vector index. When you use the fast path to deploy your work as an AI service, the logic for your RAG application is automatically captured in an AI service asset, and an online deployment is created for the asset.
For more information, see Deploying an AI service with fast path.
Deploying an AI service with a deployment notebook
To customize the programming logic of your RAG application, you can use the Prompt Lab to save your work in a deployment notebook. When you do, watsonx.ai automatically generates a notebook that captures the logic of your RAG application in an AI service.
The deployment notebook contains auto-generated code to promote your AI service asset to a deployment space and create a deployment for the asset. You can edit the deployment notebook to fit your use case, for example to create a batch deployment for the AI service asset instead of an online deployment.
For more information, see Deploying an AI service with notebook.