You can build a customized AI service from the ground up that is tailored to your generative AI application. For example, if you are deploying an asset that uses retrieval augmented generation (RAG), you can capture the logic for retrieving answers from the grounding documents in the AI service.
Methods for deploying AI services with code
You can use the following methods for coding and deploying your AI services:
- Coding and deploying AI services manually
You can create a notebook in your project that contains the AI service and its connections. The AI service captures the logic of your RAG application and contains the generation function, which is a deployable unit of code. The generation function is promoted to a deployment space, where it is used to create a deployment. The deployment is exposed as a REST API endpoint that other applications can access. When you send a request to the endpoint for inferencing, the deployed AI service processes the request and returns a response.
For more information, see Coding and deploying AI services manually.
- Coding and deploying AI services with templates
You can use predefined templates to deploy your AI services in watsonx.ai. A template supplies a prebuilt structure, configuration, and set of tools, so that developers can focus on the core logic of their application rather than starting from scratch. This simplifies the process of deploying AI services, reduces the risk of errors, and improves the efficiency and consistency of AI development and deployment.
For more information, see Coding and deploying AI services with templates.
- Coding and deploying AI services with CPDCTL
CPDCTL is a command-line tool for deploying and managing AI services on the IBM Cloud Pak for Data (CPD) platform. It provides a streamlined way to deploy AI services, reducing manual configuration and the risk of errors. To deploy an AI service with CPDCTL, you first prepare the environment by installing CPDCTL and configuring environment variables. You then run a series of CPDCTL commands that create an AI service instance, upload the code for the AI service, and deploy the AI service to make it available for use. The deployed AI service can then be accessed through a REST API endpoint.
For more information, see Coding and deploying AI services with CPDCTL.
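The structure described for the manual method, an AI service that wraps a deployable generation function, can be sketched in plain Python. This is a minimal illustration and not the watsonx.ai API: `StubContext`, `ai_service`, the `question` payload key, and the word-overlap retrieval are all hypothetical stand-ins, and a real service would call a foundation model where the comment indicates.

```python
from typing import Any, Callable, Dict, List


class StubContext:
    """Hypothetical stand-in for the request context passed to the service."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        self._payload = payload

    def get_json(self) -> Dict[str, Any]:
        # Return the JSON body of the incoming request.
        return self._payload


def ai_service(
    context: StubContext, documents: List[str]
) -> Callable[[StubContext], Dict[str, Any]]:
    """Outer function: holds setup logic and returns the generation function."""

    def retrieve(question: str) -> str:
        # Toy retrieval (assumption): pick the grounding document that
        # shares the most words with the question.
        words = set(question.lower().split())
        return max(documents, key=lambda doc: len(words & set(doc.lower().split())))

    def generate(context: StubContext) -> Dict[str, Any]:
        """Generation function: the deployable unit that handles one request."""
        question = context.get_json()["question"]
        passage = retrieve(question)
        # A real service would prompt a foundation model with the passage here.
        return {"body": {"question": question, "grounding": passage}}

    return generate


# Local usage: build the service once, then call it once per request.
grounding_docs = [
    "CPDCTL is a command-line tool",
    "Templates provide a pre-built foundation",
]
generate = ai_service(StubContext({}), grounding_docs)
response = generate(StubContext({"question": "What is CPDCTL"}))
```

Keeping retrieval inside the outer function means that the setup runs once, while the returned `generate` function is the only part that is invoked for each request to the endpoint.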
Choosing the right method for deployment
There are three approaches to deploying AI services: manual coding, developer templates, and CPDCTL. Each approach has trade-offs, and the right choice depends on the requirements of your project. Developer templates are suitable for simple deployments with limited customization needs, while manual coding is suitable for complex deployments with high customization needs. CPDCTL is suitable for deployments that require simplicity and scalability.
The following table compares the three approaches for deploying AI services with code:
Approach | Ease of use | Customization | Scalability | Time-to-market |
---|---|---|---|---|
Manual coding | Difficult | Full | High | Slow |
Developer templates | Easy | Limited | Limited | Fast |
CPDCTL | Easy | Limited | High | Fast |