Deploying AI services with direct coding
Last updated: Nov 08, 2024

You can build a customized AI service that is tailored to your generative AI application from the ground up. For example, if you are deploying an asset that uses retrieval-augmented generation (RAG), you can capture the logic for retrieving answers from the grounding documents in the AI service.
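As a toy illustration of the kind of retrieval logic a RAG-style AI service might capture, the sketch below scores grounding documents by keyword overlap with the question and assembles a grounded prompt. A production service would typically query a vector index or search service instead; the document snippets and function names here are illustrative assumptions, not part of the product.

```python
# Toy retrieval step for a RAG-style AI service: rank grounding documents
# by keyword overlap with the question, then build a grounded prompt.
DOCUMENTS = [
    "Refunds are processed within 14 days of a return request.",
    "Support is available on weekdays from 9am to 5pm.",
]

def retrieve(question, documents, top_k=1):
    # Score each document by how many question words it shares.
    words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(question, documents):
    # Join the best-matching documents into the grounding context.
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How fast are refunds processed?", DOCUMENTS))
```

In a deployed AI service, this retrieval-and-prompt step would run inside the generation function, with the resulting prompt passed to a foundation model for inferencing.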

Process overview

The following graphic illustrates the process of coding AI services.

You can create a notebook that contains the AI service and connections within the project. The AI service captures the logic of your RAG application and contains the generation function, which is a deployable unit of code. The generation function is promoted to the deployment space, which is used to create a deployment. The deployment is exposed as a REST API endpoint that can be accessed by other applications. You can send a request to the REST API endpoint to use the deployed AI service for inferencing. The deployed AI service processes the request and returns a response.
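The generation function described above can be sketched as a nested Python function: the outer function runs once when the service is deployed, and the inner function handles each inference request. The parameter names and the request and response shapes below are illustrative assumptions, not the exact watsonx.ai contract; check the product documentation for the required signature.

```python
# Minimal sketch of an AI service generation function -- the deployable
# unit of code. The outer function performs one-time setup; the inner
# `generate` function is invoked for each inference request.
def deployable_ai_service(context=None, **parameters):
    # One-time setup (load prompts, open connections, etc.) goes here.
    greeting = parameters.get("greeting", "Answer")

    def generate(payload):
        # Per-request logic: read the question, produce a response.
        question = payload.get("question", "")
        answer = f"{greeting}: you asked '{question}'"
        return {"body": {"answer": answer}}

    return generate

# Test the coding logic locally by calling the function directly,
# before packaging and deploying it.
service = deployable_ai_service(greeting="Reply")
response = service({"question": "What is RAG?"})
print(response["body"]["answer"])
```

Keeping the per-request logic inside the inner function is what makes the unit deployable: the deployment invokes the returned function for every REST request.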

Figure: Direct coding use case

Tasks for creating and deploying AI services

Here are the steps that you must follow to create, deploy, and manage AI services:

  1. Create AI service: Define an AI service in a notebook by using Python. The AI service must meet specific requirements for deploying as an AI service.
  2. Test AI service: Test the coding logic of your AI service locally.
  3. Create AI service assets: After creating and testing the AI service, you must package the AI service as a deployable asset.
  4. Deploy AI service assets: Deploy the AI service asset as an online or a batch deployment.
  5. Test AI service deployment: Test your deployed AI service for online inferencing or batch scoring.
  6. Manage AI services: Access and update the deployment details. Scale or delete the deployment from the user interface or programmatically.
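After deployment, the service is reached through its REST API endpoint (step 5 above). The sketch below builds an inference request using only the Python standard library; the endpoint path, placeholder token, and payload shape are illustrative assumptions, so substitute the values shown on your deployment's details page and your own authentication flow.

```python
import json
import urllib.request

# Sketch of calling a deployed AI service's REST endpoint for online
# inferencing. The URL, token, and payload fields are placeholders.
endpoint = "https://example.com/ml/v4/deployments/<deployment-id>/ai_service"
token = "<bearer-token>"  # obtain via your platform's authentication flow

payload = {"question": "Which documents mention service-level agreements?"}
request = urllib.request.Request(
    endpoint,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    },
    method="POST",
)

# Uncomment to send the request once the endpoint and token are filled in:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
print(request.get_full_url(), request.get_method())
```

The same request can be sent from any HTTP client or application; the deployed AI service processes the request body and returns the generated response.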

Sample notebooks for creating and deploying AI services

To learn how to create and deploy AI services programmatically, see the following sample notebooks:

Sample notebooks for AI services

Sample name: Use watsonx, and meta-llama/llama-3-2-11b-vision-instruct to run as an AI service
Framework: Python
Techniques demonstrated: Set up the environment; Create an AI service; Test the AI service's function locally; Deploy the AI service; Run the AI service

Sample name: Use watsonx, Elasticsearch, and LangChain to answer questions (RAG)
Framework: LangChain
Techniques demonstrated: Set up the environment; Download the test dataset; Define the foundation model on watsonx; Set up connectivity information to Elasticsearch; Generate a retrieval-augmented response to a question; Create an AI service; Test the AI service's function locally; Deploy the AI service

Sample name: Use watsonx and meta-llama/llama-3-1-70b-instruct to create AI service
Framework: LangGraph
Techniques demonstrated: Set up the environment; Create an AI service; Test the AI service's function locally; Deploy the AI service; Run the AI service

Learn more

Parent topic: Deploying AI services
