An AI service is a deployable unit of code that captures the logic of your generative AI use case. When your AI service is successfully deployed, you can use its endpoint for inferencing from your application.
Deploying generative AI applications with AI services
While Python functions are the traditional way to deploy machine learning assets, AI services offer a more flexible option for deploying code for generative AI applications, including capabilities such as streaming.
Unlike a standard Python function for deploying a predictive machine learning model, which requires input in a fixed schema, an AI service accepts flexible inputs and allows for customization.
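For example, a minimal sketch of the shape of an AI service follows: an outer Python function that returns a nested generate function. This assumes the get_json() helper on the request context from the ibm_watsonx_ai SDK; the payload fields (question, parameters) are hypothetical, and you define your own request format.

    def deployable_ai_service(context, **custom):
        # One-time setup (clients, models, indexes) runs here when the
        # AI service is deployed.

        def generate(context):
            # The request payload is free-form JSON, not a fixed schema.
            payload = context.get_json()
            question = payload.get("question", "")
            params = payload.get("parameters", {})  # hypothetical optional field

            # Call a foundation model or any custom logic here.
            answer = f"Echo: {question}"

            return {"body": {"answer": answer, "parameters_used": params}}

        return generate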
AI services offer a secure way to deploy your code. For example, credentials such as bearer tokens that are required for authentication are generated from task credentials by the service, and the token is made available to the AI service asset. You can use this token to get connection assets, download data assets, and more.
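As a sketch, the inner generate function might use the injected token to authenticate an API client and download a data asset. The get_token() helper and the data_assets.download() call come from the ibm_watsonx_ai SDK; the URL, space ID, and asset ID are placeholders.

    from ibm_watsonx_ai import APIClient, Credentials

    def generate(context):
        # The service injects a bearer token generated from your task credentials.
        token = context.get_token()

        # Authenticate an API client inside the deployment with that token.
        client = APIClient(Credentials(url="https://us-south.ml.cloud.ibm.com", token=token))
        client.set.default_space("<space_id>")

        # Example: download a data asset by ID (placeholder ID).
        client.data_assets.download("<data_asset_id>", filename="reference.csv")
        ...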
Deploying AI services visually
You can deploy your AI service directly to a deployment space by following a no-code approach from the user interface. Use this approach to create an online or batch deployment for your use case.
You can use the following visual tools to create a generative AI solution in watsonx.ai:
Prompt Lab
AutoAI
Agent Lab
When you use visual tools to create a generative AI solution for a complex use case, such as RAG, your solution is deployed as an AI service. You can deploy your solution directly from the user interface or export it as an editable Python notebook that deploys the AI service. The notebook automatically generates the code to create an AI service in a standard format and gives you a way to add functionality or make updates after testing. While the tools provide a user-friendly interface for creating and deploying AI services, coding offers more flexibility and customization options.
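As a sketch of the coding path, the following stores an AI service function (such as the deployable_ai_service sketch earlier) in a deployment space and creates an online deployment with the ibm_watsonx_ai SDK. The metadata names, software specification name, and run_ai_service call reflect that SDK's repository and deployments APIs; the credentials and IDs are placeholders.

    from ibm_watsonx_ai import APIClient, Credentials

    client = APIClient(
        Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<api_key>"),
        space_id="<space_id>",
    )

    # Store the AI service function in the deployment space repository.
    stored = client.repository.store_ai_service(
        deployable_ai_service,
        {
            client.repository.AIServiceMetaNames.NAME: "my-ai-service",
            client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID:
                client.software_specifications.get_id_by_name("runtime-24.1-py3.11"),
        },
    )
    ai_service_id = client.repository.get_ai_service_id(stored)

    # Create an online deployment that exposes an inferencing endpoint.
    deployment = client.deployments.create(
        ai_service_id,
        {
            client.deployments.ConfigurationMetaNames.NAME: "my-ai-service deployment",
            client.deployments.ConfigurationMetaNames.ONLINE: {},
        },
    )
    deployment_id = client.deployments.get_id(deployment)

    # Invoke the deployed AI service with a free-form JSON payload.
    response = client.deployments.run_ai_service(deployment_id, {"question": "What is watsonx.ai?"})
    print(response)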
When you build your generative AI application from the ground up, you can use an AI service to capture the programming logic of your application and deploy it with an endpoint for inferencing. For example, if you build a RAG application with frameworks such as LangChain or LlamaIndex, you can use an AI service to capture the logic for retrieving answers from the vector index, and then deploy the AI service.
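For example, a sketch of RAG logic captured in an AI service follows. The build_retriever and call_foundation_model helpers are hypothetical stand-ins for your framework code (for example, a LangChain retriever over your vector index and a watsonx.ai model call); retriever.invoke() follows the LangChain retriever interface.

    def deployable_rag_service(context, vector_index_id="<vector_index_id>", **custom):
        # One-time setup: connect to the vector index at deployment time.
        retriever = build_retriever(vector_index_id)  # hypothetical helper

        def generate(context):
            payload = context.get_json()
            question = payload.get("question", "")

            # Retrieve supporting passages from the vector index.
            docs = retriever.invoke(question)
            grounding = "\n\n".join(doc.page_content for doc in docs)

            # Compose a grounded prompt and answer with a foundation model.
            prompt = (
                "Answer using only this context:\n"
                f"{grounding}\n\nQuestion: {question}"
            )
            answer = call_foundation_model(prompt)  # hypothetical helper

            return {"body": {"answer": answer}}

        return generate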