After you save a prompt template as a project asset, you can promote it to a deployment space. From the deployment space, you can deploy your prompt template to production and get the endpoint for inferencing.
If you have the watsonx.governance service, you can also capture and track the deployment details for a prompt template to meet governance requirements.
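As an illustration of what inferencing against that endpoint can look like, the following sketch assembles a text-generation request for a deployed prompt template. The region URL, deployment ID, and the `doc` prompt variable are hypothetical placeholders, and the endpoint path is based on the watsonx.ai REST API's deployment text-generation route; verify both against your deployment details before use.

```python
import json

# Hypothetical values: substitute your region URL and the deployment ID
# that you get after you deploy the prompt template.
WATSONX_URL = "https://us-south.ml.cloud.ibm.com"
DEPLOYMENT_ID = "your-deployment-id"

# Inference endpoint for a deployed prompt template (text generation).
endpoint = (
    f"{WATSONX_URL}/ml/v1/deployments/{DEPLOYMENT_ID}"
    "/text/generation?version=2024-05-01"
)

# The request body supplies values for the template's variables.
# "doc" is a hypothetical variable defined in the prompt template.
payload = {
    "parameters": {
        "prompt_variables": {"doc": "Text to summarize goes here."},
        "max_new_tokens": 200,
    }
}

body = json.dumps(payload)
print(endpoint)
print(body)
```

A real request would send `body` as JSON with an IAM bearer token in the `Authorization` header; the sketch stops at constructing the request.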
An AI service is a deployable unit of code that captures the logic of your generative AI use cases, such as Retrieval Augmented Generation (RAG). When your AI service is successfully deployed, you can use its endpoint for inferencing from your application.
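The shape of such a deployable unit of code can be sketched as follows. This is a minimal, illustrative take on the AI service pattern, in which an outer function captures setup and returns a request handler; the keyword lookup and the generated answer are stubs standing in for the real retrieval and foundation-model calls of a RAG service.

```python
# A minimal sketch of the AI service pattern: an outer function that
# captures setup (credentials, indexes, clients) and returns a handler
# that is invoked per inference request.
def deployable_ai_service(context, **custom):
    # Stub knowledge base; a real RAG service would query a vector index.
    knowledge_base = {
        "watsonx": "watsonx.ai is a studio for building AI solutions.",
    }

    def generate(context):
        payload = context.get_json()  # request body from the caller
        question = payload["question"]
        # Stub retrieval step: look up grounding text by keyword.
        grounding = next(
            (v for k, v in knowledge_base.items() if k in question.lower()),
            "",
        )
        # Stub generation step: a real service would send the grounded
        # prompt to a foundation model.
        answer = f"Based on: {grounding!r}"
        return {"body": {"answer": answer}}

    return generate


# Simulate one request locally with a tiny fake context object.
class _FakeContext:
    def get_json(self):
        return {"question": "What is watsonx?"}


handler = deployable_ai_service(None)
response = handler(_FakeContext())
print(response)
```

The nested-function structure is the point of the sketch: setup runs once at deployment time, while the returned handler runs on every inference request.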
AI services are created automatically when you deploy a complex generative AI solution with visual tools such as the Agent Lab, Prompt Lab, or AutoAI. For example, if you use the Agent Lab or Prompt Lab to build and deploy your agentic or generative AI solution, the tool automatically detects the complexity of the solution and presents the correct type of deployment asset.
Although you can use prompt templates to create and deploy saved prompts in the Prompt Lab, you cannot use them to deploy generative AI applications for complex use cases such as RAG.
If you choose to code a generative AI application that is based on these complex use cases, you must create an AI service and ensure that it meets certain requirements. You can deploy an AI service programmatically with the watsonx.ai REST API or the Python client library. After you deploy the AI service, you can use the endpoint for inferencing.
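As a sketch of the programmatic route, the following Python builds the kind of request body that creates an online deployment for an AI service asset through the REST API. The space ID and asset ID are placeholders, and the field names follow the v4 deployments resource as a best-effort assumption; check them against the current API reference before relying on them.

```python
import json

# Hypothetical identifiers: replace with your deployment space ID and
# the ID of the AI service asset that you created in that space.
SPACE_ID = "your-space-id"
AI_SERVICE_ASSET_ID = "your-ai-service-asset-id"

# Request body for POST {url}/ml/v4/deployments, sketched from the
# watsonx.ai REST API's v4 deployments resource.
deployment_request = {
    "name": "rag-ai-service-deployment",
    "space_id": SPACE_ID,
    "asset": {"id": AI_SERVICE_ASSET_ID},
    # "online" requests a real-time endpoint for inferencing.
    "online": {},
}

print(json.dumps(deployment_request, indent=2))
```

The Python client library wraps the same operation, so the fields above map onto the metadata you would pass to its deployment-creation call.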
After you tune a foundation model and save the tuned model as a project asset, you can promote it to a deployment space. From the deployment space, you can test the tuned model and get the endpoint for inferencing.
In addition to working with foundation models that are curated by IBM, you can upload and deploy your own foundation models. After the models are deployed and registered with watsonx.ai, you can create prompts that inference the custom models from the Prompt Lab.
Deploying a custom foundation model provides the flexibility for you to implement the AI solutions that are right for your use case.
Deploy a foundation model on demand on dedicated hardware to make the foundation model available for use in various applications and services as needed. With this approach, you can access the capabilities of these powerful foundation models without the need for extensive computational resources. Foundation models that you deploy on demand are hosted in a dedicated deployment space, where you can use them for inferencing.