Deploying generative AI assets
Last updated: Dec 03, 2024

Deploy generative AI assets to use them in production and monitor these deployed assets.

Types of deployable assets for generative AI applications

You can use watsonx.ai to deploy the following assets for your generative AI applications:

Deploying prompt templates

After you save a prompt template as a project asset, you can promote it to a deployment space. From the deployment space, you can deploy your prompt template to production and get the endpoint for inferencing.

If you have the watsonx.governance service, you can also capture and track the deployment details for a prompt template to meet governance requirements.

For more information, see Deploying a prompt template.
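As a sketch of what inferencing against a deployed prompt template can look like, the function below assembles an HTTP request for a text-generation call. The endpoint shape, API version, and variable names are assumptions for illustration; substitute the values from your own deployment space, and send the request with any HTTP client.

```python
# Illustrative sketch: build the pieces of a text-generation request for a
# deployed prompt template. REGION, DEPLOYMENT_ID, API_VERSION, and IAM_TOKEN
# are placeholders -- replace them with values from your deployment space.
REGION = "us-south"
DEPLOYMENT_ID = "your-deployment-id"
API_VERSION = "2024-05-01"
IAM_TOKEN = "your-bearer-token"

def build_inference_request(prompt_variables: dict) -> tuple[str, dict, dict]:
    """Return the URL, headers, and JSON body for one inference call.

    The URL pattern below is an assumption about the deployment endpoint;
    copy the exact endpoint shown for your deployment in the deployment space.
    """
    url = (
        f"https://{REGION}.ml.cloud.ibm.com/ml/v1/deployments/"
        f"{DEPLOYMENT_ID}/text/generation?version={API_VERSION}"
    )
    headers = {
        "Authorization": f"Bearer {IAM_TOKEN}",
        "Content-Type": "application/json",
    }
    # Prompt variables fill the {placeholders} defined in the prompt template.
    body = {"parameters": {"prompt_variables": prompt_variables}}
    return url, headers, body

url, headers, body = build_inference_request({"topic": "cloud security"})
# With an HTTP client such as requests, the call would look like:
# response = requests.post(url, headers=headers, json=body)
```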

Deploying AI services

An AI service is a deployable unit of code that you can use to capture the logic of your generative AI use cases, such as Retrieval Augmented Generation (RAG). When your AI services are successfully deployed, you can use the endpoint for inferencing from your application.

Although you can use the Prompt Lab to create and deploy saved prompt templates, prompt templates cannot capture the retrieval logic that Retrieval Augmented Generation (RAG) applications require. To deploy a RAG application, deploy it as an AI service and use the service endpoint for inferencing.

For more information, see Deploying AI services.
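To make "a deployable unit of code that captures RAG logic" concrete, here is a minimal, self-contained sketch of the kind of logic an AI service might wrap: retrieve relevant documents, then build a grounded prompt for the model. The toy keyword retriever and prompt format are illustrative assumptions, not the watsonx.ai AI-service interface itself.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Toy keyword retriever: rank documents by overlap with the query terms.
    A real RAG service would typically use a vector index instead."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(terms & set(d.text.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(query: str, corpus: list[Document]) -> str:
    """Assemble a grounded prompt from the retrieved context. In a deployed
    AI service, this string would be sent to a foundation model endpoint."""
    context = "\n".join(d.text for d in retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    Document("1", "Prompt templates are deployed from a deployment space."),
    Document("2", "AI services capture RAG logic as deployable code."),
]
print(generate("What captures RAG logic?", corpus))
```

Deploying this retrieve-then-generate logic as one unit is what lets the application call a single endpoint instead of orchestrating retrieval and inference itself.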

Deploying tuned models

After you tune a foundation model and save the tuned model as a project asset, you can promote it to a deployment space. From the deployment space, you can test the tuned model and get the endpoint for inferencing.

For more information, see Deploying a tuned foundation model.

Deploying custom foundation models

In addition to working with foundation models that are curated by IBM, you can upload and deploy your own foundation models. After the models are deployed and registered with watsonx.ai, create prompts that inference the custom models from the Prompt Lab.

Deploying a custom foundation model provides the flexibility for you to implement the AI solutions that are right for your use case.

For more information, see Deploying a custom foundation model.

Deploying foundation models on-demand

Deploy a foundation model on-demand on dedicated hardware to make the foundation model available for use in various applications and services as needed. By using this approach, you can access the capabilities of these powerful foundation models without the need for extensive computational resources. Foundation models that you deploy on-demand are hosted in a dedicated deployment space where you can use these models for inferencing.

For more information, see Deploying foundation models on-demand.

Learn more

Parent topic: Deploying assets with watsonx.ai Runtime
