Deploying foundation models on demand (fast path)
Last updated: Dec 05, 2024

Deploy a foundation model on-demand on dedicated hardware in just a few steps. IBM watsonx.ai provides a curated set of popular foundation models that you can deploy on-demand in a dedicated deployment space for the exclusive use of users with access to the space. The fast path for deploying a foundation model on-demand is to select and deploy a model from the Resource hub.

Before you begin

  1. You must set up or enable your task credentials to deploy foundation models on-demand. For more information, see Managing task credentials.
  2. Review supported foundation model architectures, deployment types, and other considerations for deploying a foundation model on-demand. For more information, see Deploying foundation models on-demand.


Deploying an on-demand foundation model

To deploy a foundation model on-demand from the Resource hub, complete the following steps:

  1. Open the Resource hub from the Navigation Menu.

    Tip:

    Choose the Deploy-on-demand filter to display a list of models that you can deploy on demand.

  2. From the Pay by the hour section, find the model that you want to deploy on demand.

    Screenshot showing the list of foundation models available for on-demand deployment in the Resource hub

  3. From the model details page, click Deploy.

    Screenshot showing the model details page

  4. Click Deploy from the foundation model tile, and then choose the deployment space in which to deploy the foundation model.

    Screenshot shows how to create the deployment

    Important:

    You can deploy only one instance of a foundation model on demand in a deployment space. If the selected model is already deployed, a link to the existing deployment is shown in the Details section. For more information, see Troubleshooting watsonx.ai Runtime.

  5. Click Create.

After the model is deployed, you can prompt the foundation model from the Prompt Lab or by using the watsonx.ai API.
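
For the API route, the following Python sketch shows how a prompt request to the deployment might be assembled. It is an illustration under stated assumptions, not the documented client: the regional host, the endpoint path `/ml/v1/deployments/{deployment_id}/text/generation`, the `version` date, the request body fields, and the helper name `build_generation_request` are all assumptions to check against the watsonx.ai API reference. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed values: confirm the host, path, and version date in the watsonx.ai API reference.
BASE_URL = "https://us-south.ml.cloud.ibm.com"
API_VERSION = "2023-05-29"


def build_generation_request(deployment_id, prompt, token, max_new_tokens=100):
    """Build (but do not send) a text-generation request for a deployment."""
    query = urlencode({"version": API_VERSION})
    url = f"{BASE_URL}/ml/v1/deployments/{deployment_id}/text/generation?{query}"
    # Assumed body shape: the prompt text plus optional generation parameters.
    body = {
        "input": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Hypothetical deployment ID and token, for illustration only.
request = build_generation_request(
    "my-deployment-id", "Write a haiku about the sea.", "MY_IAM_TOKEN"
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` with a valid bearer token in place of the placeholder.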

Testing the deployment

Follow these steps to test a foundation model that is deployed on-demand:

  1. In your deployment space, open the Deployments tab and click the deployment name.

  2. Click the Test tab to input prompt text and get a response from the deployed asset.

  3. Enter test data in one of the following formats, depending on the type of asset that you deployed:

    a. Text: Enter text input data to generate a block of text as output.
    b. Stream: Enter text input data to generate a stream of text as output.
    c. JSON: Enter JSON input data to generate output in JSON format.

    Testing foundation model deployed on-demand

  4. Click Generate to get results that are based on your prompt.
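
The difference between the Text and Stream formats can be illustrated with a small Python sketch. It assumes that streamed output arrives as server-sent events whose `data:` lines carry JSON with a `results[0].generated_text` field. That response shape, the sample events, and the helper name `join_stream` are assumptions for illustration and are not documented on this page.

```python
import json


def join_stream(lines):
    """Concatenate generated text from server-sent-event lines.

    Assumes each payload line looks like:
    data: {"results": [{"generated_text": "..."}]}
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip event names, ids, and blank separator lines
        payload = json.loads(line[len("data:"):])
        for result in payload.get("results", []):
            parts.append(result.get("generated_text", ""))
    return "".join(parts)


# Made-up sample events, for illustration only.
sample = [
    "id: 1",
    "event: message",
    'data: {"results": [{"generated_text": "Hello"}]}',
    "",
    "id: 2",
    "event: message",
    'data: {"results": [{"generated_text": ", world"}]}',
]
print(join_stream(sample))  # prints: Hello, world
```

With the Text format the same output would arrive as one block, so no joining step is needed.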

Managing the deployment

Access, update, scale, or delete your foundation model that is deployed on-demand from the Resource hub.

Accessing the deployed model

You can access the foundation model that is deployed on-demand from the Resource hub by using the deployment link.

Follow these steps to access the deployment link from the Resource hub:

  1. From the navigation menu, go to the Resource hub.

  2. From the Foundation model catalog in the Resource hub, select the model that you deployed.

  3. In the Details section of the model details page, click the Deployment link.

    Accessing the model that is deployed on-demand from the Resource hub

Alternatively, you can view details about your on-demand deployment, such as the deployment ID, software specification, and associated asset, on the deployment details page in the deployment space.

Accessing the deploy-on-demand model from the deployment space

Updating the deployment

You can update details of your foundation model that is deployed on-demand, such as its name, description, and tags. For more information, see Updating a deployment.

Restriction: Replacing the asset is not supported for foundation models that are deployed on-demand.

Updating the deploy-on-demand model from the deployment space

Scaling the deployment

You can deploy only one instance of a foundation model on-demand in a deployment space. To handle increased demand, you can scale the deployment by creating additional copies. For more information, see Scaling a deployment.


Deleting the deployment

When your work with the foundation model that is deployed on-demand is complete, delete the deployment to stop the billing charges. For more information, see Deleting a deployment.
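
If you prefer to script the cleanup, the following Python sketch shows how a delete request might be assembled. It is a sketch under stated assumptions: the regional host, the endpoint path `/ml/v4/deployments/{deployment_id}`, the `version` date, the `space_id` query parameter, and the helper name `build_delete_request` are assumptions to confirm in the watsonx.ai Runtime API reference. The sketch only builds the request; it does not send it.

```python
import urllib.request
from urllib.parse import urlencode

# Assumed values: confirm the host, path, and version date in the API reference.
BASE_URL = "https://us-south.ml.cloud.ibm.com"
API_VERSION = "2020-09-01"


def build_delete_request(deployment_id, space_id, token):
    """Build (but do not send) a request that deletes a deployment in a space."""
    query = urlencode({"version": API_VERSION, "space_id": space_id})
    url = f"{BASE_URL}/ml/v4/deployments/{deployment_id}?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )


# Hypothetical IDs and token, for illustration only.
request = build_delete_request("my-deployment-id", "my-space-id", "MY_IAM_TOKEN")
print(request.get_method(), request.full_url)
```

After you send such a request, check the Deployments tab of the deployment space to confirm that the deployment is gone and that billing has stopped.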

