Deploy a foundation model on-demand on dedicated hardware in just a few steps. IBM watsonx.ai provides a curated set of popular foundation models that you can deploy on-demand in a dedicated deployment space for the exclusive use of users with access to the space. The fast path for deploying a foundation model on-demand is to select and deploy a model from the Resource hub.
Before you begin
- You must set up or enable your task credentials to deploy foundation models on-demand. For more information, see Managing task credentials.
- Review supported foundation model architectures, deployment types, and other considerations for deploying a foundation model on-demand. For more information, see Deploying foundation models on-demand.
Deploying an on-demand foundation model
To deploy a foundation model on-demand from the Resource hub, complete the following steps:
- Open the Resource hub from the Navigation Menu.
  Tip: Choose the Deploy-on-demand filter to display a list of models that you can deploy on demand.
- From the Pay by the hour section, find the model that you want to deploy on demand.
- From the model details page, click Deploy.
- Click Deploy from the foundation model tile, and then choose the deployment space where you want the foundation model to be deployed.
  Important: You can deploy only one instance of a foundation model on demand in a deployment space. If the selected model is already deployed, a link to the existing deployment is available in the Details section. For more information, see Troubleshooting watsonx.ai Runtime.
- Click Create.
After the model is deployed, you can prompt the foundation model from the Prompt Lab or watsonx.ai API.
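As a rough illustration of prompting the deployed model through the watsonx.ai API, the following sketch builds (but does not send) a text generation request. The endpoint path `/ml/v1/deployments/{deployment_id}/text/generation`, the `version` query value, the regional host, and the `max_new_tokens` parameter are assumptions that should be checked against the watsonx.ai API reference. `build_generation_request` is a hypothetical helper, not part of any IBM SDK.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def build_generation_request(base_url, deployment_id, token, prompt,
                             max_new_tokens=100, version="2024-01-29"):
    """Build an unsent POST request that prompts a deployed foundation model.

    The path and payload shape are assumptions; confirm them in the
    watsonx.ai API reference before use.
    """
    url = (
        f"{base_url.rstrip('/')}/ml/v1/deployments/"
        f"{quote(deployment_id, safe='')}/text/generation?"
        + urlencode({"version": version})
    )
    payload = {
        "input": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Placeholder values: replace with your own host, deployment ID, and IAM token.
request = build_generation_request(
    "https://us-south.ml.cloud.ibm.com",
    "my-deployment-id",
    "<IAM_TOKEN>",
    "Summarize the benefits of dedicated model hosting.",
)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```

Building the request separately from sending it makes it easy to inspect the URL and body before any billed inference call is made.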
Testing the deployment
Follow these steps to test a foundation model that is deployed on-demand:
- In your deployment space, open the Deployments tab and click the deployment name.
- Click the Test tab to input prompt text and get a response from the deployed asset.
- Enter test data in one of the following formats, depending on the type of asset that you deployed:
  a. Text: Enter text input data to generate a block of text as output.
  b. Stream: Enter text input data to generate a stream of text as output.
  c. JSON: Enter JSON input data to generate output in JSON format.
- Click Generate to get results that are based on your prompt.
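Before pasting input into the Test tab, it can help to check it locally. The sketch below is a hypothetical helper, `prepare_test_input`, that mirrors the three formats listed above: it rejects empty prompts for Text and Stream, and confirms that JSON input parses. The requirement that JSON input be an object, and the `{"input": ...}` shape in the usage example, are assumptions rather than documented rules.

```python
import json


def prepare_test_input(mode, raw):
    """Validate input locally before entering it in the Test tab.

    mode is "text", "stream", or "json". Returns cleaned text for
    text/stream, or pretty-printed JSON for json.
    """
    mode = mode.lower()
    if mode in ("text", "stream"):
        cleaned = raw.strip()
        if not cleaned:
            raise ValueError("prompt text must not be empty")
        return cleaned
    if mode == "json":
        # json.loads raises json.JSONDecodeError (a ValueError) on bad JSON.
        parsed = json.loads(raw)
        if not isinstance(parsed, dict):
            # Assumption: the Test tab expects a JSON object.
            raise ValueError("JSON input must be an object")
        return json.dumps(parsed, indent=2)
    raise ValueError(f"unknown mode: {mode!r}")


text_input = prepare_test_input("text", "  Write a haiku about rain.  ")
json_input = prepare_test_input("json", '{"input": "Write a haiku about rain."}')
```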
Managing the deployment
Access, update, scale, or delete your foundation model that is deployed on-demand from the Resource hub.
Accessing the deployed model
You can access the foundation model that is deployed on-demand from the Resource hub by using the deployment link.
Follow these steps to access the deployment link from the Resource hub:
- From the navigation menu, go to the Resource hub.
- From the Foundation model catalog in the Resource hub, select the model that you deployed.
- In the Details section of the model details page, click the Deployment link.
Alternatively, open the deployment details page to view information about your on-demand deployment, such as the deployment ID, software specification, and associated asset.
Updating the deployment
Update the details of your foundation model that is deployed on-demand, such as its name, description, and tags. For more information, see Updating a deployment.
Scaling the deployment
You can deploy only one instance of a foundation model on-demand in a deployment space. To handle increased demand, you can scale the deployment by creating additional copies. For more information, see Scaling a deployment.
Deleting the deployment
When your work with the foundation model that is deployed on-demand is complete, delete the deployment to stop the billing charges. For more information, see Deleting a deployment.
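Because billing is hourly and continues until the deployment is deleted, a quick estimate of the accrued charge can be useful. The sketch below assumes simple proportional billing by elapsed time. The hourly rate shown is a made-up placeholder, and actual metering rules (rounding, minimum increments) may differ, so refer to the published hourly billing rates for real figures. `estimate_charge` is a hypothetical helper.

```python
from datetime import datetime, timezone


def estimate_charge(created_at, deleted_at, hourly_rate):
    """Estimate the charge for a deployment that ran from created_at to deleted_at.

    Assumes charges accrue in proportion to elapsed time; this is an
    illustration, not the actual metering logic.
    """
    if deleted_at < created_at:
        raise ValueError("deleted_at must not be earlier than created_at")
    hours = (deleted_at - created_at).total_seconds() / 3600
    return round(hours * hourly_rate, 2)


start = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
end = datetime(2025, 1, 6, 17, 30, tzinfo=timezone.utc)
# 8.5 hours at a placeholder rate of 5.00 per hour
print(estimate_charge(start, end, 5.00))  # 42.5
```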
Learn more
- Supported foundation models
- Prompt Lab
- Deploying foundation models on demand by using the REST API
- Hourly billing rates for deploy on demand models