Deploy a tuned model so you can add it to a business workflow and start to use foundation models in a meaningful way.
Before you begin
- The tuning experiment that you used to tune the foundation model must be finished. For more information, see Tuning a foundation model.
- You must set up your task credentials by generating an API key. For more information, see Managing task credentials.
Deploy a tuned model
To deploy a tuned model, complete the following steps:
- From the navigation menu, expand Projects, and then click All projects.
- Click to open your project.
- From the Assets tab, click the Experiments asset type.
- Click to open the tuning experiment for the model you want to deploy.
- From the Tuned models list, find the completed tuning experiment, and then click New deployment.
- Name the tuned model.
  The name of the tuning experiment is used as the tuned model name if you don't change it. The name is followed by a number in parentheses that counts the deployments. The number starts at one and is incremented by one each time you deploy this tuning experiment.
- Optional: Add a description and tags.
- For the Deployment container, choose one of the following options:
  - This project: Deploys the tuned model and adds it to your project where you can test the tuned model. You can promote the tuned model deployment to a deployment space at any time. Choose this option if you want to do more testing of the tuned model before the model is used in production.
  - Deployment space: Promotes the tuned model to a deployment space and deploys the tuned model. A deployment space is separate from the project where you create the asset. This separation enables you to promote assets from multiple projects to a space, and deploy assets to more than one space. Choose this option when the tuned model is ready to be promoted for production use.
    For more information about this option, see Using a deployment space.
- Tip: Select the option to view the deployment after it is created. Otherwise, you need to take more steps to find your deployed model.
- Click Deploy.
After the tuned model is deployed, a copy of the tuned model is stored in your project as a model asset.
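The default naming rule from the Name the tuned model step can be sketched as a small function. This is only an illustration of the rule as described: the function name, its parameters, and the single space before the parentheses are assumptions, not the product's actual implementation.

```python
def default_deployment_name(experiment_name: str, previous_deployments: int) -> str:
    """Return the default name for the next deployment of a tuning experiment.

    previous_deployments is how many times this experiment was already deployed.
    Assumption: the counter is appended as " (n)" with one leading space.
    """
    if previous_deployments < 0:
        raise ValueError("previous_deployments must be zero or greater")
    # The counter starts at one and grows by one with each deployment.
    return f"{experiment_name} ({previous_deployments + 1})"


print(default_deployment_name("Ticket classifier", 0))  # Ticket classifier (1)
print(default_deployment_name("Ticket classifier", 2))  # Ticket classifier (3)
```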
Using a deployment space
When you choose a deployment space as the container for your tuned model, the tuned model is promoted to a deployment space, and then deployed. A deployment space is associated with the following services that it uses to deploy assets:
- watsonx.ai Runtime: A product with tools and services you can use to build, train, and deploy machine learning models. This service hosts your tuned model.
- IBM Cloud Object Storage: A secure platform for storing structured and unstructured data. Your deployed model asset is stored in a Cloud Object Storage bucket that is associated with your project.
For more information, see Deployment spaces.
To use a deployment space, complete the following steps:
- After you choose Deployment space as the deployment container, in the Target deployment space field, choose a deployment space.
  The deployment space must be associated with a machine learning instance that is in the same account as the project where the tuned model was created.
  If you don't have a deployment space, choose Create a new deployment space, and then follow the steps in Creating deployment spaces.
- In the Deployment serving name field, add a label for the deployment.
  The serving name is used in the URL for the API endpoint that identifies your deployment. Adding a name is helpful because the human-readable name that you add replaces the long, system-generated ID that is assigned otherwise.
  The serving name also abstracts the deployment from its service instance details. Applications can refer to this name, which allows the underlying service instance to be changed without impacting users.
  The name can have up to 36 characters. The supported characters are lowercase letters, digits, and underscores: [a-z,0-9,_].
  The name must be unique across the IBM Cloud region. You might be prompted to change the serving name if the name you choose is already in use.
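Before you type a serving name into the form, you can check it locally against the length and character rules stated above. The following is a minimal sketch: the function name is hypothetical, and treating an empty name as invalid is an assumption. Uniqueness across the IBM Cloud region cannot be checked this way, because only the service knows which names are in use.

```python
import re

# Rule from the text: up to 36 characters, each one a-z, 0-9, or underscore.
# Assumption: an empty string is not an acceptable serving name.
_SERVING_NAME_PATTERN = re.compile(r"[a-z0-9_]{1,36}")


def is_valid_serving_name(name: str) -> bool:
    """Return True if name satisfies the documented length and character rules."""
    return _SERVING_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_serving_name("classify_tickets_v2"))  # True
print(is_valid_serving_name("Classify-Tickets"))     # False (uppercase and hyphen)
print(is_valid_serving_name("a" * 37))               # False (too long)
```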
Testing the deployed model
The true test of your tuned model is how it responds to input that follows tuned-for patterns.
You can test the tuned model from one of the following pages:
- Project: Useful when you want to test your model during the development and testing phases before moving it into production.
- Deployment space: Useful when you want to test your model programmatically. From the API Reference tab, you can find information about the available endpoints and code examples. You can also submit input as text and choose to return the output all at once or as a stream while the output is generated. However, you cannot change the prompt parameters for the input text.
- Prompt Lab: Useful when you want to use a tool with an intuitive user interface for prompting foundation models. You can customize the prompt parameters for each input. You can also save the prompt as a notebook so you can interact with it programmatically.
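To make the programmatic option more concrete, the sketch below builds (but does not send) an HTTP request for a deployed tuned model. Treat every specific here as an assumption to verify against the API Reference tab of your deployment: the path shape is modeled on the watsonx.ai REST API, and the base URL, the `version` value, the serving name, and the token are placeholders. The function name is hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_generation_request(base_url: str, serving_name: str, prompt: str,
                             token: str, version: str = "2024-01-01",
                             stream: bool = False) -> Request:
    """Build a POST request that submits prompt text to a deployment.

    Assumptions: the endpoint path, the version query parameter, and bearer-token
    authentication. Copy the exact values from the API Reference tab.
    """
    # Choose between a single response and a stream of partial results.
    action = "text/generation_stream" if stream else "text/generation"
    url = (f"{base_url.rstrip('/')}/ml/v1/deployments/"
           f"{quote(serving_name, safe='')}/{action}?version={version}")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Accept": "text/event-stream" if stream else "application/json",
    }
    body = json.dumps({"input": prompt}).encode("utf-8")
    return Request(url, data=body, headers=headers, method="POST")


req = build_generation_request("https://example.invalid", "classify_tickets_v2",
                               "My printer will not turn on.", "PLACEHOLDER_TOKEN")
print(req.get_method(), req.full_url)
```

To send the request, pass it to `urllib.request.urlopen` once the placeholders are replaced with real values. Because the request refers to the serving name rather than a system-generated ID, the same client code keeps working if the underlying deployment is replaced.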
Testing the deployed model in a project
To test your tuned model in the project, complete the following steps:
- From your project, click the Deployments tab.
- Click the name of your deployed model.
- Click the Test tab.
- In the Input data field, add a prompt that follows the prompt pattern that your tuned model is trained to recognize, and then click Generate.
  You can click View parameter settings to see the prompt parameters that are applied to the model by default. To change the prompt parameters, you must go to the Prompt Lab.
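A prompt "follows the prompt pattern" when it reuses the same wording and layout as the inputs in your tuning data. One way to keep test prompts consistent is to generate them from a single template, as in this sketch. The template text, the sentiment-classification task, and the function name are all hypothetical. Substitute the exact pattern from your own training examples.

```python
# Hypothetical template: replace it with the exact wording used in your tuning data.
TRAINING_TEMPLATE = (
    "Classify the sentiment of this review.\n"
    "Review: {review}\n"
    "Sentiment:"
)


def build_test_prompt(review: str) -> str:
    """Format new input text the same way the training inputs were formatted."""
    # Trim stray whitespace so the prompt layout matches the training examples.
    return TRAINING_TEMPLATE.format(review=review.strip())


print(build_test_prompt("  The battery lasts all day.  "))
```

The resulting text is what you would paste into the Input data field before you click Generate.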
Testing the deployed model in a deployment space
To test your tuned model in a deployment space, complete the following steps:
- From the navigation menu, select Deployments.
- Click the name of the deployment space where you deployed the tuned model.
- Click the name of your deployed model.
- Click the Test tab.
- In the Input data field, add a prompt that follows the prompt pattern that your tuned model is trained to recognize, and then click Generate.
  You can click View parameter settings to see the prompt parameters that are applied to the model by default. To change the prompt parameters, you must go to the Prompt Lab.
Testing the deployed model in Prompt Lab
To test your tuned model in Prompt Lab, complete the following steps:
- Follow the steps in one of the previous procedures to open the deployed model in either the project or the deployment space.
- In the project, click Open in Prompt Lab. If you are working in a deployment space, you are prompted to choose the project where you want to work with the model.
  Prompt Lab opens and the tuned model that you deployed is selected in the Model field.
- In the Try section, add a prompt to the Input field that follows the prompt pattern that your tuned model is trained to recognize, and then click Generate.
For more information about how to use the prompt editor, see Prompt Lab.