You can work with foundation models in IBM watsonx.ai programmatically by using the watsonx.ai API.
See the API reference documentation.
You can use the REST API to get a list of available foundation models, inference foundation models, and prompt tune a foundation model. Each task is described in a section of this topic.
You can also use the REST API to perform the following tasks:
- Add generative chat to your applications with the chat API
- Build agent-driven chat workflows
- Extract text from documents
- Vectorize text
- Rerank document passages
- Forecast future values
Prerequisites
Go to the Developer access page for quick access to the following information:
- Base URL for API endpoints
- Project or space ID
From the watsonx.ai home page for the project or space that you want to work with, open the Navigation Menu, and then click Developer access.
You also need the following information to submit REST API requests:
- To use the watsonx.ai API, you need a bearer token.
  For more information, see Credentials for programmatic access.
- You must specify the {model_id} for the foundation model that you want to use. You can use the List the available foundation models method to get the ID for a foundation model.
  For a list of the model IDs for the foundation models that are included with watsonx.ai, see Foundation model IDs for APIs.
- Specify the date on which you created and tested your code in the version parameter that is required with each request. For example, version=2024-10-21.
Some tasks require you to reference data that is made available as a data connection. For more information about how to add a file, and then reference a file from the API, see Referencing files from the API.
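As a rough illustration of how the prerequisites fit together, the following Python sketch assembles an unsent request from a base URL, a bearer token, and the version parameter. The helper name build_request and the choice of the standard urllib module are assumptions made for this example; they are not part of the watsonx.ai API. The base URL keeps the {region} placeholder and the token is the literal placeholder ACCESS_TOKEN.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url, path, token, version, method="GET", **query):
    """Return an unsent urllib Request for a REST call (hypothetical helper)."""
    # Every request carries the version date; extra query values are optional.
    params = {"version": version, **query}
    url = f"{base_url.rstrip('/')}{path}?{urlencode(params)}"
    headers = {
        "Authorization": f"Bearer {token}",  # bearer token from the prerequisites
        "Accept": "application/json",
    }
    return Request(url, headers=headers, method=method)


# Placeholders only: replace {region} and ACCESS_TOKEN with your own values.
request = build_request(
    "https://{region}.ml.cloud.ibm.com",
    "/ml/v1/foundation_model_specs",
    token="ACCESS_TOKEN",
    version="2024-10-21",
)
print(request.full_url)
```

Nothing is sent over the network here. To submit the request, you could pass it to urllib.request.urlopen after you substitute real values.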
Getting a list of available foundation models
The List the available foundation models method in the watsonx.ai API gets information about the foundation models that are provided with watsonx.ai and are available for inferencing.
curl -X GET \
'https://{region}.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2024-05-01'
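Once the response is parsed as JSON, the model IDs can be collected with a few lines of Python. This is a minimal sketch that assumes the response contains a resources array whose entries have a model_id field and a tasks array of objects with an id field; check the API reference for the exact response schema. The model IDs in the sample are made up for illustration.

```python
def model_ids(spec_response):
    """Return the model_id of every entry in a parsed model specs response."""
    # Assumption: models are listed under a top-level "resources" array.
    return [model["model_id"] for model in spec_response.get("resources", [])]


def task_ids(spec_response, model_id):
    """Return the task IDs that the response lists for one model."""
    for model in spec_response.get("resources", []):
        if model.get("model_id") == model_id:
            # Assumption: each task entry carries its identifier under "id".
            return [task["id"] for task in model.get("tasks", [])]
    raise KeyError(model_id)


# Hypothetical response fragment, not real output from the service.
sample = {
    "resources": [
        {"model_id": "example/model-a", "tasks": [{"id": "summarization"}]},
        {"model_id": "example/model-b", "tasks": []},
    ]
}
print(model_ids(sample))
print(task_ids(sample, "example/model-a"))
```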
To get a list of available custom foundation models, use the following method:
curl -X GET \
'https://<your cloud hostname>/ml/v4/deployments/<your deployment ID>?version=2024-01-29&project_id=<your project ID>'
For more information, see Creating a deployment programmatically.
Inferencing foundation models with the REST API
The method that you use to inference a foundation model differs depending on whether the foundation model is associated with a deployment.
- To inference a foundation model that is provided with watsonx.ai, use the Text generation method.
- To inference a tuned foundation model, use the Deployments > Infer text method.
  The {model_id} is not required with this type of request because only one model is supported by the deployment.
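The difference between the two cases can be sketched in Python as follows. The endpoint paths /ml/v1/text/generation and /ml/v1/deployments/{deployment_id}/text/generation and the body fields input, model_id, and project_id are assumptions based on a reading of the API reference, so verify them there before relying on them. The function name generation_call is hypothetical.

```python
import json
from urllib.parse import urlencode


def generation_call(base_url, version, prompt, *, model_id=None,
                    project_id=None, deployment_id=None):
    """Return (url, json_body) for a text generation request.

    Pass deployment_id for a tuned model; otherwise pass model_id and project_id.
    """
    if deployment_id is not None:
        # Assumed path. The deployment identifies the model, so no model_id is sent.
        path = f"/ml/v1/deployments/{deployment_id}/text/generation"
        body = {"input": prompt}
    else:
        if model_id is None or project_id is None:
            raise ValueError("model_id and project_id are required without a deployment")
        # Assumed path and body fields for a model that is provided with watsonx.ai.
        path = "/ml/v1/text/generation"
        body = {"model_id": model_id, "project_id": project_id, "input": prompt}
    url = f"{base_url.rstrip('/')}{path}?{urlencode({'version': version})}"
    return url, json.dumps(body)
```

For example, generation_call(base_url, "2024-10-21", "Hello", deployment_id="my-deployment") yields a URL that names the deployment and a body that contains only the prompt.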
You can prompt a foundation model by using one of the following text generation methods:
- Infer text: Waits to return the output that is generated by the foundation model all at one time.
- Infer text event stream: Returns the output as it is generated by the foundation model. This method is useful in conversational use cases, where you want a chatbot or virtual assistant to respond to a user in a fluid way that mimics a real conversation.
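To show what consuming the event stream might look like on the client side, here is a small Python sketch that reads server-sent event lines and yields each text fragment as it arrives. It assumes that each event has a data: line holding a JSON object with a results array whose entries have a generated_text field; that shape is an assumption for this example, so confirm it against the API reference.

```python
import json


def iter_generated_text(lines):
    """Yield text fragments from the lines of a server-sent event stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        # Only "data:" lines carry a payload; skip ids, event names, and blanks.
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):])
        # Assumption: fragments are under results[*].generated_text.
        for result in event.get("results", []):
            yield result.get("generated_text", "")


# Hypothetical stream contents, not captured from the service.
sample_stream = [
    b"id: 1",
    b"event: message",
    b'data: {"results": [{"generated_text": "Hel"}]}',
    b"",
    b'data: {"results": [{"generated_text": "lo"}]}',
]
for fragment in iter_generated_text(sample_stream):
    print(fragment, end="", flush=True)
print()
```

Because the function is a generator, a chat interface can display each fragment immediately instead of waiting for the full response.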
When you prompt a foundation model by using the API, you can use the moderations field to apply AI guardrails to foundation model input and output. For more information, see Removing harmful language from model input and output.
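The following Python sketch shows one way to attach a moderations field to a request body before it is sent. Only the field name moderations comes from this topic. The nested hap, input, output, enabled, and threshold keys, and the helper name with_hap_moderation, are assumptions for illustration; see the linked guardrails topic for the supported structure.

```python
import copy


def with_hap_moderation(payload, threshold=0.5, check_input=True, check_output=True):
    """Return a copy of a request body with a moderations field added."""
    guarded = copy.deepcopy(payload)  # leave the caller's payload untouched
    rule = {"enabled": True, "threshold": threshold}
    hap = {}
    if check_input:
        hap["input"] = dict(rule)   # assumed key: filter the prompt text
    if check_output:
        hap["output"] = dict(rule)  # assumed key: filter the generated text
    guarded["moderations"] = {"hap": hap}
    return guarded


body = with_hap_moderation({"input": "Tell me about your return policy."})
print(sorted(body))
```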
For chat use cases, use the chat API. For more information, see Text chat.
You can also inference a foundation model with input text that follows a pattern that is defined by a prompt template.
For more information, see Create a prompt template.
To extract prompt template text to use as input to the text generation method, take the following steps:
- Use the Search asset types method of the Watson Data API to get the prompt template ID.
  curl -X POST \
  'https://api.dataplatform.cloud.ibm.com/v2/asset_types/wx_prompt/search?version=2024-07-29&project_id={project_id}' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --data '{ "query": "asset.name:{template_name}" }'
  The prompt template ID is specified as the metadata.asset_id.
- Use the Get the inference input string for a given prompt method to get the prompt template text.
  curl -X POST \
  'https://api.dataplatform.cloud.ibm.com/wx/v1/prompts/{prompt-template-id}/input?version=2024-07-29&project_id={project_id}' ...
  For more information, see Get the inference input string for a given prompt.
You can submit the extracted prompt text as input to the Generate text method.
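The two lookups can be sketched in Python as follows. The URLs, headers, and query body mirror the curl commands in the steps. The function names are hypothetical, and reading the ID from the first entry of a results array is an assumption about the search response; the topic states only that the ID is in metadata.asset_id. The second request's headers and body are elided in the curl example, so the sketch builds only its URL.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

DATA_API = "https://api.dataplatform.cloud.ibm.com"


def search_template_request(project_id, template_name, token, version="2024-07-29"):
    """Build the unsent search request for a prompt template by name."""
    query = urlencode({"version": version, "project_id": project_id})
    body = json.dumps({"query": f"asset.name:{template_name}"}).encode("utf-8")
    return Request(
        f"{DATA_API}/v2/asset_types/wx_prompt/search?{query}",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def template_id(search_response):
    """Read metadata.asset_id from a parsed search response."""
    results = search_response.get("results", [])  # assumed wrapper key
    if not results:
        raise LookupError("no prompt template matched the query")
    return results[0]["metadata"]["asset_id"]


def input_url(prompt_template_id, project_id, version="2024-07-29"):
    """Build the URL of the method that returns the prompt template text."""
    query = urlencode({"version": version, "project_id": project_id})
    return f"{DATA_API}/wx/v1/prompts/{prompt_template_id}/input?{query}"
```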
Prompt tuning a foundation model
Prompt tuning a foundation model is a complex task. The sample Python notebooks simplify the process. You can use a sample notebook as a template for writing your own notebooks for prompt tuning.
At a high level, prompt tuning a foundation model by using the API involves the following steps:
- Create a training data file to use for tuning the foundation model.
  For more information about the training data file requirements, see Data formats for tuning foundation models.
- Upload your training data file.
  You can choose to add the file by creating one of the following asset types:
  - Connection asset
    Note: Currently, only the Cloud Object Storage connection type is supported for prompt tuning training.
    See Referencing files from the API.
    You will use the connection ID and training data file details when you add the training_data_references section to the request.json file that you create in the next step.
  - Data asset
    To create a data asset, use the Data and AI Common Core API to define a data asset.
    You will use the asset ID and training data file details when you add the training_data_references section to the request.json file that you create in the next step.
  For more information about the supported ways to reference a training data file, see Data references.
- Use the watsonx.ai API to create a training experiment.
  See Create a training.
  You can specify parameters for the experiment in the TrainingResource payload. For more information about available parameters, see Parameters for tuning foundation models.
  For the task_id, specify one of the tasks that are listed as being supported for the foundation model in the response to the List the available foundation models method.
- Save the tuned model to the repository service to generate an asset_id that points to the tuned model.
  To save the tuned model, use the watsonx.ai Runtime (formerly Watson Machine Learning) API to create a new model.
- Use the watsonx.ai API to create a deployment for the tuned model.
To inference a tuned model, you must use the inference endpoint that includes the unique ID of the deployment that hosts the tuned model. For more information, see the inference methods in the Deployments section.
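As a loose sketch of the training_data_references section that the upload step mentions, the following Python builds a reference for each asset type and places it in a request body. Only training_data_references and the two asset types come from this topic. Every other field name (type, connection, location, bucket, file_name, href, name, project_id) and all function names are assumptions for illustration, and the tuning parameters of the TrainingResource payload are deliberately left out. See Data references and Create a training for the authoritative schema.

```python
import json


def connection_reference(connection_id, bucket, file_name):
    """Reference a training file through a Cloud Object Storage connection."""
    return {
        "type": "connection_asset",             # assumed type label
        "connection": {"id": connection_id},    # connection ID from the upload step
        "location": {"bucket": bucket, "file_name": file_name},
    }


def data_asset_reference(asset_id, project_id):
    """Reference a training file that was added as a data asset."""
    return {
        "type": "data_asset",                   # assumed type label
        "location": {"href": f"/v2/assets/{asset_id}?project_id={project_id}"},
    }


def request_json(name, project_id, reference):
    """Serialize a partial request body; tuning parameters are omitted."""
    body = {
        "name": name,
        "project_id": project_id,
        "training_data_references": [reference],
    }
    return json.dumps(body, indent=2)


# Placeholder values only.
print(request_json("my-tuning-experiment", "PROJECT_ID",
                   data_asset_reference("ASSET_ID", "PROJECT_ID")))
```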
Python notebooks
The Python library has helper classes and associated sample notebooks that make it easier to use the available API methods in your generative AI applications. For more information, see Python library.