IBM watsonx.ai REST API
Last updated: Dec 12, 2024

You can work with foundation models in IBM watsonx.ai programmatically by using the watsonx.ai API.

See the API reference documentation.

You can use the REST API to perform the tasks that are described in the following sections.

Prerequisites

Go to the Developer access page for quick access to the following information:

  • The base URL for API endpoints
  • Your project or space ID

From the watsonx.ai home page for the project or space that you want to work with, open the navigation menu, and then click Developer access.

You also need the following information to submit REST API requests:

  • To use the watsonx.ai API, you need a bearer token.

    For more information, see Credentials for programmatic access.

  • You must specify the {model_id} for the foundation model that you want to use.

    You can use the List the available foundation models method to get the ID for a foundation model.

    For a list of the model IDs for the foundation models that are included with watsonx.ai, see Foundation model IDs for APIs.

  • Specify the date on which you created and tested your code in the version parameter that is required with each request. For example, version=2024-10-21.
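For example, you can exchange an IBM Cloud API key for a bearer token by calling the IAM Identity Services token endpoint. The API key value is a placeholder that you replace with your own key:

```
curl -X POST \
  'https://iam.cloud.ibm.com/identity/token' \
  --header 'Content-Type: application/x-www-form-urlencoded' \
  --data-urlencode 'grant_type=urn:ibm:params:oauth:grant-type:apikey' \
  --data-urlencode 'apikey={your_api_key}'
```

The access_token value in the response is the bearer token that you pass in the Authorization header of your watsonx.ai API requests.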

Some tasks require you to reference data that is made available as a data connection. For more information about how to add a file, and then reference a file from the API, see Referencing files from the API.

Tip: If you want help with formatting an inference request in the API, you can submit the same request from the Prompt Lab. From the code panel of Prompt Lab, which shows the cURL request that is generated by your prompt, you can check the syntax that is used.

Getting a list of available foundation models

The List the available foundation models method in the watsonx.ai API gets information about the foundation models that are deployed by IBM in watsonx.ai and are available for inferencing immediately.

curl -X GET \
  'https://{region}.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2024-05-01'
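The response is JSON. As an illustration of how you might pull out just the model IDs, the following sketch runs jq over a trimmed-down response fragment; the real response contains many more fields per model:

```shell
# Write an illustrative response fragment to a file. The real response
# from the foundation_model_specs method has many more fields per entry.
cat > /tmp/fm_specs.json <<'JSON'
{
  "resources": [
    { "model_id": "ibm/granite-13b-instruct-v2" },
    { "model_id": "google/flan-ul2" }
  ]
}
JSON

# Print one model ID per line.
jq -r '.resources[].model_id' /tmp/fm_specs.json
```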

Listing custom foundation models

To get a list of deployed custom foundation models that you can access, use the following method. This method requires a bearer token.

curl -X GET \
  'https://{region}.ml.cloud.ibm.com/ml/v4/deployments?version=2024-12-12&type=custom_foundation_model' \
  --header 'Authorization: Bearer {token}'

Listing deploy on demand models

To get a list of available deploy on demand foundation models, use the following method:

curl -X GET \
  'https://{region}.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2024-12-10&filters=curated'

You must deploy a deploy on demand foundation model to a deployment space before you can inference it.

Inferencing foundation models with the REST API

The method that you use to inference a foundation model differs depending on whether the foundation model is associated with a deployment.

  • To inference a foundation model that is deployed by IBM in watsonx.ai, use the Text generation method.

  • To inference a tuned foundation model, a custom foundation model, or a deploy on demand foundation model, use the Deployments>Infer text method.

    The {model_id} is not required with this type of request because only one model is supported by the deployment.
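For example, a minimal Text generation request for a foundation model that is deployed by IBM might look like the following sketch. The model ID, prompt text, and parameter values are illustrative; replace the placeholders in braces with your own values:

```
curl -X POST \
  'https://{region}.ml.cloud.ibm.com/ml/v1/text/generation?version=2024-10-21' \
  --header 'Authorization: Bearer {token}' \
  --header 'Content-Type: application/json' \
  --data '{
    "model_id": "ibm/granite-13b-instruct-v2",
    "input": "What is the boiling point of water?\n\nAnswer:",
    "parameters": {
      "decoding_method": "greedy",
      "max_new_tokens": 50
    },
    "project_id": "{project_id}"
  }'
```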

Inference types

You can prompt a foundation model by using one of the following text generation methods:

  • Infer text: Returns the complete output that is generated by the foundation model in a single response.
  • Infer text event stream: Returns the output as it is generated by the foundation model. This method is useful in conversational use cases, where you want a chatbot or virtual assistant to respond to a user in a fluid way that mimics a real conversation.

For chat use cases, use the chat API. For more information, see Text chat.
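A streaming request follows the same shape as a regular inference request, but targets the streaming endpoint and returns the output as a server-sent event stream. The model ID and parameter values in this sketch are illustrative:

```
curl -X POST \
  'https://{region}.ml.cloud.ibm.com/ml/v1/text/generation_stream?version=2024-10-21' \
  --header 'Authorization: Bearer {token}' \
  --header 'Accept: text/event-stream' \
  --header 'Content-Type: application/json' \
  --data '{
    "model_id": "ibm/granite-13b-instruct-v2",
    "input": "Tell me about Mars.",
    "parameters": { "max_new_tokens": 100 },
    "project_id": "{project_id}"
  }'
```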

Applying AI guardrails when inferencing

When you prompt a foundation model by using the API, you can use the moderations field to apply AI guardrails to foundation model input and output. For more information, see Removing harmful language from model input and output.
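For example, a moderations field that screens model input and output for hate, abuse, and profanity (HAP) might look like the following fragment of a request body; the threshold value is illustrative, and the full schema is in the API reference:

```
"moderations": {
  "hap": {
    "input":  { "enabled": true, "threshold": 0.75 },
    "output": { "enabled": true, "threshold": 0.75 }
  }
}
```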

Inferencing with a prompt template

You can inference a foundation model with input text that follows a pattern that is defined by a prompt template.

For more information, see Create a prompt template.

To extract prompt template text to use as input to the text generation method, take the following steps:

  1. Use the Search asset types method of the Watson Data API to get the prompt template ID.

    curl -X POST \
    'https://api.dataplatform.cloud.ibm.com/v2/asset_types/wx_prompt/search?version=2024-07-29&project_id={project_id}' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer ACCESS_TOKEN' \
    --data '{
     "query": "asset.name:{template_name}"
    }'
    

    The prompt template ID is specified as the metadata.asset_id.

  2. Use the Get the inference input string for a given prompt method to get the prompt template text.

    curl -X POST \
    'https://api.dataplatform.cloud.ibm.com/wx/v1/prompts/{prompt-template-id}/input?version=2024-07-29&project_id={project_id}'
    ...
    

For more information, see Get the inference input string for a given prompt.

    You can submit the extracted prompt text as input to the Generate text method.

Prompt tuning a foundation model

Prompt tuning a foundation model is a complex task. The sample Python notebooks simplify the process. You can use a sample notebook as a template for writing your own notebooks for prompt tuning.

At a high level, prompt tuning a foundation model by using the API involves the following steps:

  1. Create a training data file to use for tuning the foundation model.

    For more information about the training data file requirements, see Data formats for tuning foundation models.

  2. Upload your training data file.

    You can choose to add the file by creating one of the following asset types:

    • Connection asset

      Note: Only a Cloud Object Storage connection type is supported for prompt tuning training currently.

      See Referencing files from the API.

      You will use the connection ID and training data file details when you add the training_data_references section to the request.json file that you create in the next step.

    • Data asset

      To create a data asset, use the Data and AI Common Core API to define a data asset.

      You will use the asset ID and training data file details when you add the training_data_references section to the request.json file that you create in the next step.

    For more information about the supported ways to reference a training data file, see Data references.

  3. Use the watsonx.ai API to create a training experiment.

    See create a training.

    You can specify parameters for the experiment in the TrainingResource payload. For more information about available parameters, see Parameters for tuning foundation models.

    For the task_id, specify one of the tasks that are listed as being supported for the foundation model in the response to the List the available foundation models method.

  4. Save the tuned model to the repository service to generate an asset_id that points to the tuned model.

    To save the tuned model, use the watsonx.ai Runtime (formerly Watson Machine Learning) API to create a new model.

  5. Use the watsonx.ai API to create a deployment for the tuned model.

See create a deployment.

To inference a tuned model, you must use the inference endpoint that includes the unique ID of the deployment that hosts the tuned model. For more information, see the inference methods in the Deployments section.
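At step 3, the body of the create-a-training request pairs a prompt_tuning section with references to the training data and a results location. The following fragment is a rough sketch only: the field values are illustrative, the connection and bucket names are placeholders, and the authoritative schema is the TrainingResource payload in the API reference:

```
{
  "name": "my-prompt-tuning-experiment",
  "project_id": "{project_id}",
  "prompt_tuning": {
    "task_id": "classification",
    "base_model": { "model_id": "google/flan-t5-xl" },
    "num_epochs": 20
  },
  "training_data_references": [
    {
      "type": "connection_asset",
      "connection": { "id": "{connection_id}" },
      "location": {
        "bucket": "{bucket_name}",
        "file_name": "training_data.jsonl"
      }
    }
  ],
  "results_reference": {
    "type": "connection_asset",
    "connection": { "id": "{connection_id}" },
    "location": { "bucket": "{bucket_name}" }
  }
}
```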

Python notebooks

The Python library has helper classes and associated sample notebooks that make it easier to use the available API methods in your generative AI applications. For more information, see Python library.

Parent topic: Coding generative AI solutions
