Programmatically inferencing a foundation model by using a prompt template
Last updated: Nov 27, 2024
You can use a saved prompt template to prompt foundation models in IBM watsonx.ai programmatically.
A prompt template is an asset that captures a reusable combination of static prompt text, model parameters, and prompt variables that generates the results you want from a specific model.
Inference a foundation model with a deployed prompt template
Inferencing a foundation model by using a deployed prompt template involves the following steps:
Configure the APIClient. When you inference a deployed prompt template, you must specify the deployment space that hosts the prompt template deployment.
from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames

credentials = Credentials(
    url="<URL>",  # for example, https://us-south.ml.cloud.ibm.com
    api_key="<API-KEY>",
)

client = APIClient(credentials)

# space_id is the ID of the deployment space that hosts the prompt template deployment
client.set.default_space(space_id)
Submit an inference request to the deployed prompt template.
In this example, the generate_text method is used and a value is specified for the question prompt variable.
generated_response = client.deployments.generate_text(
    deployment_id=deployment_id,  # ID of the prompt template deployment
    params={
        GenTextParamsMetaNames.PROMPT_VARIABLES: {
            "question": "What plans do you offer?"
        }
    }
)
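The generate_text method returns the generated text as a string. If you need the full response body, including details such as the generated token count, you can call the generate method instead.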
Deploy and inference a prompt template
You can use the Python library to deploy a prompt template, and then inference a foundation model by using the deployed prompt template. The following high-level steps are involved. For the complete steps and more options, see the sample notebook.
Create a prompt template.
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.prompts import PromptTemplate
from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
my_credentials = {
"url": "https://{region}.ml.cloud.ibm.com",
"apikey": {my-IBM-Cloud-API-key},
}
client = APIClient(my_credentials)
prompt_template = PromptTemplate(
name="New prompt",
model_id=client.foundation_models.TextModels.FLAN_T5_XXL,
model_params={GenParams.DECODING_METHOD: DecodingMethods.SAMPLE},
description="My example",
task_ids=["generation"],
input_variables=["object"],
instruction="Answer the following question",
input_prefix="Human",
output_prefix="Assistant",
input_text="What is {object} and how does it work?",
examples=[
["What is a loan and how does it work?",
"A loan is a debt that is repaid with interest over time."]
]
)
Store the prompt template in your project to generate a prompt template ID.
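The following is a minimal sketch of the storing and deployment steps, based on the pattern in the sample notebook. The space_id value is a placeholder for your deployment space ID and the deployment name is an arbitrary example; note that this sketch stores the template directly in the deployment space so that it can be deployed from there.
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.foundation_models.prompts import PromptTemplateManager

# Store the prompt template; the returned object carries the generated prompt template ID
prompt_mgr = PromptTemplateManager(credentials=my_credentials, space_id=space_id)
stored_prompt_template = prompt_mgr.store_prompt(prompt_template=prompt_template)
print(stored_prompt_template.prompt_id)

# Deploy the stored prompt template in the deployment space
client.set.default_space(space_id)
meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "My prompt template deployment",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    # The base model ID must match the model_id that is set in the prompt template
    client.deployments.ConfigurationMetaNames.BASE_MODEL_ID: "google/flan-t5-xxl",
}
deployment_details = client.deployments.create(stored_prompt_template.prompt_id, meta_props)
deployment_id = client.deployments.get_id(deployment_details)

# Create a ModelInference object that targets the deployment; it is used in the next step
model_inference = ModelInference(deployment_id=deployment_id, api_client=client)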
Inference the foundation model. Be sure to specify values for any prompt variables that are defined in the prompt template.
generated_response = model_inference.generate_text(
    params={
        GenParams.PROMPT_VARIABLES: {"object": "a mortgage"},
        GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
        GenParams.STOP_SEQUENCES: ['\n\n'],
        GenParams.MAX_NEW_TOKENS: 50
    }
)
Use a prompt template to draft prompt text for an inference request
Inferencing a foundation model by using a prompt template that is not deployed involves the following steps. For the complete steps and more options, see the sample notebook.
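At a high level, you load the saved prompt template as text, fill in the prompt variables, and pass the drafted prompt to a ModelInference object. The following is a minimal sketch of that flow; prompt_id and project_id are placeholders for the ID of a stored prompt template and your project ID, and my_credentials carries over from the earlier snippet.
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.foundation_models.prompts import PromptTemplateManager
from ibm_watsonx_ai.foundation_models.utils.enums import PromptTemplateFormats

# Load the stored prompt template as plain text; prompt variables remain as {placeholders}
prompt_mgr = PromptTemplateManager(credentials=my_credentials, project_id=project_id)
prompt_text = prompt_mgr.load_prompt(prompt_id=prompt_id, astype=PromptTemplateFormats.STRING)

# Fill in the prompt variables, then send the drafted prompt text to the model
model_inference = ModelInference(
    model_id="google/flan-t5-xxl",
    credentials=my_credentials,
    project_id=project_id,
)
generated_response = model_inference.generate_text(
    prompt=prompt_text.format(object="a mortgage")
)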