Alternatively, you can use graphical tools from the watsonx.ai UI to inference foundation models. See Prompt Lab.
Inference types
You can prompt a foundation model by using one of the following text generation methods:
Infer text: Returns the complete output from the foundation model in a single response, after the whole text is generated.
Infer text event stream: Returns the output incrementally as the foundation model generates it. This method is useful in conversational use cases, where you want a chatbot or virtual assistant to respond to a user in a fluid way that mimics a real conversation. Both methods are shown in the sketch that follows.
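The following Python sketch contrasts the two methods by calling the watsonx.ai REST text generation endpoints directly. The base URL, API version date, bearer token, project ID, and model_id are placeholders or assumptions; substitute your own values and treat the response field names as illustrative of the documented response shape.

```python
import json
import requests

# Placeholder values -- substitute your own region URL, token, project, and model.
BASE_URL = "https://us-south.ml.cloud.ibm.com"  # assumed regional endpoint
VERSION = "2023-05-29"                          # assumed API version date
HEADERS = {
    "Authorization": "Bearer YOUR_IAM_TOKEN",   # placeholder IAM bearer token
    "Content-Type": "application/json",
}
payload = {
    "model_id": "ibm/granite-13b-instruct-v2",  # example model_id
    "input": "Explain vector databases in one sentence.",
    "project_id": "YOUR_PROJECT_ID",            # placeholder project ID
    "parameters": {"max_new_tokens": 100},
}

# Infer text: one request, one complete response after generation finishes.
resp = requests.post(
    f"{BASE_URL}/ml/v1/text/generation?version={VERSION}",
    headers=HEADERS, json=payload, timeout=60,
)
print(resp.json()["results"][0]["generated_text"])

# Infer text event stream: the output arrives as server-sent events, so each
# chunk can be shown to the user while the model is still generating.
with requests.post(
    f"{BASE_URL}/ml/v1/text/generation_stream?version={VERSION}",
    headers=HEADERS, json=payload, stream=True, timeout=60,
) as stream:
    for line in stream.iter_lines():
        if line.startswith(b"data:"):
            event = json.loads(line[len(b"data:"):])
            print(event["results"][0]["generated_text"], end="", flush=True)
```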
The method that you use to inference a foundation model differs depending on whether the foundation model is provided with watsonx.ai or is associated with a deployment.
To inference a foundation model that is deployed by IBM in watsonx.ai, use the Text generation method.
To inference a tuned foundation model, a custom foundation model, or a deploy on demand foundation model, use the Deployments > Infer text method.
The {model_id} field is not required with this type of request because only one model is supported by the deployment, as the sketch that follows illustrates.
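As a hedged illustration, the deployment-scoped request below targets the per-deployment text generation path instead of the shared one and omits model_id from the payload. The deployment ID, token, and URL are placeholders.

```python
import requests

BASE_URL = "https://us-south.ml.cloud.ibm.com"  # assumed regional endpoint
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"            # placeholder deployment ID
HEADERS = {
    "Authorization": "Bearer YOUR_IAM_TOKEN",   # placeholder IAM bearer token
    "Content-Type": "application/json",
}

# No model_id field: the deployment itself identifies the single model it serves.
payload = {
    "input": "Summarize the meeting notes in three bullet points.",
    "parameters": {"max_new_tokens": 200},
}

resp = requests.post(
    f"{BASE_URL}/ml/v1/deployments/{DEPLOYMENT_ID}/text/generation?version=2023-05-29",
    headers=HEADERS, json=payload, timeout=60,
)
print(resp.json()["results"][0]["generated_text"])
```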
Applying AI guardrails when inferencing
When you prompt a foundation model by using the API, you can use the moderations field to apply AI guardrails to foundation model input and output. For more information, see Removing harmful language from model input and output.
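The snippet below sketches what adding the moderations field to a generation payload can look like. The hap (hate, abuse, and profanity) filter name follows the pattern in the API documentation, but the exact nesting and the threshold values here are illustrative assumptions, not defaults.

```python
# Illustrative payload with AI guardrails applied via the moderations field.
# The threshold values are examples only; tune them for your use case.
payload = {
    "model_id": "ibm/granite-13b-instruct-v2",   # example model_id
    "input": "User-supplied text goes here.",
    "project_id": "YOUR_PROJECT_ID",             # placeholder project ID
    "moderations": {
        "hap": {                                 # hate, abuse, profanity filter
            "input": {"enabled": True, "threshold": 0.75},
            "output": {"enabled": True, "threshold": 0.75},
        }
    },
}
# Send `payload` to the text generation endpoint exactly as in the earlier sketch.
```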
Inferencing with a prompt template
You can inference a foundation model with input text that follows a pattern that is defined by a prompt template.
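As a minimal, purely client-side sketch, the hypothetical render helper below fills a template's {variable} placeholders to produce the patterned input text; the template wording and variable names are invented for illustration.

```python
# Hypothetical prompt template; {customer_name} and {issue} are its variables.
TEMPLATE = (
    "You are a support agent. Write a polite reply to {customer_name} "
    "about the following issue:\n{issue}\n\nReply:"
)

def render(template: str, **variables: str) -> str:
    """Substitute each {name} placeholder with its supplied value."""
    return template.format(**variables)

# The rendered string is ordinary input text for the generation endpoints above.
prompt = render(TEMPLATE, customer_name="Dana", issue="a delayed shipment")
print(prompt)
```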