Supported foundation models in watsonx.ai
Last updated: Dec 18, 2024

A collection of open-source and IBM foundation models is available for inferencing in IBM watsonx.ai. Find the foundation models that best suit the needs of your generative AI application and your budget.

The foundation models that are available for inferencing from watsonx.ai are hosted in various ways:

Foundation models provided with watsonx.ai
IBM-curated foundation models that are deployed on multitenant hardware by IBM and are available for inferencing. You pay by tokens used. See Foundation models provided with watsonx.ai.
Deploy on demand foundation models
An instance of an IBM-curated foundation model that you deploy and that is dedicated for your inferencing use. Only colleagues who are granted access to the deployment can inference the foundation model. A dedicated deployment means faster and more responsive interactions without rate limits. You pay for hosting the foundation model by the hour. See Deploy on demand foundation models.
Custom foundation models
Foundation models curated by you that you import and deploy in watsonx.ai. The instance of the custom foundation model that you deploy is dedicated for your use. A dedicated deployment means faster and more responsive interactions. You pay for hosting the foundation model by the hour. See Custom foundation models.
Prompt-tuned foundation models
A subset of the available foundation models that can be customized for your needs by prompt tuning the model from the API or Tuning Studio. A prompt-tuned foundation model relies on the underlying IBM-deployed foundation model. You pay for the resources that you consume to tune the model. After the model is tuned, you pay by tokens used to inference the model. See Prompt-tuned foundation models.

If you want to deploy foundation models in your own data center, you can purchase watsonx.ai software. For more information, see Overview of IBM watsonx as a Service and IBM watsonx.governance software.

Deployment methods comparison

To help you choose the right deployment method, review the comparison table.

Table 1. Foundation model deployment methods
Deployment type | Available from | Deployment mechanism | Hosting environment | Billing method | Deprecation policy
Foundation models provided with watsonx.ai | Resource hub > Pay per token; Prompt Lab | Curated and deployed by IBM | Multitenant hardware | By tokens used | Deprecated according to the published lifecycle
Deploy on demand foundation models | Resource hub > Pay by the hour; Prompt Lab | Curated and deployed by IBM at your request | Dedicated hardware | By hour deployed | Your deployed model is not deprecated
Custom foundation models | Prompt Lab | Curated and deployed by you | Dedicated hardware | By hour deployed | Not deprecated
Prompt-tuned foundation models | Prompt Lab | Tuned and deployed by you | Multitenant hardware | Training is billed by CUH; inferencing is billed by tokens used | Deprecated when the underlying model is deprecated, unless you add the underlying model as a custom foundation model

For details on how model pricing is calculated and monitored, see Billing details for generative AI assets.

Supported foundation models by deployment method

Various foundation models are available from watsonx.ai that you can either use immediately or deploy on dedicated hardware for the exclusive use of your organization.

Table 1a. Available foundation models by deployment method
Provider | Provided with watsonx.ai (Pay per token) | Deploy on demand (Pay by the hour)
IBM | granite-13b-chat-v2 (Deprecated), granite-13b-instruct-v2, granite-7b-lab (Deprecated), granite-8b-japanese, granite-20b-multilingual, granite-3-2b-instruct, granite-3-8b-instruct, granite-guardian-3-2b, granite-guardian-3-8b, granite-3b-code-instruct, granite-8b-code-instruct, granite-20b-code-instruct, granite-34b-code-instruct | granite-13b-chat-v2, granite-13b-instruct-v2, granite-20b-code-base-schema-linking, granite-20b-code-base-sql-gen, granite-3-8b-base
Google | flan-t5-xl-3b, flan-t5-xxl-11b, flan-ul2-20b | flan-t5-xl-3b, flan-t5-xxl-11b, flan-ul2-20b
Meta | llama-3-3-70b-instruct, llama-3-2-1b-instruct, llama-3-2-3b-instruct, llama-3-2-11b-vision-instruct, llama-3-2-90b-vision-instruct, llama-guard-3-11b-vision, llama-3-1-8b-instruct, llama-3-1-70b-instruct, llama-3-405b-instruct, llama-3-8b-instruct (Deprecated), llama-3-70b-instruct (Deprecated), llama-2-13b-chat (Deprecated) | llama-3-3-70b-instruct, llama-3-3-70b-instruct-hf, llama-2-13b-chat, llama-2-70b-chat, llama-3-8b-instruct, llama-3-70b-instruct, llama-3-1-8b, llama-3-1-8b-instruct
Mistral AI | mistral-large, mixtral-8x7b-instruct-v01, pixtral-12b | mixtral-8x7b-base, mixtral-8x7b-instruct-v01, mistral-nemo-instruct-2407
BigScience | mt0-xxl-13b | mt0-xxl-13b
Code Llama | codellama-34b-instruct | Not available
ELYZA, Inc | elyza-japanese-llama-2-7b-instruct | Not available
Inception | jais-13b-chat | Not available
SDAIA | allam-1-13b-instruct | Not available
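
The model catalog changes over time as models are added and deprecated. To check programmatically which foundation models are currently available in a region, you can query the watsonx.ai API. The following sketch is illustrative only: it assumes the public foundation model specs endpoint, the Dallas (us-south) regional URL, and the Python requests library; the response field names are assumptions based on the API reference.

```python
# Sketch: list the foundation model specs that are currently available in a
# region. Assumes the public /ml/v1/foundation_model_specs endpoint and the
# requests library; the regional URL and version date are example values.
import requests

url = "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs"
response = requests.get(url, params={"version": "2024-12-18"})
response.raise_for_status()

for spec in response.json().get("resources", []):
    # Field names assumed from the watsonx.ai API reference.
    print(spec.get("provider"), "-", spec.get("model_id"))
```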

Foundation models provided with watsonx.ai

A collection of open-source and IBM foundation models is deployed in IBM watsonx.ai. You can prompt these foundation models in the Prompt Lab or programmatically.
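
For programmatic access, a minimal sketch follows. It assumes the ibm-watsonx-ai Python SDK; the API key, project ID, and model ID are placeholders that you replace with your own values.

```python
# Minimal sketch: inference a provided foundation model with the ibm-watsonx-ai
# Python SDK. The API key, project ID, and model ID below are placeholders.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",   # regional watsonx.ai endpoint
    api_key="YOUR_IBM_CLOUD_API_KEY",
)

model = ModelInference(
    model_id="ibm/granite-3-8b-instruct",      # any model listed in Table 2 or Table 3
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
    params={
        "decoding_method": "greedy",
        "max_new_tokens": 200,
    },
)

response = model.generate_text(prompt="Summarize the benefits of prompt tuning.")
print(response)
```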

IBM foundation models provided with watsonx.ai

The following table lists the IBM foundation models that are provided with watsonx.ai for inferencing.

Use is measured in Resource Units (RU); each unit is equal to 1,000 tokens from the input and output of foundation model inferencing. For details on how model pricing is calculated and monitored, see Billing details for generative AI assets.
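
As a worked example of the token-based pricing, the following sketch estimates the cost of a single request. The token counts are illustrative values; the prices are the per-1,000-token rates from Table 2.

```python
# Sketch: estimate the cost of one inference request from token counts.
# Prices are USD per 1,000 tokens (one Resource Unit), as listed in Table 2;
# the token counts here are illustrative only.
def inference_cost(input_tokens: int, output_tokens: int,
                   input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Return the cost in USD for a single request."""
    return (input_tokens / 1000) * input_price_per_1k + \
           (output_tokens / 1000) * output_price_per_1k

# Example: granite-3-8b-instruct at $0.0002 per 1,000 input and output tokens.
cost = inference_cost(input_tokens=1500, output_tokens=500,
                      input_price_per_1k=0.0002, output_price_per_1k=0.0002)
print(f"${cost:.6f}")  # $0.000400
```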

Some IBM foundation models are also available from third-party repositories, such as Hugging Face. IBM foundation models that you obtain from a third-party repository are not indemnified by IBM. Only IBM foundation models that you access from watsonx.ai are indemnified by IBM. For more information about contractual protections related to IBM indemnification, see the IBM Client Relationship Agreement and IBM watsonx.ai service description.

Table 2. IBM foundation models provided with watsonx.ai
Model name | Input price (USD/1,000 tokens) | Output price (USD/1,000 tokens) | Context window (input + output tokens) | More information
granite-13b-chat-v2 | $0.0006 | $0.0006 | 8,192 | Model card; Website; Research paper
granite-13b-instruct-v2 | $0.0006 | $0.0006 | 8,192 | Model card; Website; Research paper. Note: This foundation model can be prompt tuned.
granite-7b-lab | $0.0006 | $0.0006 | 8,192 | Model card; Research paper (LAB)
granite-8b-japanese | $0.0006 | $0.0006 | 4,096 | Model card; Website; Research paper
granite-20b-multilingual | $0.0006 | $0.0006 | 8,192 | Model card; Website; Research paper
granite-3-2b-instruct | $0.0001 | $0.0001 | 131,072 | Model card; Website; Research paper
granite-3-8b-instruct | $0.0002 | $0.0002 | 131,072 | Model card; Website; Research paper
granite-guardian-3-2b | $0.0001 | $0.0001 | 8,192 | Model card; Website
granite-guardian-3-8b | $0.0002 | $0.0002 | 8,192 | Model card; Website
granite-3b-code-instruct | $0.0006 | $0.0006 | 128,000 | Model card; Website; Research paper
granite-8b-code-instruct | $0.0006 | $0.0006 | 128,000 | Model card; Website; Research paper
granite-20b-code-instruct | $0.0006 | $0.0006 | 8,192 | Model card; Website; Research paper
granite-34b-code-instruct | $0.0006 | $0.0006 | 8,192 | Model card; Website; Research paper


Third-party foundation models provided with watsonx.ai

The following table lists the supported third-party foundation models that are provided with watsonx.ai.

Use is measured in Resource Units (RU); each unit is equal to 1,000 tokens from the input and output of foundation model inferencing. For details on how model pricing is calculated and monitored, see Billing details for generative AI assets.

Table 3. Third-party foundation models provided with watsonx.ai
Model name | Provider | Input price (USD/1,000 tokens) | Output price (USD/1,000 tokens) | Context window (input + output tokens) | More information
allam-1-13b-instruct | National Center for Artificial Intelligence and Saudi Authority for Data and Artificial Intelligence | $0.0018 | $0.0018 | 4,096 | Model card
codellama-34b-instruct | Code Llama | $0.0018 | $0.0018 | 16,384 | Model card; Meta AI blog
elyza-japanese-llama-2-7b-instruct | ELYZA, Inc | $0.0018 | $0.0018 | 4,096 | Model card; Blog on note.com
flan-t5-xl-3b | Google | $0.0006 | $0.0006 | 4,096 | Model card; Research paper. Note: This foundation model can be prompt tuned.
flan-t5-xxl-11b | Google | $0.0018 | $0.0018 | 4,096 | Model card; Research paper
flan-ul2-20b | Google | $0.0050 | $0.0050 | 4,096 | Model card; UL2 research paper; Flan research paper
jais-13b-chat | Inception, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and Cerebras Systems | $0.0018 | $0.0018 | 2,048 | Model card; Research paper
llama-3-3-70b-instruct | Meta | $0.0018 | $0.0018 | 131,072 | Model card; Meta AI blog
llama-3-2-1b-instruct | Meta | $0.0001 | $0.0001 | 131,072 | Model card; Meta AI blog; Research paper
llama-3-2-3b-instruct | Meta | $0.00015 | $0.00015 | 131,072 | Model card; Meta AI blog; Research paper
llama-3-2-11b-vision-instruct | Meta | $0.00035 | $0.00035 | 131,072 | Model card; Meta AI blog; Research paper
llama-3-2-90b-vision-instruct | Meta | $0.0020 | $0.0020 | 131,072 | Model card; Meta AI blog; Research paper
llama-guard-3-11b-vision | Meta | $0.00035 | $0.00035 | 131,072 | Model card; Meta AI blog; Research paper
llama-3-1-8b-instruct | Meta | $0.0006 | $0.0006 | 131,072 | Model card; Meta AI blog
llama-3-1-70b-instruct | Meta | $0.0018 | $0.0018 | 131,072 | Model card; Meta AI blog
llama-3-405b-instruct | Meta | $0.0050 | $0.016 | 16,384 | Model card; Meta AI blog
llama-3-8b-instruct | Meta | $0.0006 | $0.0006 | 8,192 | Model card; Meta AI blog
llama-3-70b-instruct | Meta | $0.0018 | $0.0018 | 8,192 | Model card; Meta AI blog
llama-2-13b-chat | Meta | $0.0006 | $0.0006 | 4,096 | Model card; Research paper
mistral-large | Mistral AI | $0.003 | $0.01 | 32,768 | Model card; Blog post for Mistral Large 2
mixtral-8x7b-instruct-v01 | Mistral AI | $0.0006 | $0.0006 | 32,768 | Model card; Research paper
mt0-xxl-13b | BigScience | $0.0018 | $0.0018 | 4,096 | Model card; Research paper
pixtral-12b | Mistral AI | $0.00035 | $0.00035 | 128,000 | Model card; Blog post for Pixtral 12B


Custom foundation models

In addition to working with foundation models that are curated by IBM, you can upload and deploy your own foundation models. After the custom models are deployed and registered with watsonx.ai, you can create prompts that inference the custom models from the Prompt Lab and from the watsonx.ai API.

To learn more about how to upload, register, and deploy a custom foundation model, see Deploying a custom foundation model.
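
Inferencing a deployed custom foundation model is similar to inferencing a provided model, except that you address your deployment instead of a catalog model ID. The following is a minimal sketch that assumes the ibm-watsonx-ai Python SDK and placeholder deployment and space IDs.

```python
# Sketch: inference a deployed custom foundation model through its deployment.
# The deployment ID and space ID are placeholders for your own deployment.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_IBM_CLOUD_API_KEY",
)

custom_model = ModelInference(
    deployment_id="YOUR_DEPLOYMENT_ID",        # deployment of your custom model
    credentials=credentials,
    space_id="YOUR_DEPLOYMENT_SPACE_ID",
)

print(custom_model.generate_text(
    prompt="Classify the sentiment of this review: ...",
    params={"decoding_method": "greedy", "max_new_tokens": 50},
))
```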

Deploy on demand foundation models

Choose a foundation model from a set of IBM-curated models to deploy for the exclusive use of your organization.

For more information about how to deploy a foundation model on demand, see Deploying foundation models on-demand.

Note: Foundation models that you can deploy on demand are available only in the Dallas data center.

Deploy on demand foundation models from IBM

The following table lists the IBM foundation models that are available for you to deploy on demand.

Some IBM foundation models are also available from third-party repositories, such as Hugging Face. IBM foundation models that you obtain from a third-party repository are not indemnified by IBM. Only IBM foundation models that you access from watsonx.ai are indemnified by IBM. For more information about contractual protections related to IBM indemnification, see the IBM Client Relationship Agreement and IBM watsonx.ai service description.

Table 4. IBM foundation models available to deploy on demand in watsonx.ai
Model name | Price per hour in USD | Model hosting category | Context window (input + output tokens)
granite-13b-chat-v2 | $5.22 | Small | 8,192
granite-13b-instruct-v2 | $5.22 | Small | 8,192
granite-20b-code-base-schema-linking | $5.22 | Small | 8,192
granite-20b-code-base-sql-gen | $5.22 | Small | 8,192
granite-3-8b-base | $5.22 | Small | 4,096


Deploy on demand foundation models from third parties

The following table lists the third-party foundation models that are available for you to deploy on demand.

Table 5. Third-party foundation models available to deploy on demand in watsonx.ai
Model name | Provider | Price per hour in USD | Model hosting category | Context window (input + output tokens)
flan-t5-xl-3b | Google | $5.22 | Small | 4,096
flan-t5-xxl-11b | Google | $5.22 | Small | 4,096
flan-ul2-20b | Google | $5.22 | Small | 4,096
llama-2-13b-chat | Meta | $5.22 | Small | 4,096
llama-2-70b-chat | Meta | $20.85 | Large | 4,096
llama-3-8b-instruct | Meta | $5.22 | Small | 8,192
llama-3-70b-instruct | Meta | $20.85 | Large | 8,192
llama-3-1-8b | Meta | $5.22 | Small | 131,072
llama-3-1-8b-instruct | Meta | $5.22 | Small | 131,072
llama-3-3-70b-instruct | Meta | $10.40 | Medium | 8,192
llama-3-3-70b-instruct-hf | Meta | $20.85 | Large | 8,192
mixtral-8x7b-base | Mistral AI | $10.40 | Medium | 32,768
mixtral-8x7b-instruct-v01 | Mistral AI | $10.40 | Medium | 32,768
mistral-nemo-instruct-2407 | Mistral AI | $5.22 | Small | 131,072
mt0-xxl-13b | BigScience | $5.22 | Small | 4,096


Prompt-tuned foundation models

You can customize the following foundation models by prompt tuning them in watsonx.ai:

flan-t5-xl-3b
granite-13b-instruct-v2

For more information, see Tuning Studio.
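
As a rough illustration of running a prompt-tuning experiment programmatically, the following sketch assumes the TuneExperiment interface of the ibm-watsonx-ai Python SDK; the parameter names, model identifier, and training data connection are assumptions, and the Tuning Studio documentation remains the authoritative reference.

```python
# Rough sketch of a prompt-tuning run; interface and parameter names are
# assumed from the ibm-watsonx-ai SDK. The project ID and training data asset
# are placeholders.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.experiment import TuneExperiment
from ibm_watsonx_ai.helpers import DataConnection

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_IBM_CLOUD_API_KEY",
)

experiment = TuneExperiment(credentials, project_id="YOUR_PROJECT_ID")

prompt_tuner = experiment.prompt_tuner(
    name="sentiment prompt tuning",            # display name for the tuning run
    task_id=experiment.Tasks.CLASSIFICATION,   # task type assumed from the SDK
    base_model="google/flan-t5-xl",            # identifier assumed for flan-t5-xl-3b
    num_epochs=10,
)

tuning_details = prompt_tuner.run(
    training_data_references=[DataConnection(data_asset_id="YOUR_DATA_ASSET_ID")],
    background_mode=False,                     # wait for the run to finish
)
print(prompt_tuner.summary())
```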

Parent topic: Developing generative AI solutions
