What's new for watsonx as a Service on IBM Cloud

Last updated: Feb 21, 2025

Check back each week to learn about new features and updates for IBM watsonx.ai and IBM watsonx.governance on IBM Cloud.

For information about IBM watsonx.governance on AWS, see What's new for IBM watsonx.governance on AWS.

Tip: Occasionally, you must take a specific action after an update. To see all required actions, search this page for “Action required”.

Week ending 21 February 2025

Deploy custom foundation models with extra small hardware specification

20 February 2025

To save resources, you can deploy custom foundation models with extra small hardware specification. Use the gpu_xs hardware specification for the hardware_request parameter when you deploy your custom foundation model with the REST API. For more information, see Creating a deployment for a custom foundation model.

The Granite Code 3 billion parameter foundation model is available for you to deploy on demand

20 February 2025

The granite-3b-code-instruct foundation model from IBM is available as a deploy on demand foundation model. For more information, see Supported foundation models.

To learn more about deploying a foundation model on demand, see Deploying foundation models on demand.

Week ending 14 February 2025

Inference deploy on demand and custom foundation models from the Toronto region

13 February 2025

You now have more ways to work with foundation models from projects that are hosted in the Toronto region.

Upload and deploy custom foundation models.
Choose from a set of popular foundation models to deploy on dedicated hardware for the exclusive use of your organization.

To learn more, read the following resources:

Integrate an existing hosted deployment of IBM OpenPages with watsonx.governance

13 February 2024

In addition to optionally integrating watsonx.governance with Governance console deployed with OpenPages as a Service, you can now integrate watsonx.governance with a hosted deployment of IBM OpenPages, either on a managed cloud or a traditional on-premised deployment. Integration with a hosted version of OpenPages requires the Essentials plan for watsonx.governance.

For plan details, see IBM watsonx.governance offering plan options.
For more information on synching use case data with watsonx.governance, see Managing AI uses cases
For integration instructions, see Integrating with watsonx.governance

Deploy models converted from scikit-learn and XGBoost to ONNX format

13 February 2024

You can now deploy machine learning and generative AI models that are converted from scikit-learn and XGBoost to ONNX format and use the endpoint for inferencing. For more information, see Deploying models coverted to ONNX format.

Work with new deploy on demand Granite Code models in watsonx.ai

13 February 2025

Use the following Granite Code foundation models from IBM for coding tasks such as writing, converting, and correcting programmatic code:

granite-8b-code-instruct
granite-20b-code-instruct
granite-34b-code-instruct

For more information about pricing, see Supported foundation models. To learn more about deploying a foundation model on demand, see Deploying foundation models on demand.

Updated SPSS Modeler tutorial videos

11 February 2025

Watch and learn about SPSS Modeler by viewing the updated videos in the SPSS Modeler tutorials.

Week ending 7 February 2025

Preview the latest IBM Granite foundation model in the Dallas region

7 February 2025

Try out a technical preview of the granite-3-2-8b-instruct-preview-rc foundation model now available in the Dallas region. The Granite 3.2 preview model adds new reasoning capabilities in a novel way. The reasoning function is configurable, which means you can enable reasoning only for task where explanatory information is useful in the output. For more information, see Supported foundation models.

Data forecasting with IBM Granite time series foundation models is now generally available

6 February 2025

Use the time series forecast method of the watsonx.ai API to pass historical data observations to an IBM Granite time series foundation model that can forecast future values with zero-shot inferencing.

For more information about how to use the forecast method of the watsonx.ai API, see Forecast future data values.
For information about pricing, see Billing details for generative AI assets.

IBM watsonx.ai is available in the Toronto region

6 February 2025

Watsonx.ai is now generally available in the Toronto data center and Toronto can be selected as the preferred region when signing-up. Use a subset of the provided foundation models for inferencing and embedding models for generating text embeddings and reranking passages.

For more information about the foundation models and product features that are available in the Toronto region, see Regional availability for services and features.

Inference deploy on demand and custom foundation models from the Sydney region

6 February 2025

You now have more ways to work with foundation models from projects that are hosted in the Sydney region.

Upload and deploy custom foundation models.
Choose from a set of popular foundation models to deploy on dedicated hardware for the exclusive use of your organization.

To learn more, read the following resources:

Mistral Large 2 foundation model is available for you to deploy on demand

6 February 2025

The mistral-large-instruct-2407 foundation model from Mistral AI is available as a deploy on demand foundation model. An additional hourly fee is applied when accessing this model. For more information about pricing, see Supported foundation models.

To learn more about deploying a foundation model on demand, see Deploying foundation models on demand.

Deploy and inference new DeepSeek-R1 distilled models on demand in watsonx.ai

3 February 2025

You can now deploy distilled variants of DeepSeek-R1 models on demand in watsonx.ai on IBM Cloud. The distilled DeepSeek-R1 model variants are based on Llama models that are fine tuned by using training data generated by the DeepSeek-R1 model.

For details, see Supported foundation models.

To learn more about deploying a foundation model on demand from the Resource hub or REST API, see Deploying foundation models on demand.

Default Inventory replaces Platform Asset Catalog in watsonx.governance

3 February 2025

A default inventory is now available to store watsonx.governance artifacts including AI use cases, third-party models, attachments, and reports. The default inventory replaces any previous dependency on Platform access catalog or IBM Knowledge Catalog for storing governance artifacts.

For more information, see Setting up the default inventory.

Evaluation Studio available on Sydney data center

3 February 2025

With Evaluation Studio, you can evaluate and compare your generative AI assets with quantitative metrics and customizable criteria that fit your use cases. Evaluate the performance of multiple assets simultaneously and view comparative analyses of results to identify the best solutions.

For more information, see Comparing AI assets with Evaluation Studio.

Week ending 31 January 2025

Build and deploy AI agents in the new Agent Lab (beta)

30 January 2025

You can now build and deploy AI agents to make your applications more flexible and dynamic by using the Agent Lab user interface. You can configure the agent to make decisions and perform tasks by using an agent framework, a foundation model, and external tools that you specify in the agent's settings.

For more information, see Agent Lab.

When you use the Agent Lab to build your agentic AI applications in watsonx.ai, your applications are deployed as AI services. You can choose to deploy your solution directly from the user interface or export your solution in an editable notebook in Python that deploys the AI service.

For more information, see Deploying AI services.

A powerful new Mistral Large 2 foundation model is available for you to deploy on demand

30 January 2025

Deploy the mistral-large-instruct-2411 foundation model from Mistral AI on dedicated hardware for the exclusive use of your organization. This latest foundation model improves on the Mistral-Large-Instruct-2407 foundation model by adding better handling of long prompt contexts, system prompt instruction-following, and function calling.

Unlike other deploy on demand foundation models, there is an additional hourly fee for accessing the hosted the mistral-large-instruct-2411 foundation model. For more information about pricing, see Supported foundation models.

To learn more about deploying a foundation model on demand from the Resource hub or REST API, see Deploying foundation models on demand.

Inference the Mistral Small 3 foundation model in the Frankfurt region

30 January 2025

The mistral-small-24b-instruct-2501 foundation model from Mistral AI is available and hosted on multitenant hardware ready for use. The Mistral Small 3 foundation model is a great choice for chat workflows due to these features:

Agentic capabilities with native function calling and JSON output generation.
State-of-the-art conversational and reasoning capabilities.
Maintains strong adherence and support for system prompts.
Supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.

For more information, see Supported foundation models.

Deploy and inference DeepSeek-R1 distilled models with watsonx.ai

29 January 2025

You can now use the distilled variants of DeepSeek-R1, a powerful open-sourced reasoning model, to securely deploy and inference DeepSeek-R1 models with watsonx.ai on IBM Cloud, enabling developers to accelerate the development of AI-powered solutions. DeepSeek-R1 model can be deployed as a custom foundation model with watsonx.ai.

For more information, see Deploying custom foundation models.

Lower price for inferencing the Llama 3.3 70b Instruct foundation model

29 January 2025

The price for inferencing the llama-3-3-70b-instruct foundation model from Meta decreased from $0.0018 to $0.00071 USD per 1,000 tokens. The price change applies to all regions where the foundation model is available.

For more information, see Supported foundation models.

Week ending 24 January 2025

General availabilty of AutoAI for RAG experiments

23 January 2025

AutoAI for RAG is now fully supported for automating the search for an optimized retrieval-augmented generation pattern for your applications. This update includes the following new features:

Explore AutoAI for RAG using sample data and a guided tour to learn the process.
If you create an experiment using a Milvus vector store, you can now deploy a RAG pattern as an AI service so you can access the endpoint from a deployment space for inferencing.

For details, see Automating a RAG pattern with AutoAI.

Running AutoAI RAG experiments incurs billing charges as follows:

Capacity Unit Hour (CUH) charges for the runtime of the experiments.
Resource Unit (RU) charged for all embedding and generative AI model inferencing calls made during the experiment run. One Resource Unit (RU) is equal to 1,000 tokens.

For plan information and billing details, see watsonx.ai Runtime plans.

Deploy AI services with CPDCTL

23 January 2024

You can now use the Cloud Pak for Data (CPDCTL) command-line interface to deploy your generative AI solutions with AI services programmatically. CPDCTL is a command-line tool for deploying and managing AI services on the IBM Cloud Pak for Data (CPD) platform. It provides a simple and streamlined way to deploy AI services, eliminating the need for manual configuration and reducing the risk of errors.

For more information, see Deploying AI services with Cloud Pak for Data Command-Line Interface (CPDCTL).

Deploy AI services with AutoAI for RAG

23 January 2024

When you use AutoAI to create a generative AI solution that uses a RAG pattern, you can now deploy your solution directly from the AutoAI experiment builder as an AI service. For more information, see Deploying AI services with tools.

Deploy models converted from CatBoost and LightGBM to ONNX format

23 January 2024

You can now deploy machine learning and generative AI models that are converted from CatBoost and LightGBM to ONNX format and use the endpoint for inferencing. These models can also be adapted to dynamic axes. For more information, see Deploying models coverted to ONNX format.

Deploy popular foundation models on demand in watsonx.ai

22 January 2025

You can now deploy the following foundation models on dedicated hardware in the Dallas region for the exclusive use of your organization:

llama-3-1-70b-instruct
granite-7b-lab

For more information about these models, such as pricing and supported context window lengths, see Deploy on demand foundation models.

To learn more about deploying a foundation model on demand from the Resource hub or REST API, see Deploying foundation models on demand.

Several Llama foundation models are now available in more regions

22 January 2025

The following foundation models from Meta are available in more regions:

llama-3-3-70b-instruct: Now available from Tokyo and London in addition to Dallas and Frankfurt.
llama-3-2-11b-vision-instruct: Now available from Frankfurt and London in addition to Dallas, Tokyo, and Sydney.

For more information about the regional availability of foundation models, see Regional availability on IBM Cloud.

The Llama 3.1 Instruct 70b and 8b foundation models are deprecated

22 January 2025

The following foundation models from Meta are deprecated. Revise any prompts that use these foundation models.

llama-3-1-70b-instruct
- Deprecation date: 22 January 2025
- Withdrawal date: 30 May 2025
- Alternative model: llama-3-3-70b-instruct or llama-3-2-90b-vision-instruct
llama-3-1-8b-instruct
- Deprecation date: 22 January 2025
- Withdrawal date: 30 May 2025
- Alternative model: llama-3-2-11b-vision-instruct

For details about deprecation and withdrawal, see Foundation model lifecycle. For more information about alternative models, see Supported foundation models.

If you want to continue working with these foundation models, you can deploy the models on demand for your exclusive use. For more information, see Deploy on demand foundation models.

Week ending 17 January 2025

Use new IBM Granite embedding models in watsonx.ai

16 January 2025

You can now use the following Granite embedding models provided by IBM in watsonx.ai:

granite-embedding-107m-multilingual
granite-embedding-278m-multilingual

Use the new embedding models to generate high-quality text embeddings for input in the form of a query, passage, or document in multiple languages. For details, see Supported encoder models and Vectorizing text.

A modification was made to the Granite Guardian models

16 January 2025

You can now inference the latest version of the Granite Guardian foundation models from IBM on watsonx.ai in the Dallas and Sydney data centers.

The latest 3.1 versions of the models now support a 128,000 token context length and trained with additional synthetic data to improve performance for risks related to hallucination and jailbreak. For details, see Supported foundation models.

The granite-20b-multilingual and codellama-34b-instruct-hf foundation models are deprecated

15 January 2025

The following foundation models are deprecated. Revise any prompts that use these foundation models.

granite-20b-multilingual
- Deprecation date: 15 January 2025
- Withdrawal date: 16 April 2025
- Alternative model: granite-3-8b-instruct
codellama-34b-instruct-hf
- Deprecation date: 15 January 2025
- Withdrawal date: 31 March 2025
- Alternative model: llama-3-3-70b-instruct

For details about deprecation and withdrawal, see Foundation model lifecycle. For more information about alternative models, see Supported foundation models.

New model architectures available to deploy custom foundation models

15 January 2024

You can now deploy custom foundation models with the following architectures in watsonx.ai:

exaone
gemma
gemma2
granite
mt5
nemotron
olmo
persimmon
phi
phi3
qwen
qwen2

For more information, see Planning to deploy a custom foundation model.

Beta concluding for AutoAI RAG experiments on January 23, 2025

13 January 2025

Following the conclusion of the beta phase, running AutoAI RAG experiments will incur billing charges as follows:

Capacity Unit Hour (CUH) charges for the runtime of the experiments.
Resource Unit (RU) charged for embedding the grounding document and for inferencing the generative AI models. One Resource UNit (RU) is equal to 1,000 tokens.

For plan information and billing details, see watsonx.ai Runtime plans.

Week ending 20 December 2024

Deploy models converted to ONNX format

20 December 2024

You can now deploy machine learning and generative AI models that are converted to ONNX format and use the endpoint for inferencing. These models can also be adapted to dynamic axes. For more information, see Deploying models coverted to ONNX format.

Deploy multi-source SPSS Modeler flows

20 December 2024

You can now create deployments for SPSS Modeler flows that use multiple input streams to provide data to the model. For more information, see Deploying multi-source SPSS Modeler flows.

Week ending 13 December 2024

Modifications to the Granite 3 Instruct foundation models are introduced

13 December 2024

Modifications were made to the following IBM foundation models:

granite-3-2b-instruct
granite-3-8b-instruct

With the latest modifications, the Granite 3.1 Instruct foundation models now provide better support for coding tasks and intrinsic functions for agents. The supported context window length for these foundation models increased from 4,096 tokens to 131,072 tokens. Although the model IDs for the Granite Instruct models remain the same, the model weights are updated.

For more information, see Supported foundation models.

No-code solution for searching for a RAG pattern with AutoAI (beta)

12 December 2024

You can now automate the search for the optimal RAG pattern for your use case from the AutoAI user interface. Load the document collection and test questions, choose a vector database, and run the experiment for a fast path approach to finding a RAG pattern. You can also review and modify configuration settings for the experiment. Compare the patterns generated by the experiment and save the best pattern as an auto generated notebook or notebooks saved to your project.

For details, see Automating a RAG pattern with AutoAI.

Deploy AI services with templates

12 December 2024

You can deploy your AI services by using predefined templates. AI service templates provide a standardized way to deploy AI services by offering a pre-defined structure and configuration for deploying AI models. These templates are pre-built, deployable units of code that encapsulate the programming logic of generative AI applications.

AI service templates automate tasks like creating deployments, generating metadata, and building extensions, enabling developers to focus on the core logic of their application. They provide a flexible way to deploy AI services, supporting multiple inputs and customization.

For more information, see Deploying AI services with templates.

The latest Llama foundation model is available for you to deploy on demand

12 December 2024

You can deploy the Meta Llama 3.3 70B Instruct multilingual foundation model on dedicated hardware for the exclusive use of your organization. The newest foundation model from Meta has capabilities that are similar to the larger llama-3-405b-instruct model, but is smaller in size and is skilled at coding, step-by-step reasoning, and tool-calling in particular. You can deploy the full model (llama-3-3-70b-instruct-hf) or a quantized version (llama-3-3-70b-instruct) which requires fewer resources to host.

To learn more about deploying a foundation model on demand from the Resource hub or REST API, see Deploying foundation models on demand.

Deploy foundation models on-demand with Python client library

12 December 2024

You can now deploy your foundation models on-demand by using the watsonx.ai Python client library. By using this approach, you can access the capabilities of these powerful foundation models without the need for extensive computational resources. Foundation models that you deploy on-demand are hosted in a dedicated deployment space where you can use these models for inferencing.

For more information, see Deploying foundation models on-demand.

Updated SPSS Modeler tutorials

11 December 2024

Get hands-on experience with SPSS Modeler by trying the 15 updated SPSS Modeler tutorials.

Comparing AI assets with Evaluation Studio

12 December 2024

For more information, see Comparing AI assets with Evaluation Studio.

Enhancements to Governance console

12 December 2024

Enhancements to the watsonx.governance Model Risk Governance solution

This release includes the following enhancements:

The new AI Model Onboarding Risk Identification questionnaire template is used during the model onboarding process to help identify the risks associated with a model. This questionnaire template is used in the Foundation Model Onboarding workflow.
The new AI Use Case Risk Identification questionnaire template is used to help identify the risks associated with AI use cases. This questionnaire template is used in the Use Case Review workflow. This new questionnaire is intended to replace the AI Risk Identification Questionnaire
The new AI Use Case and Model Risk Identification questionnaire template is used to help identify the risks that are associated with the combination of an AI use case and a model. This questionnaire template is used in the Use Case Development and Documentation workflow.
The AI Assessment Workflow is now disabled by default. It is replaced by the Questionnaire Assessment Workflow. You can now set questionnaire templates directly in the Use Case workflow.
The workflows, views, and dashboards were updated.

For more information, see Solution components in Governance console.

Bug fixes and security fixes

Bug fixes and security fixes were applied.

For more information, see New features in 9.0.0.5.

IBM watsonx.governance is available in the Sydney region

9 December 2024

IBM watsonx.governance is now generally available in the Sydney data center. You can select Sydney as your preferred region when signing-up.

For more information about the product features that are available in the Sydney region, see Regional availability for services and features.

Week ending 6 December 2024

Deploy foundation models on demand in the Dallas region

6 December 2024

Choose from a curated collection of foundation models that you can deploy on dedicated hardware for the exclusive use of your organization. A dedicated deployment means more responsive interactions when you inference foundation models. Deploy on-demand foundation models are billed by the hour. For more information, see Supported foundation models and Billing details for generative AI assets.

To learn more about deploying a foundation model on demand from the Resource hub or REST API, see Deploying foundation models on demand.

Inference the latest Llama foundation model from Meta in the Dallas and Frankfurt regions

6 December 2024

The Meta Llama 3.3 70B Instruct multilingual foundation model is available for interencing in the Dallas and Frankfurt regions. The llama-3-3-70b-instruct foundation model is skilled at coding, step-by-step reasoning, and tool-calling. With performance that rivals that of the 405b model, the Llama 3.3 foundation model update is a great choice for developers. See the announcement from IBM.

For more information, see Supported foundation models.

Review benchmarks to compare foundation models

5 December 2024

Review foundation model benchmarks to learn about the capabilities of the available foundation models before you try them out. Compare how various foundation models perform on the tasks that matter most for your use case. For more information, see Foundation model benchmarks.

Microsoft Excel files are deprecated for OPL models in Decision Optimization

5 December 2024

Microsoft Excel workbook (.xls and .xlsx) files are now deprecated for direct input and output in Decision Optimization OPL models. To connect to Excel files, use a data connector instead. The data connector transforms your Excel file into a .csv file. For more information, see Referenced data.

New sample notebooks for deploying models converted to ONNX format

3 December 2024

For more information, see watsonx.ai Runtime Python client samples and examples.

The llama-3-8b-instruct and llama-3-70b-instruct foundation models are deprecated

2 December 2024

The following foundation models are deprecated. Revise any prompts that use these foundation models.

llama-3-8b-instruct
- Deprecation date: 2 December 2024
- Withdrawal date: 3 February 2025
- Alternative model: llama-3-1-8b-instruct, llama-3-2-11b-vision-instruct
llama-3-70b-instruct
- Deprecation date: 2 December 2024
- Withdrawal date: 3 February 2025 (31 March 2025 in Sydney)
- Alternative model: llama-3-1-70b-instruct, llama-3-2-90b-vision-instruct

For details about deprecation and withdrawal, see Foundation model lifecycle. For more information about alternative models, see Supported foundation models.

Week ending 29 November 2024

Improved documentation on write options for Data Refinery

28 November 2024

The write options and table options for exporting data flows depends on your connection. These options are now explained so that you are better guided to select your target table options. For more information, see Target connection options for Data Refinery.

Week ending 22 November 2024

New watsonx Developer Hub to start coding fast

21 October 2024

Check out the new Developer Hub to find everything that you need for coding your generative AI solution:

Make your first API request to inference a foundation model in watsonx.ai.
Find the right foundation models and code libraries for your AI applications.
Understand watsonx.ai capabilities and copy code snippets in Curl, Node.js, or Python.
Learn how to build generative AI applications and solutions with detailed guides.
Join communities to find resources, answers, and engagement with other users.

Go to watsonx Developer Hub.

Component services of IBM watsonx.ai were renamed

21 November 2024

The following services were renamed:

Watson Machine Learning is now named watsonx.ai Runtime
Watson Studio is now named watsonx.ai Studio

Some videos, notebooks, and code samples might continue to refer to these services by their former names.

IBM watsonx.ai is available in the Sydney region

21 November 2024

Watsonx.ai is now generally available in the Sydney data center and Sydney can be selected as the preferred region when signing-up.

For more information about the foundation models and product features that are available in the Sydney region, see Regional availability for services and features.

Use IBM Granite time series foundation models and the watsonx.ai API to forecast future values (beta)

21 November 2024

Use the time series API to pass historical data observations to an IBM Granite time series foundation model that can forecast future values with zero-shot inferencing. The time series forecast method of the watsonx.ai API is available as a beta feature. For more information, see Forecast future data values.

Use watsonx.ai text embedding models from the Elasticsearch inference API

21 November 2024

The Elasticsearch version 8.16.0 release added support for creating an inference endpoint that uses a watsonx.ai foundation model for text embedding tasks.

For more information, see Vectorizing text by using the API.

Promote SPSS Modeler flows to deployment spaces

19 November 2024

You can now directly promote SPSS Modeler flows from projects to deployment spaces without having to export the project and then import it into the deployment space. For more information, see Promoting SPSS Modeler flows and models.

Week ending 15 November 2024

Use IBM watsonx.ai demo chat app without trial restrictions by linking accounts

15 November 2024

You can now use your IBM watsonx.ai demo account chat app without token usage or time limit restrictions by linking your demo account to your paid IBM Cloud watsonx.ai account. For details, see Linking the IBM watsonx.ai demo and watsonx.ai accounts.

The watsonx.ai Node.js package is available from LangChain

11 November 2024

The watsonx.ai Node.js package is available for use from the LangChain JavaScript community library. The integration supports watsonx.ai functions such as inferencing foundation models, generating text embeddings, and handling chat exchanges that include image-to-text and tool-calling capabilities. With the LangChain integration, you can call these watsonx.ai capabilities by using consistent interfaces that make it easier to swap between providers to compare offerings and find the best solution for your needs.

For more information, see Node.js SDK.

Task credentials are now required to deploy assets and run jobs from a deployment space

11 November 2024

To improve the security for running deployment jobs, you must enter your task credentials to deploy the following assets from a deployment space:

Prompt templates
AI services
Models
Python functions
Scripts

Additionally, you must enter your task credentials to create the following deployments from your deployment space:

Online
Batch

You must also use your task credentials to create and manage deployment jobs from your deployment space.

To learn how to set up your task credentials and generate an API key, see Adding task credentials.

Week ending 8 November 2024

Deploy generative AI applications with AI services

7 November 2024

You can now use AI services in watsonx.ai to deploy your applications. An AI service is a deployable unit of code that you can use to capture the logic of your generative AI use cases. While Python functions are the traditional way to deploy machine learning assets, AI services offer a more flexible option to deploy code for generative AI applications, such as streaming. When your AI services are successfully deployed, you can use the endpoint for inferencing from your application.

For more information, see Deploying AI services.

The granite-13b-chat-v2, llama2-13b-dpo-v7, and mt0-xxl-13b foundation models are deprecated

4 November 2024

The following foundation models are deprecated. Revise any prompts that use these foundation models.

granite-13b-chat-v2

Deprecation date: 4 November 2024
Withdrawal date: 3 February 2025
Alternative model: granite-3-8b-instruct

llama2-13b-dpo-v7

Deprecation date: 4 November 2024
Withdrawal date: 4 December 2024
Alternative model: llama-3-1-8b-instruct

mt0-xxl-13b

Deprecation date: 4 November 2024
Withdrawal date: 4 December 2024
Alternative models: llama-3-1-8b-instruct, llama-3-2-11b-vision-instruct

For details about deprecation and withdrawal, see Foundation model lifecycle. For more information about alternative models, see Supported foundation models.

Week ending 1 November 2024

New third-party all-minilm-l6-v2 embedding model is available in watsonx.ai

29 October 2024

The all-minilm-l6-v2 text embedding model from the open source natural language processing (NLP) and computer vision (CV) community is now available for use from the text embedding method of the watsonx.ai API. Use the model to convert text into text embedding vectors that are suitable for use in text matching and retrieval tasks. For model details, see the following topics:

Lower price for inferencing the Mistral Large foundation model

29 October 2024

The price for input that you submit to the Mistral Large foundation model decreased from $0.01 to $0.003 USD per 1,000 tokens. The price for the output that is generated by the foundation model did not change; the price for output tokens remains $0.01 USD/1,000 tokens. The price change applies to all regions where the mistral-large foundation model is available.

For more information, see Supported foundation models.

Deprecation of IBM Runtime 23.1

28 October 2024

IBM Runtime 23.1 is deprecated. Beginning November 21, 2024, you cannot create new notebooks or custom environments by using 23.1 runtimes. Also, you cannot create new deployments with software specifications that are based on the 23.1 runtime. To ensure a seamless experience and to leverage the latest features and improvements, switch to IBM Runtime 24.1.

For information about changing environments, see Changing notebook environments.
For details on deployment frameworks, see Managing frameworks and software specifications.

Simplify complex business documents with the text extraction API

28 October 2024

The text extraction method is now generally available in the watsonx.ai REST API. Leverage document understanding technology developed by IBM to simplify your complex business documents so that they can be processed by foundation models as part of a generative AI workflow. The text extraction API extracts text from document structures such as images, diagrams, and tables that foundation models often cannot interpret correctly. For more information, see Extracting text from documents.

The API is available in all regions to users of paid plans. For pricing details, see the Document text extraction rate table.

Week ending 25 October 2024

Compare tables in Decision Optimization experiments to see differences between scenarios

23 October 2024

You can now compare tables in a Decision Optimization experiment in either the Prepare data or Explore solution view. This comparison can be useful to see data value differences between scenarios displayed next to each other. Screenshot showing table comparison in Decision Optimization
For more information, see Compare scenario tables.

New Granite 3.0 models are available in watsonx.ai

21 October 2024

You can now inference the following generation 3.0 Granite foundation models provided by IBM from watsonx.ai:

Granite Instruct models in all regions: Use the new instruct-tuned, lightweight, and open-source language models for tasks like summarization, problem-solving, text translation, reasoning, coding, and function-calling tasks. Work with the following model variants:
- granite-3-2b-instruct
- granite-3-8b-instruct
Granite Guardian models in the Dallas region: Use the new Granite Guardian models, which are fine-tuned Granite Instruct models, designed to detect risks in prompts and responses. Work with the following model variants:
- granite-guardian-3-2b
- granite-guardian-3-8b

For details, see Supported foundation models.

Enhance search and retrieval tasks with the text rerank API

21 October 2024

The text rerank method is generally available in the watsonx.ai REST API. Use this new API method, together with reranker foundation models, such as the newly-supported ms-marco-minilm-l-12-v2 model, to reorder a set of document passages based on their similarity to a specified query. Reranking is a useful way to add precision to your answer-retrieval workflows. For more information, see Reranking document passages.

New Pixtral 12B model is available in the Frankfurt and London regions

21 October 2024

You can now use the Pixtral 12B foundation model from Mistral AI on watsonx.ai in the Frankfurt and London data centers.

Pixtral 12B is a natively multimodal model with image-to-text and text-to-text capabilities that was trained with interleaved image and text data. The foundation model supports variable image sizes and excels at instruction-following tasks. For details, see Supported foundation models.

Week ending 18 October 2024

Account resource scoping is enabled by default

17 October 2024

The Resource scope setting for your account is now set to ON by default. However, if you previously set the value for the Resource scope setting to either ON or OFF, the current setting is not changed.

When resource scoping is enabled, you can’t access projects that are not in your currently selected IBM Cloud account. If you belong to more than one IBM Cloud account, you might not see all your projects listed together. For example, you might not see all your projects on the All projects page. You must switch accounts to see the projects in the other accounts.

A Granite Code foundation model is available in the Frankfurt region

15 October 2024

The granite-20b-code-instruct foundation model from IBM is designed to respond to coding-related instructions. You can use the foundation model in projects that are hosted in the Frankfurt data center to help you with coding tasks and for building coding assistants. For more information about the model, see Supported foundation models.

Week ending 11 October 2024

New licensing benefit

10 October 2024

You can now Bring Your Own License (BYOL) to apply on-premises licensing benefits to IBM watsonx.ai and IBM watsonx.governance.

For more information, see Activating Bring Your Own License (BYOL) to SaaS.

Analyze Japanese text data in SPSS Modeler with Text Analytics

9 October 2024

You can now use the Text Analytics nodes in SPSS Modeler, such as the Text Link Analysis node and Text Mining node, to analyze text data written in Japanese.

Build conversational workflows with the watsonx.ai chat API

8 October 2024

Use the watsonx.ai chat API to add generative AI capabilities, including agent-driven calls to third-party tools and services, into your applications.

For more information, see the following topics:

New software specification for custom foundation models

7 October 2024

You can now use a new software specification watsonx-cfm-caikit-1.1 with your custom foundation model deployments. The specification is based on the vLLM library and is better suited for the latest decoder-only large language models. For more information on the vLLM library, see vLLM For information on using the specification with a custom foundation model, see Planning to deploy a custom foundation model.

The granite-7b-lab and llama3-llava-next-8b-hf foundation models are deprecated

7 October 2024

The granite-7b-lab foundation model is deprecated and will be withdrawn on 7 January 2025. Revise any prompts that use this foundation model.

Deprecation date: 7 October 2024
Withdrawal date: 7 January 2025
Alternative model: granite-3-8b-instruct

The llama3-llava-next-8b-hf multimodal foundation model is also deprecated and will be withdrawn on 7 November 2024. You can now use one of the newly-released Llama 3.2 vision models for image-to-text generation tasks.

Deprecation date: 7 October 2024
Withdrawal date: 7 November 2024
Alternative model: llama-3-2-11b-vision-instruct

For details about deprecation and withdrawal, see Foundation model lifecycle. For more information about alternative models, see Supported foundation models.

Week ending 4 October 2024

Updated environments and software specifications

3 October 2024

The Tensorflow and Keras libraries that are included in IBM Runtime 23.1 are now updated to their newer versions. This might have an impact on how code is executed in your notebooks. For details, see Library packages included in watsonx.ai Studio (formerly Watson Studio) runtimes.

Runtime 23.1 will be discontinued in favor of IBM Runtime 24.1 later this year. To avoid repeated disruption we recommend that you switch to IBM Runtime 24.1 now and use related software specifications for deployments.

For information about changing environments, see Changing notebook environments.
For details on deployment frameworks, see Managing frameworks and software specifications.

Availability of watsonx.governance plan in Frankfurt region and deprecation of OpenScale legacy plan

3 October 2024

The watsonx.governance legacy plan to provision Watson OpenScale in the Frankfurt region is deprecated. IBM Watson OpenScale will no longer be available for new subscription or to provision new instances. For OpenScale capabilities, subscribe to the watsonx.governance Essentials plan, which is now available in Frankfurt as well as Dallas.

To view plan details, see watsonx.governance plans.
To get started, see Setting up watsonx.governance.

Notes:

Existing legacy plan instances will continue to operate and will be supported until the End of Support date which remains to be determined.
Existing customers on IBM Watson OpenScale can continue to open support tickets using IBM Watson OpenScale.

Week ending 27 September 2024

Llama 3.2 foundation models, including multimodal 11B and 90B models, are available

25 September 2024

Today's release makes the following foundation models from Meta AI available from the Dallas region:

Llama 3.2 instruct models: Versatile large language models that support large inputs (128,000 token context window length) and are lightweight and efficient enough, at 1B and 3B parameters in size, to fit onto a mobile device. You can use these models to build highly personalized, on-device agents.
Llama 3.2 vision models: Fine tuned models that are built for image-in, text-out use cases such as document-level understanding, interpretation of charts and graphs, and captioning of images.
Llama Guard vision model: Powerful guardrail model designed for filtering harmful content.

For more information, see Supported foundation models.

Enhancements to Governance console

25 September 2024

This release includes enhancements and bug fixes.

Custom tabs on the dashboard

The dashboard can now contain up to three custom tabs.

Stacked bar charts

You can now configure a stacked bar chart on the dashboard and in the View Designer.

Using expressions to set field values based on a questionnaire respondent's answers

You can now enter an expression for the value of a field. For example, you can enter [$TODAY$] for the current date,[$END_USER$] for the name of the signed on user, or [$System Fields:Description$] to set the field to the value of the Description field of the object.

Enhancements to the watsonx.governance Model Risk Governance solution

This releases includes the following enhancements:

The new Model Group object type provides a way to group similar models together. For example, versions of a model that use a similar approach to solve a business problem might in a Model Group.
The new Use Case Risk Scoring calculation aggregates metrics by breach status into risk scores to give an overall view into how the underlying models of a use case are performing.
The new Discovered AI library business entity provides a default place to store any AI deployments that are not following sanctioned governance practices within an organization (also known as “shadow AI”).
The workflows, views, and dashboards were updated.

For more information, see Solution components in Governance console.

Bug fixes and security fixes

Bug fixes and security fixes were applied.

For more information, see New features in 9.0.0.4.

Automate RAG patterns with AutoAI SDK (beta)

23 September 2024

Use the AutoAI Python SDK to automate and accelerate the design and deployment of an optimized, Retrieval-augmented generation (RAG) pattern based on your data and use-case. RAG comes with many configuration parameters, including which large language model to choose, how to chunk the grounding documents, and how many documents to retrieve. AutoAI automates the full exploration and evaluation of a constrained set of configuration options and produces a set of pattern pipelines ranked by performance against the optimization metric.

Note: While this feature is in beta, there is no charge for running the experiment, and no tokens are consumed. However, calls to RAG patterns and their derivatives done after the experiment completes consume resources and incur billing charges at the standard rates.

See Automating a RAG pattern with the AutoAI SDK(Beta) for details about the feature and usage notes for coding an AutoAI RAG experiment.

Removal of Spark 3.3 runtime

23 September 2024

Support for Spark 3.3 runtime in IBM Analytics Engine will be removed by October 29, 2024 and the default version will be changed to Spark 3.4 runtime. To ensure a seamless experience and to leverage the latest features and improvements, switch to Spark 3.4.

Beginning October 29, 2024, you cannot create or run notebooks or custom environments by using Spark 3.3 runtimes. Also, you cannot create or run deployments with software specifications that are based on the Spark 3.3 runtime.

To upgrade your instance to Spark 3.4, see Replace Instance Default Runtime.
For details on available notebook environments, see Changing the environment of a notebook.
For details on deployment frameworks, see Managing frameworks and software specifications.

Week ending 20 September 2024

Inference a multimodal foundation model from the Prompt Lab

19 September 2024

You can now add an image in Prompt Lab and chat about the image by prompting a multimodal foundation model in chat mode. In addition to grounding documents, you can now upload images and ask a foundation model that supports image-to-text tasks about the visual content of the image. For more information, see Chatting with documents and images.

New llama3-llava-next-8b-hf model is available in the Dallas region

19 September 2024

You can now use the new llama3-llava-next-8b-hf multimodal foundation model on IBM watsonx.ai to help with image-to-text tasks.

Large Language and Vision Assistant (LLaVa) combines a pretrained large language model with a pretrained vision encoder for multimodal chatbot use cases. LLaVA NeXT Llama3 is trained on more diverse, high quality image and text data. For details, see Supported foundation models.

Use the watsonx.ai Node.js SDK to code generative AI applications

18 September 2024

Inference and tune foundation models in IBM watsonx as a Service programmatically by using the watsonx.ai Node.js package. For more information, see Node.js SDK.

Understand IP indemnification policies for foundation models

18 September 2024

You can now better understand the IBM intellectual property indemnification policy and see which foundation models have IP indemnity coverage in watsonx.ai. For more information, see Model types and IP indemnification.

Week ending 13 September 2024

Create batch jobs for SPSS Modeler flows in deployment spaces

10 September 2024

You can now create batch jobs for SPSS Modeler flows in deployment spaces. Flows give you the flexibility to decide which terminal nodes to run each time that you create a batch job from a flow. When you schedule batch jobs for flows, the batch job uses the data sources and output targets that you specified in your flow. The mapping for these data sources and outputs is automatic if the data sources and targets are also in your deployment space. For more information about creating batch jobs from flows, see Creating deployment jobs for SPSS Modeler flows.

For more information about flows and models in deployment spaces, see Deploying SPSS Modeler flows and models.

Week ending 6 September 2024

Bring your own foundation model to inference from watsonx.ai in the Dallas region

3 September 2024

In addition to working with foundation models that are curated by IBM, you can now upload and deploy your own foundation models. After the models are deployed and registered with watsonx.ai, create prompts that inference the custom models programmatically or from the Prompt Lab. This feature is available in the Dallas region only.

To learn more about uploading custom foundation models, see Deploying custom foundation models. For plan information and billing details for custom foundation models, see watsonx.ai Runtime plans.

Simplify complex business documents with the document text extraction API

3 September 2024

Apply the document understanding technology developed by IBM to simplify your complex business documents so that they can be processed by foundation models as part of a generative AI workflow. The document text extraction API extracts text from document structures such as images, diagrams, and tables that foundation models often cannot interpret correctly. The text extraction method of the watsonx.ai REST API is a beta feature.

For more information, see Extracting text from documents.

Granite Code foundation model modification and updates are available

3 September 2024

The granite-20b-code-instruct foundation model was modified to version 1.1.0. The latest modification is trained on a mixture of high-quality data from code and natural language domains to improve the reasoning and instruction-following capabilities of the model.

The following foundation models were updated to increase the size of the supported context window length (input + output) for prompts from 8192 to 128,000:

granite-3b-code-instruct
granite-8b-code-instruct

For more information, see Supported foundation models.

Week ending 30 August 2024

The llama-2-13b-chat and llama-2-70b-chat models are deprecated

26 August 2024

The llama-2-13b-chat and llama-2-70b-chat foundation models are deprecated and will be withdrawn on 25 September 2024. Revise any prompts that use these foundation models.

llama-2-13b-chat

Deprecation date: 26 August 2024
Withdrawal date: 25 September 2024
Alternative model: llama-3.1-8b-instruct

llama-2-70b-chat

Deprecation date: 26 August 2024
Withdrawal date: 25 September 2024
Alternative model: llama-3.1-70b-instruct

Inference requests that are submitted to the llama-2-13b-chat and llama-2-70b-chat models by using the API continue to generate output, but include a warning message about the upcoming model withdrawal. Starting on 25 September 2024, API requests for inferencing the models will not generate output.

For details about deprecation and withdrawal, see Foundation model lifecycle.

Week ending 23 August 2024

Add user groups as collaborators in projects and spaces

22 August 2024

You can now add user groups in projects and spaces if your IBM Cloud account contains IAM access groups. Your IBM Cloud account administrator can create access groups, which are then available as user groups in projects. For more information, see Working with IAM access groups.

Support ending of anomaly prediction feature for AutoAI time-series experiments

19 August 2024

The feature to predict anomalies (outliers) in AutoAI time-series model predictions, currently in beta, is deprecated and will be removed on Sep 23, 2024.. Standard AutoAI time-series experiments are still fully supported. For details, see Building a time series experiment.

Week ending 16 August 2024

New Slate embedding models from IBM are available in all regions

15 August 2024

IBM Slate embedding models provide enterprises with the ability to generate embeddings for various inputs such as queries, passages, or documents. The new slate-125m-english-rtrvr-v2 and slate-30m-english-rtrvr-v2 models show significant improvements over their v1 counterparts. If you use the slate-125m-english-rtrvr and slate-30m-english-rtrvr models today, switch to the new v2 Slate models to take advantage of the model improvements.

For more information, see Supported encoder foundation models.

Configure AI guardrails for user input and foundation model output separately in Prompt Lab

15 August 2024

Adjust the sensitivity of the AI guardrails that find and remove harmful content when you experiment with foundation model prompts in Prompt Lab. You can set different filter sensitivity levels for user input and model output text, and can save effective AI guardrails settings in prompt templates.

For more information, see Removing harmful content.

Week ending 9 August 2024

Select test data from projects for prompt template evaluations

8 August 2024

When you evaluate prompt templates in projects, you can now choose project assets to select test data for evaluations. For more information, see Evaluating prompt templates in projects.

New llama-3-1-70b-instruct model is now available on IBM watsonx.ai

7 August 2024

You can now use the latest Llama 3.1 foundation models from Meta in the 70 billion parameter size on IBM watsonx.ai.

The Llama 3.1 series of foundation models are high-performant large language models with top-tier reasoning capabilities. The models can be used for complex multilingual reasoning tasks, including text understanding, transformation, and code generation. They support English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. For details, see Supported foundation models.

Updated Q&A with RAG accelerator

6 August 2024

The Q&A with RAG accelerator 1.2 sample project includes the following improvements:

Get help with the next phase of your retrieval-augmented generation (RAG) implementation: collecting user feedback and analyzing answer quality. Includes analytics with unsupervised topic detection to show popular topics, user satisfaction with generated answers by topic, and retrieval search scores by topic.
New prompt templates that are optimized for the IBM granite-7b-lab and Meta Llama 3.1 foundation models.
Streamlined code that uses RAG utilities from the watsonx.ai Python library and targeted vector search filters to search by product, area, and more.

See Q&A with RAG accelerator.

Note: If you cannot create the sample project, try replacing the description field text.

Week ending 2 August 2024

New llama-3-1-8b-instruct model is now available on IBM watsonx.ai

1 August 2024

You can now use the latest Llama 3.1 foundation models from Meta in the 8 billion parameter size on IBM watsonx.ai.

Associate workspaces with AI use cases

1 August 2024

The flow for creating an AI use case is changed to more closely align with the AI lifecycle. After you define the essentials for an AI use case, associate workspaces to organize assets so they align with the phases of an AI solution. For example, associate a project or space for assets in the Development or Validation phases, and associate a space for assets in the Operation phase.

For details, see Associating workspaces with an AI use case.

Week ending 26 July 2024

Announcing support for Python 3.11 and R4.3 frameworks and software specifications on runtime 24.1

25 July 2024

You can now use IBM Runtime 24.1, which includes the latest data science frameworks based on Python 3.11 and R 4.3, to run Jupyter notebooks and R scripts, and train models. Starting on July 29, you can also run deployments. Update your assets and deployments to use IBM Runtime 24.1 frameworks and software specifications.

For information on the IBM Runtime 24.1 release and the included environments for Python 3.10 and R 4.2, see Notebook environments.
For details on deployment frameworks, see Managing frameworks and software specifications.

Enhanced version of Jupyter Notebook editor is now available

25 July 2024

If you're running your notebook in environments that are based on Runtime 24.1, you can use these enhancements to work with your code:

Automatically debug your code
Automatically generate a table of contents for your notebook
Toggle line numbers next to your code
Collapse cell contents and use side-by-side view for code and output, for enhanced productivity

For more information, see Jupyter notebook editor.

Natural Language Processor transformer embedding models supported with Runtime 24.1

25 July 2024

In the new Runtime 24.1 environment, you can now use natural language processing (NLP) transformer embedding models to create text embeddings that capture the meaning of a sentence or passage to help with retrieval-augmented generation tasks. For more information, see Embeddings.

New specialized NLP models are available in Runtime 24.1

25 July 2024

The following new, specialized NLP models are now included in the Runtime 24.1 environment:

A model that is able to detect and identify hateful, abusive, or profane content (HAP) in textual content. For more information, see HAP detection.
Three pre-trained models that are able to address topics related to finance, cybersecurity, and biomedicine. For more information, see Classifying text with a custom classification model.

Extract detailed insights from large collections of texts by using Key Point Summarization

25 July 2024

You can now use Key Point Summarization in notebooks to extract detailed and actionable insights from large collections of texts that represent people’s opinions (such as product reviews, survey answers, or comments on social media). The result is delivered in an organized, hierarchical way that is easy to process. For more information, see Key Point Summarization

RStudio version update

25 July 2024

To provide a consistent user experience across private and public clouds, the RStudio IDE for the IBM watsonx will be updated to RStudio Server 2024.04.1 and R 4.3.1 on July 29, 2024. The new version of RStudio provides a number of enhancements and security fixes. See the RStudio Server 2024.04.1 release notes for more information. While no major compatibility issues are anticipate, users should be aware of the version changes for some packages described in the following table below.

When launching the RStudio IDE from a project after the upgrade, reset the RStudio workspace to ensure that the library path for R 4.3.1 packages is picked up by the RStudio Server.

A new version of the Mistral Large model is now available on IBM watsonx.ai in the Dallas, Frankfurt and London regions

24 July 2024

You can now use the Mistral Large 2 foundation model from Mistral AI on IBM watsonx.ai in the Dallas, Frankfurt and London data centers.

The Mistral Large 2 model supports 11 languages and is proficient in text understanding, code generation, and advanced reasoning. For details, see Supported foundation models.

New llama-3-405b-instruct model is available in the Dallas region

23 July 2024

You can now use the llama-3-405b-instruct foundation model from Meta on IBM watsonx.ai in the Dallas data center.

The llama-3-405B-instruct (v3.1) model provides enterprises with a high-performant large language model with top-tier reasoning capabilities, and is the largest open-sourced model ever released to date. This foundation model can be used for complex multilingual reasoning tasks, including text understanding, transformation, and code generation. For details, see Supported foundation models.

The merlinite-7b model is deprecated

22 July 2024

The merlinite-7b foundation model is deprecated and will be withdrawn on 22 August 2024. Revise any prompts that use this foundation model.

Deprecation date: 22 July 2024
Withdrawal date: 22 August 2024
Alternative model: mixtral-8x7b-instruct-v01

Inference requests that are submitted to the merlinite-7b model by using the API continue to generate output, but include a warning message about the upcoming model withdrawal. Starting on 22 August 2024, API requests for inferencing the model will not generate output.

For more information about deprecation and withdrawal, see Foundation model lifecycle.

Week ending 12 July 2024

New Mistral Large model is available in the Frankfurt and Dallas regions

9 July 2024

You can now use the Mistral Large foundation model from Mistral AI on IBM watsonx.ai in the Frankfurt and Dallas data centers.

Mistral Large provides enterprises with a high-performant large language model with top-tier reasoning capabilities. This foundation model can be used for complex multilingual reasoning tasks, including text understanding, transformation, and code generation. For details, see Supported foundation models.

Week ending 5 July 2024

Connectors grouped by data source type

05 July 2024

When you create a connection, the connectors are now grouped by data source type so that the connectors are easier to find and select. For example, the MongoDB data source type includes the IBM Cloud Databases for MongoDB and the MongoDB connectors.

In addition, a new Recents category shows the six latest connectors that you used to create a connection.

For instructions, see Adding connections to data sources in a project.

Add contextual information to foundation model prompts in Prompt Lab

4 July 2024

Help a foundation model generate factual and up-to-date answers in retrieval-augmented generation (RAG) use cases by adding relevant contextual information to your prompt as grounding data. You can quickly upload relevant documents or connect to a third-party vector store with relevant data. When a new question is submitted, the question is used to query the grounding data for relevant facts. The top search results plus the original question are submitted as model input to help the foundation model incorporate relevant facts in its output.

For more information, see Grounding foundation model prompts in contextual information.

Changes to Cloud Object Storage Lite plans

1 July 2024

Starting on 1 July 2024, the Cloud Object Storage Lite plan that is automatically provisioned when you sign up for a 30 day trial of watsonx.ai and watsonx.governance expires after the trial ends. You can upgrade your Cloud Object Storage Lite instance to the Standard plan with the Free Tier option at any time during the 30 day trial.

Existing Cloud Object Storage service instances with Lite plans that you provisioned prior to 1 July 2024 will be retained until 15 December 2024. You must upgrade your Cloud Object Storage service to a Standard plan before 15 December 2024.

See Cloud Object Storage service plans.

Week ending 21 June 2024

Create detached deployments for governing prompts for external large language models (LLMs)

21 Jun 2024

A detached prompt template is a new asset for evaluating a prompt template for an LLM that is hosted by a third-party provider, such as Google Vertex AI, Azure OpenAI, or AWS Bedrock. The inferencing that generates the output for the prompt template is done on the remote model, but you can evaluate the prompt template output by using watsonx.governance metrics. You can also track the detached deployment and detached prompt template in an AI use case as part of your governance solution.

For more information, see:

Task credentials will be required for deployment job requests

19 Jun 2024

To improve security for running deployment jobs, the user requesting the job will be required to provide task credentials in the form of an API key. The requirement will be enforced starting August 15, 2024. See Adding task credentials for details on generating the API key.

Screenshot showing how to create task credentials from Profile and settings

Assess use cases for EU AI Act applicability

19 Jun 2024

By using the new EU AI Act applicability assessment, you can complete a simple questionnaire to assess your AI use cases and determine whether they are within the scope of the EU AI Act. The assessment can also help you to identify the risk category that your use cases align to: prohibited, high, limited, or minimal. For more information, see Applicability Assessment in Solution components in Governance console.

Week ending 7 June 2024

Manage risk and compliance activities with the Governance consolee (IBM OpenPages)

7 June 2024

Watsonx.governance now supports optional integration with the Governance consolee. If you have installed the Model Risk Governance module of IBM OpenPages, you can configure AI use cases to sync governance facts with the Governance console. From the Governance console, you can create use cases, view governance activities, manage tasks, and implement workflows as part of your governance and compliance processes. For more information, see:

Week ending 31 May 2024

IBM Watson Pipelines is now IBM Orchestration Pipelines

30 May 2024

The new service name reflects the capabilities for orchestrating parts of the AI lifecycle into repeatable flows.

Tag projects for easy retrieval

31 May 2024

You can now assign tags to projects to make them easier to group or retrieve. Assign tags when you create a new project or from the list of all projects. Filter the list of projects by tag to retrieve a related set of projects. For more information, see Creating a project.

Connect to a new data source: Milvus

31 May 2024

Use the Milvus connection to store and confirm the accuracy of your credentials and connection details to access a Milvus vector store. For information, see Milvus connection.

Week ending 24 May 2024

New tutorial and video

23 May 2024

Try the new tutorial to see how to evaluate a model deployment by using the functionality in Watson OpenScale in a deployment space.

Tutorial	Description	Expertise for tutorial
Evaluate a deployment in spaces	Deploy a model, configure monitors for the deployed model, and evaluate the model in a deployment space.	Configure the monitors and evaluate a model in a deployment space.

The allam-1-13b-instruct foundation model is available in the Frankfurt region

21 May 2024

The Arabic foundation model allam-1-13b-instruct from Saudi Authority for Data and Artificial Intelligence and provided by IBM is available from watsonx.ai in the Frankfurt data center. You can use the allam-1-13b-instruct foundation model for general-purpose tasks, including Q&A, summarization, classification, generation, extraction, and translation in Arabic. For more information, see Supported foundation models.

Deploy traditional and generative AI assets with the watsonx.ai Python client library

21 May 2024

The Watson Machine Learning Python client library is now part of an expanded library, the watsonx.ai Python client library. Use the watsonx.ai Python library to work with traditional machine learning and generative AI assets. The Watson Machine Learning library will persist but will not be updated with new features. For more information, see Python library.

Week ending 17 May 2024

Third-party text embedding models are available in watsonx.ai

16 May 2024

The following third-party text embedding models are now available in addition to the IBM Slate models for enhanced text matching and retrieval:

all-minilm-l12-v2
multilingual-e5-large

Submit sentences or passages to one of the supported embedding models by using the watsonx.ai Python library or REST API to convert input text into vectors to more accurately compare and retrieve similar text.

For more information about these models, see Supported encoder foundation models.

For more information about converting text, see Text embedding generation.

Week ending 10 May 2024

New Granite Code foundation models are available in the Dallas region

9 May 2024

You can now inference the following Granite Code foundation models provided by IBM from watsonx.ai:

granite-3b-code-instruct
granite-8b-code-instruct
granite-20b-code-instruct
granite-34b-code-instruct

Use the new Granite Code foundation models for programmatic coding tasks. The foundation models are fine-tuned on a combination of instruction data to enhance instruction-following capabilities including logical reasoning and problem solving.

For more information, see Supported foundation models.

InstructLab foundation models are available in watsonx.ai

7 May 2024

InstructLab is an open source initiative by Red Hat and IBM that provides a platform for augmenting the capabilities of a foundation model. The following foundation models support knowledge and skills that are contributed from InstructLab:

granite-7b-lab
granite-13-chat-v2
granite-20b-multilingual
merlinite-7b

You can explore the open source community contributions from the foundation model's taxonomy page.

For more information, see InstructLab-compatible foundation models.

Week ending 3 May 2024

Organize project assets into folders

2 May 2024

You can now create folders in your projects to organize assets. An administrator of the project must enable folders, and administrators and editors can create and manage them. Folders are in beta and are not yet supported for use in production environments. For more information, see Organizing assets with folders (beta).

The Assets tab with folders

Week ending 26 April 2024

IBM watsonx.ai is available in the London region

25 Apr 2023

Watsonx.ai is now generally available in the London data center and London can be selected as the preferred region when signing-up.

The foundation models that are fully supported in Dallas are also available for inferencing in the London data center from the Prompt Lab or by using the API. The exceptions are mt0-xxl-13b and the llama-2-70b-chat foundation model, which is superseded by the llama-3-70b-instruct foundation model that is now available.
Prompt-tune the three tunable foundation models from the Tuning Studio or by using the API.
The two IBM embedding models and the embeddings API are supported.

For more information, see Regional availability for services and features.

Start a chat in Prompt Lab directly from the home page

25 Apr 2023

Now you can start a conversation with a foundation model from the IBM watsonx.ai home page. Enter a question to send to a foundation model in chat mode or click Open Prompt Lab to choose a foundation model and model parameters before you submit model input.

Week ending 19 April 2024

New Meta Llama 3 foundation models are now available

18 Apr 2024

The following Llama 3 foundation models provided by Meta are available for inferencing from watsonx.ai:

llama-3-8b-instruct
llama-3-70b-instruct

The new Llama 3 foundation models are instruction fine-tuned language models that can support various use cases.

This latest release of Llama is trained with more tokens and applies new post-training procedures. The result is foundation models with better language comprehension, reasoning, code generation, and instruction-following capabilities.

For more information, see Supported foundation models.

Introducing IBM embedding support for enhanced text matching and retrieval

18 Apr 2024

You can now use the IBM embeddings API and IBM embedding models for transforming input text into vectors to more accurately compare and retrieve similar text.

The following IBM Slate embedding models are available:

slate.125m.english.rtrvr
slate.30m.english.rtrvr

For more information, see Text embedding generation.

For pricing details, see watsonx.ai Runtime plans.

Evaluate machine learning deployments in spaces

18 Apr 2024

Configure watsonx.governance evaluations in your deployment spaces to gain insights about your machine learning model performance. For example, evaluate a deployment for bias or monitor a deployment for drift. When you configure evaluations, you can analyze evaluation results and model transaction records directly in your spaces.

For more information, see Evaluating deployments in spaces.

A Korean-language foundation model is available in the Tokyo region

18 Apr 2024

The llama2-13b-dpo-v7 foundation model provided by Minds & Company and based on the Llama 2 foundation model from Meta is available in the Tokyo region.

The llama2-13b-dpo-v7 foundation model specializes in conversational tasks in Korean and English. You can also use the llama2-13b-dpo-v7 foundation model for general purpose tasks in the Korean language.

For more information, see Supported foundation models.

A mixtral-8x7b-instruct-v01 foundation model is available for inferencing

18 Apr 2024

The mixtral-8x7b-instruct-v01 foundation model from Mistral AI is available for inferencing from watsonx.ai. The mixtral-8x7b-instruct-v01 foundation model is a pretrained generative model that uses a sparse mixture-of-experts network to generate text more efficiently.

You can use the mixtral-8x7b-instruct-v01 model for general-purpose tasks, including classification, summarization, code generation, language translation, and more. For more information, see Supported foundation models.

The mixtral-8x7b-instruct-v01-q foundation model is deprecated and will be withdrawn on 20 June 2024. Revise any prompts that use this foundation model.

Deprecation date: 19 April 2024
Withdrawal date: 20 June 2024
Alternative model: mixtral-8x7b-instruct-v01

Inference requests that are submitted to the mixtral-8x7b-instruct-v01-q model by using the API continue to generate output, but include a warning message about the upcoming model withdrawal. Starting on 20 June 2024, API requests for inferencing the models will not generate output.

For more information about deprecation and withdrawal, see Foundation model lifecycle.

A modification to the granite-20b-multilingual foundation model is introduced

18 Apr 2024

The latest version of the granite-20b-multilingual is 1.1.0. The modification includes improvements that were gained by applying a novel AI alignment technique to the version 1.0 model. AI alignment involves using fine-tuning and reinforcement learning techniques to guide the model to return outputs that are as helpful, truthful, and transparent as possible.

For more information about this foundation model, see Supported foundation models.

Week ending 12 April 2024

Prompt-tune the granite-13b-instruct-v2 foundation model

11 Apr 2024

The Tuning Studio now supports tuning the granite-13b-instruct-v2 foundation model in addition to the flan-t5-xl-3b and llama-2-13b-chat foundation models. For more information, see Tuning a foundation model.

The experiment configuration settings for tuning the granite-13b-instruct-v2 foundation model change to apply the best default values depending on your task. The tuning evaluation guidelines help you to analyze the experiment results and adjust experiment configuration settings based on your findings. For more information, see Evaluating the results of a tuning experiment.

An Arabic-language foundation model is available in the Frankfurt region

11 Apr 2024

The jais-13b-chat foundation model provided by Inception, Mohamed bin Zayed University of Artificial Intelligence, and Cerebras Systems is available in the Frankfurt region.

The jais-13b-chat foundation model specializes in conversational tasks in Arabic and English. You can also use the jais-13b-chat foundation model for general purpose tasks in the Arabic language, including language translation between Arabic and English.

For more information, see Supported foundation models.

View the full text of a prompt in Prompt Lab

11 Apr 2024

Now you can review the full prompt text that will be submitted to the foundation model, which is useful when your prompt includes prompt variables or when you're working in structured mode or chat mode.

For more information, see Prompt Lab.

The deprecated Granite version 1 models are withdrawn

11 Apr 2024

The following foundation models are now withdrawn:

granite-13b-chat-v1
granite-13b-instruct-v1

Revise any prompts that use these foundation models to use the IBM Granite v2 foundation models. For more information about foundation model deprecation and withdrawal, see Foundation model lifecycle.

Week ending 5 April 2024

Use pivot tables to display data aggregated in Decision Optimization experiments

5 Apr 2024

You can now use pivot tables to display both input and output data aggregated in the Visualization view in Decision Optimization experiments. For more information, see Visualization widgets in Decision Optimization experiments.

New watsonx.ai tutorial and video

04 Apr 2024

Try the new tutorial to see how to use watsonx.ai in an end-to-end use case from data preparation through prompt engineering.

Tutorial	Description	Expertise for tutorial
Try the watsonx.ai end-to-end use case	Follow a use case from data preparation through prompt engineering.	Use various tools, such as notebooks and Prompt Lab.

Week ending 15 March 2024

The watsonx.ai API is available

14 Mar 2024

The watsonx.ai API is generally available. Use the watsonx.ai API to work with foundation models programmatically. For more information, see the API reference.

The API version is 2024-03-14.

You can continue to use the Python library that is available for working with foundation models from a notebook. For more information, see Python library.

New foundation models are available in Dallas, Frankfurt, and Tokyo

14 Mar 2024

The following foundation models are now available for inferencing from watsonx.ai:

granite-20b-multilingual: A foundation model from the IBM Granite family that you can use for various generative tasks in English, German, Spanish, French, and Portuguese.
codellama-34b-instruct-hf: A programmatic code generation model from Code Llama that is based on Llama 2 from Meta. You can use codellama-34b-instruct-hf to create prompts for generating code based on natural language inputs, and for completing and debugging code.

For more information, see Supported foundation models.

Week ending 8 March 2024

The Tuning Studio is available in Frankfurt

7 Mar 2024

The Tuning Studio is now available to users of paid plans in the Frankfurt region. Tuning Studio helps you to guide a foundation model to return useful output. You can tune both the flan-t5-xl-3b and llama-2-70b-chat foundation models when you use the Tuning Studio in Frankfurt.

For more information, see Tuning Studio.

Prompt-tune the llama-2-13b-chat foundation model in the Tokyo region

7 Mar 2024

The Tuning Studio now supports tuning the llama-2-13b-chat foundation model in the Tokyo region. First, engineer prompts for the larger llama-2-70b-chat model in the Prompt Lab to find effective prompt inputs for your use case. Then tune the smaller version of the Llama 2 model to generate comparable, if not better outputs with zero-shot prompts.

For more information, see Tuning Studio.

Lower price for Mixtral8x7b model

5 Mar 2024

The foundation model mixtral-8x7b-instruct-v01-q is reclassified from Class 2: $0.0018/Resource Unit to Class 1: $0.0006/Resource Unit, making it more cost effective to run inferencing tasks against this model. The reclassification applies to all regions where mixtral-8x7b-instruct-v01-q is available.

For more information, see Supported foundation models.

For pricing details, see watsonx.ai Runtime plans.

AI risk atlas is updated and enhanced

5 Mar 2024

You can now find the following new and enhanced content in the AI risk atlas:

A new category of non-technical risks that spans governance, legal compliance, and societal impact risks
New examples for risks
Clearer definitions of risks

See AI risk atlas.

New use cases for watsonx

5 Mar 2024

The watsonx uses cases are available to help you see how you can use our products, services, and tools:

watsonx.ai use case: This use case covers how you can transform your business processes with AI-driven solutions by integrating machine learning and generative AI into your operational framework.
watsonx.governance use case: This use case covers how you can erive responsible, transparent, and explainable AI workflows with an integrated system for tracking, monitoring, and retraining AI models.

See watsonx use cases.

Week ending 1 March 2024

Chat mode is available in Prompt Lab

29 Feb 2024

Chat mode in Prompt Lab is a simple chat interface that makes it easier to experiment with foundation models. Chat mode augments the already available structured and freeform modes that are useful when building few- or many-shot prompts for tasks such as extraction, summarization, and classification. Use Chat mode to simulate question-answering or conversational interactions for chatbot and virtual assistant use cases.

For more information, see Prompt Lab.

A Japanese-language Granite model is available in the Tokyo region

29 Feb 2024

The granite-8b-japanese foundation model provided by IBM is available from watsonx.ai in the Tokyo region. The granite-8b-japanese foundation model is based on the IBM Granite Instruct model and is trained to understand and generate Japanese text.

You can use the granite-8b-japanese foundation model for general purpose tasks in the Japanese language, such as classification, extraction, question-answering, and for language translation between Japanese and English.

For more information, see Supported foundation models.

Week ending 23 February 2024

Lower price for Granite-13b models

21 Feb 2024

Granite-13b models are reclassified from Class 2: $0.0018/Resource Unit to Class 1: $0.0006/Resource Unit, making it more cost effective to run inferencing tasks against these models. The reclassification applies to the following models in all regions where they are available:

granite-13b-chat-v2
granite-13b-chat-v1
granite-13b-instruct-v2
granite-13b-instruct-v1

For more information on these models, see Supported foundation models.

For pricing details, see watsonx.ai Runtime plans.

Week ending 16 February 2024

New shortcut to start working on common tasks

15 Feb 2024

You can now start a common task in your project by clicking on a tile in the Start working section in the Overview tab. Use these shortcuts to start adding collaborators and data, and to experiment with and build models. Click View all to jump to a selection of tools.

New mixtral-8x7b-instruct-v01-q foundation model for general-purpose tasks

15 Feb 2024

The mixtral-8x7b-instruct-v01-q foundation model provided by Mistral AI and quantized by IBM is available from watsonx.ai. The mixtral-8x7b-instruct-v01-q foundation model is a quantized version of the Mixtral 8x7B Instruct foundation model from Mistral AI.

You can use this new model for general-purpose tasks, including classification, summarization, code generation, language translation, and more. For more information, see Supported foundation models.

The following models are deprecated and will be withdrawn soon. Revise any prompts that use these foundation models to use another foundation model, such as mixtral-8x7b-instruct-v01-q.

Deprecated foundation models
Deprecated model	Deprecation date	Withdrawal date	Alternative model
gpt-neox-20b	15 February 2024	21 March 2024	mixtral-8x7b-instruct-v01-q
mpt-7b-instruct2	15 February 2024	21 March 2024	mixtral-8x7b-instruct-v01-q
starcoder-15.5b	15 February 2024	11 April 2024	mixtral-8x7b-instruct-v01-q

Inference requests that are submitted to these models by using the API continue to generate output, but include a warning message about the upcoming model withdrawal. When the withdrawal date is reached, API requests for inferencing the models will not generate output.

For more information about deprecation and withdrawal, see Foundation model lifecycle.

A modification to the granite-13b-chat-v2 foundation model is available

15 Feb 2024

The latest version of the granite-13b-chat-v2 is 2.1.0. The modification includes improvements that were gained by applying a novel AI alignment technique to the version 2.0.0 model. AI alignment involves using fine-tuning and reinforcement learning techniques to guide the model to return outputs that are as helpful, truthful, and transparent as possible. For more information, see the What is AI alignment? blog post from IBM Research.

New watsonx tutorial and video

15 Feb 2024

Try the new watsonx.governance tutorial to help you learn how to evaluate a machine learning model for fairness, accuracy, drift, and explainability with Watson OpenScale.

New tutorials
Tutorial	Description	Expertise for tutorial
Evaluate a machine learning model	Deploy a model, configure monitors for the deployed model, and evaluate the model.	Run a notebook to configure the models and use Watson OpenScale to evaluate.

Week ending 09 February 2024

IBM Cloud Data Engine connection is deprecated

8 Feb 2022

The IBM Cloud Data Engine connection is deprecated and will be discontinued in a future release. See Deprecation of Data Engine for important dates and details.

New Spark 3.4 environment for running Data Refinery flow jobs

9 Feb 2024

When you select an environment for a Data Refinery flow job, you can now select Default Spark 3.4 & R 4.2, which includes enhancements from Spark.

Data Refinery Spark environments

The Default Spark 3.3 & R 4.2 environment is deprecated and will be removed in a future update.

Update your Data Refinery flow jobs to use the new Default Spark 3.4 & R 4.2 environment. For details, see Compute resource options for Data Refinery in projects.

Week ending 2 February 2024

Samples collection renamed to Resource hub

2 Feb 2024

The Samples collection is renamed to Resource hub to better reflect the content. The Resource hub contains foundation models and sample projects, data sets, and notebooks. See Resource hub.

IBM Cloud Databases for DataStax connection is discontinued

2 Feb 2024

The IBM Cloud Databases for DataStax connection has been removed from IBM watsonx.ai.

Dremio connection requires updates

2 Feb 2024

Previously the Dremio connection used a JDBC driver. Now the connection uses a driver based on Arrow Flight.

Important: Update the connection properties. Different changes apply to a connection for a Dremio Software (on-prem) instance or a Dremio Cloud instance.

Dremio Software: Update the port number.

The new default port number that is used by Flight is 32010. You can confirm the port number in the dremio.conf file. See Configuring via dremio.conf for information.

Additionally, Dremio no longer supports connections with IBM Cloud Satellite.

Dremio Cloud: Update the authentication method and hostname.

Log into Dremio and generate a personal access token. For instructions see Personal Access Tokens.
In IBM watsonx in the Create connection: Dremio form, change the authentication type to Personal Access Token and add the token information. (The Username and password authentication can no longer be used to connect to a Dremio Cloud instance.)
Select Port is SSL-enabled.

If you use the default hostname for a Dremio Cloud instance, you need to change it:

Change sql.dremio.cloud to data.dremio.cloud
Change sql.eu.dremio.cloud to data.eu.dremio.cloud

Prompt-tune the llama-2-13b-chat foundation model

1 Feb 2024

The Tuning Studio now supports tuning the llama-2-13b-chat foundation model. First, engineer prompts for the larger llama-2-70b-chat model in the Prompt Lab to find effective prompt inputs for your use case. Then tune the smaller version of the Llama 2 model to generate comparable, if not better outputs with zero-shot prompts. The llama-2-13b-model is available for prompt tuning in the Dallas region. For more information, see Tuning Studio.

Week ending 26 January 2024

AutoAI supports ordered data for all experiments

25 Jan 2024

You can now specify ordered data for all AutoAI experiments rather than just time series experiments. Specify if your training data is ordered sequentially, according to a row index. When input data is sequential, model performance is evaluated on newest records instead of a random sampling, and holdout data uses the last n records of the set rather than n random records. Sequential data is required for time series experiments but optional for classification and regression experiments.

Q&A with RAG accelerator

26 Jan 2024

You can now implement a question and answer solution that uses retrieval augmented generation by importing a sample project. The sample project contains notebooks and other assets that convert documents from HTML or PDF to plain text, import document segments into an Elasticsearch vector index, deploy a Python function that queries the vector index, retrieve top N results, run LLM inference to generate an answer to the question, and check the answer for hallucinations.

Try the Q&A with RAG accelerator.

Set to dark theme

25 Jan 2024

You can now set your watsonx user interface to dark theme. Click your avatar and select Profile and settings to open your account profile. Then, set the Dark theme switch to on. Dark theme is not supported in RStudio and Jupyter notebooks. For information on managing your profile, see Managing your settings.

IBM watsonx.ai is available in the Tokyo region

25 Jan 2024

Watsonx.ai is now generally available in the Tokyo data center and can be selected as the preferred region when signing-up. The Prompt Lab and foundation model inferencing are supported in the Tokyo region for these models:

elyza-japanese-llama-2-7b-instruct
flan-t5-xl-3b
flan-t5-xxl-11b
flan-ul2-20b
granite-13b-chat-v2
granite-13b-instruct-v2
llama-2-70b-chat
llama-2-13b-chat

Also available from the Tokyo region:

Prompt tuning the flan-t5-xl-3b foundation model with the Tuning Studio
Generating tabular data with the Synthetic Data Generator to use for training models

For more information on the supported models, see Supported foundation models available with watsonx.ai.

A Japanese-language Llama 2 model is available in the Tokyo region

25 Jan 2024

The elyza-japanese-llama-2-7b-instruct foundation model provided by ELYZA, Inc is available from watsonx.ai instances in the Tokyo data center. The elyza-japanese-llama-2-7b-instruct model is a version of the Llama 2 model from Meta that was trained to understand and generate Japanese text.

You can use this new model for general purpose tasks. It works well for Japanese-language classification and extraction and for translation between Japanese and English.

Week ending 12 January 2024

Support for IBM Runtime 22.2 deprecated in watsonx.ai Runtime (formerly Watson Machine Learning)

11 Jan 2024

IBM Runtime 22.2 is deprecated and will be removed on 11 April 2024. Beginning 7 March 2024, you cannot create notebooks or custom environments by using the 22.2 runtimes. Also, you cannot train new models with software specifications that are based on the 22.2 runtime. Update your assets and deployments to use IBM Runtime 23.1 before 7 March 2024.

To learn more about migrating an asset to a supported framework and software specification, see Managing outdated software specifications or frameworks.
To learn more about the notebook environment, see Compute resource options for the notebook editor in projects.
To learn more about changing your environment, see Changing the environment of a notebook.

IBM Granite v1 foundation models are deprecated

11 Jan 2024

The IBM Granite 13 billion-parameter v1 foundation models are deprecated and will be withdrawn on 11 April 2024. If you are using version 1 of the models, switch to using version 2 of the models instead.

Deprecated IBM foundation models
Deprecated model	Deprecation date	Withdrawal date	Alternative model
granite-13b-chat-v1	11 January 2024	11 April 2024	granite-13b-chat-v2
granite-13b-instruct-v1	11 January 2024	11 April 2024	granite-13b-instruct-v2

Inference requests that are submitted to the version 1 models by using the API continue to generate output, but include a warning message about the upcoming model withdrawal. Starting on 11 April 2024, API requests for inferencing the models will not generate output.

For more information about IBM Granite foundation models, see Foundation models built by IBM. For more information about deprecation and withdrawal, see Foundation model lifecycle.