Known issues and limitations

List of machine learning issues
- Error with assets using Watson Machine Learning in projects specifying Cloud Object Storage with Key Protect enabled.
- AutoAI
- Federated Learning
- Watson Pipelines
List of SPSS Modeler issues
- Unable to save model to project specifying Cloud Object Storage with Key Protect enabled.
List of notebooks issues
- Unable to save model to project specifying Cloud Object Storage with Key Protect enabled.

IBM Knowledge Catalog

If you use the IBM Knowledge Catalog, you might encounter these known issues and restrictions when you use catalogs.

Cannot use masked assets in Data Refinery

For any masked assets, the Data Refinery jobs fail. If you have access to the initial data assets before masking, the workaround is to use Data Refinery with unmasked assets.

Masked data is not supported in data visualizations

Masked data is not supported in data visualizations. If you attempt to work with masked data while generating a chart in the Visualizations tab of a data asset in a project, the following error message is displayed: Bad Request: Failed to retrieve data from server. Masked data is not supported.

Data is not masked in some project tools

When you add a connected data asset that contains masked columns from a catalog to a project, the columns remain masked when you view the data and when you refine the data in the Data Refinery tool. However, other tools in projects do not preserve masking when they access data through a connection. For example, when you load connected data in a Notebook, a DataStage flow, a dashboard, or other project tools, you access the data through a direct connection and bypass masking.

Predefined governance artifacts might not be available

If you don't see any predefined classifications or data classes, reinitialize your tenant by using the following API call:

curl -X POST "https://api.dataplatform.cloud.ibm.com/v3/glossary_terms/admin/initialize_content" -H "Authorization: Bearer $BEARER_TOKEN" -k

Add collaborators with lowercase email addresses

When you add collaborators to the catalog, enter email addresses with all lowercase letters. Mixed-case email addresses are not supported.
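If you keep collaborator addresses in a script or spreadsheet export, you can normalize them before you enter them. The following is a minimal sketch; the function name normalize_collaborator_email is hypothetical and is not part of any IBM API.

```python
def normalize_collaborator_email(address: str) -> str:
    """Return the address with surrounding whitespace removed and all letters lowercase."""
    return address.strip().lower()


# Example: a mixed-case address copied from another system.
print(normalize_collaborator_email("  Jane.Doe@Example.COM "))
```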

Object Storage connection restrictions

When you look at a Cloud Object Storage (S3 API) or Cloudant connection, the folder itself is listed as a child asset.

Multiple concurrent connection operations might fail

An error might be encountered when multiple users are running connection operations concurrently. The error message can vary.

Can't enable data protection rule enforcement after catalog creation

You cannot enable the enforcement of data protection rules after you create a catalog. To apply data protection rules to the assets in a catalog, you must enable enforcement during catalog creation.

Assets are blocked if evaluation fails

The following restrictions apply to data assets in a catalog with policies enforced: File-based data assets that have a header can't have duplicate column names, a period (.), or single quotation mark (') in a column name.

If evaluation fails, the asset is blocked to all users except the asset owner. All other users see an error message that the data asset cannot be viewed because evaluation failed and the asset is blocked.
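To catch these header problems before you add a file to a catalog, you can check the header row locally. The sketch below is an illustration only: the function name is hypothetical, and it assumes that duplicate detection is an exact (case-sensitive) comparison, which the restriction above does not specify.

```python
import csv
import io


def find_column_name_problems(header):
    """Return a list of readable problems found in a header row."""
    problems = []
    seen = set()
    for name in header:
        if name in seen:
            problems.append(f"duplicate column name: {name!r}")
        seen.add(name)
        if "." in name:
            problems.append(f"period in column name: {name!r}")
        if "'" in name:
            problems.append(f"single quotation mark in column name: {name!r}")
    return problems


# Hypothetical sample file content; only the first row (the header) is checked.
sample = "id,first.name,id,owner's note\n1,a,2,b\n"
header = next(csv.reader(io.StringIO(sample)))
for problem in find_column_name_problems(header):
    print(problem)
```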

Only the data class filter in metadata enrichment results is case-sensitive

When you filter metadata enrichment results on the Column tab, only the Data class entries are case-sensitive. The entries in the Business terms, Schemas, and Assets filters are all lowercase regardless of the actual casing of the value.

Enrichment details for an asset might not reflect the settings applied on latest enrichment run

After you edit the enrichment options for a metadata enrichment that was run at least once, the asset details might show the updated options instead of the options applied in the latest enrichment run.

Can't access individual pages in a metadata enrichment asset directly

If the number of assets or columns in a metadata enrichment asset spans several pages, you can't go to a specific page directly. The page number drop-down list is disabled. Use the Next page and Previous page buttons instead.

In some cases, you might not see the full log of a metadata enrichment job run in the UI

If the list of errors in a metadata enrichment run is exceptionally long, only part of the job log might be displayed in the UI.

Workaround: Download the entire log and analyze it in an external editor.

Schema information might be missing when you filter enrichment results

When you filter assets or columns in the enrichment results on source information, schema information might not be available.

Workaround: Rerun the enrichment job and apply the Source filter again.

Issues with search on the Assets tab of a metadata enrichment asset

When you search for an asset on the Assets tab of a metadata enrichment asset, no results might be returned. Consider these limitations:

Search is case sensitive.
The result contains only records that match the exact search phrase or start with the phrase.
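The two limitations above can be pictured as a case-sensitive prefix match. The following sketch only illustrates the described behavior with a hypothetical helper; it is not the search implementation of the service.

```python
def matches_search(asset_name: str, phrase: str) -> bool:
    """Case-sensitive match: the name equals the phrase or starts with it."""
    # str.startswith is case-sensitive and also returns True for an exact match.
    return asset_name.startswith(phrase)


for phrase in ("CUSTOMER", "customer", "ORDERS"):
    print(phrase, matches_search("CUSTOMER_ORDERS", phrase))
```

Under this reading, a phrase that differs in case or that appears only in the middle or at the end of the asset name returns no results.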

Rules run on columns of type time in data assets from Amazon Redshift data source do not return proper results

For data assets from Amazon Redshift data sources, columns of type time are imported with type timestamp. You can't apply time-specific data quality rules to such columns.

Writing metadata enrichment output to an earlier version of Apache Hive than 3.0.0

If you want to write data quality output generated by metadata enrichment to an Apache Hive database at an earlier software version than 3.0.0, set the following configuration parameters in your Apache Hive Server:

set hive.support.concurrency=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;   # (not required for version 2)

set hive.compactor.initiator.on=true;
set hive.compactor.cleaner.on=true;   # might not be available depending on the version
set hive.compactor.worker.threads=1;

For more information, see Hive Transactions.

Masking flow

If you use Masking flow, you might encounter these known issues and restrictions when you are privatizing data.

Cannot use masked assets in Data Refinery

For more information, see Cannot use masked assets in Data Refinery.

Masking flow jobs might fail

During a masking flow job, Spark might attempt to read all of a data source into memory. Errors might occur when there isn't enough memory to support the job. The largest volume of data that can fit into the largest deployed Spark processing node is approximately 12 GB.

Notebook issues

You might encounter some of these issues when getting started with and using notebooks.

Duplicating a notebook doesn't create a unique name in the new projects UI

When you duplicate a notebook in the new projects UI, the duplicate notebook is not created with a unique name.

Can't create assets in older accounts

If you're working in an instance of Watson Studio that was activated before November 2017, you might not be able to create operational assets, like notebooks. If the Create button stays gray and disabled, you must add the Watson Studio service to your account from the Services catalog.

500 internal server error received when launching Watson Studio

Rarely, you may receive an HTTP internal server error (500) when launching Watson Studio. This might be caused by an expired cookie stored for the browser. To confirm the error was caused by a stale cookie, try launching Watson Studio in a private browsing session (incognito) or by using a different browser. If you can successfully launch in the new browser, the error was caused by an expired cookie. You have a choice of resolutions:

Exit the browser application completely, then restart it and launch Watson Studio to reset the session cookie. You must close and restart the application, not just close the browser window.
Clear the IBM cookies from the browsing data and launch Watson Studio. Look in the browsing data or security options in the browser to clear cookies. Note that clearing all IBM cookies may affect other IBM applications.

If the 500 error persists after performing one of these resolutions, check the status page for IBM Cloud incidents affecting Watson Studio. Additionally, you may open a support case at the IBM Cloud support portal.

Failure to export a notebook to HTML in the Jupyter Notebook editor

When you are working with a Jupyter Notebook created in a tool other than Watson Studio, you might not be able to export the notebook to HTML. This issue occurs when the cell output is exposed.

Workaround

In the Jupyter Notebook UI, go to Edit and click Edit Notebook Metadata.

Remove the following metadata:

"widgets": {
   "state": {},
   "version": "1.1.2"
}

Click Edit.
Save the notebook.
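If you cannot edit the metadata in the UI, the same change can be sketched in Python, because a notebook file is JSON. This is a minimal sketch under the assumption that the widgets entry sits in the top-level notebook metadata, as in the steps above; the function name and the in-memory sample notebook are hypothetical.

```python
import json


def remove_widgets_metadata(notebook: dict) -> dict:
    """Remove the "widgets" entry from the notebook-level metadata, if present."""
    notebook.get("metadata", {}).pop("widgets", None)
    return notebook


# Hypothetical in-memory notebook. A real .ipynb file would be read with
# json.load and written back with json.dump.
notebook = {
    "cells": [],
    "metadata": {
        "widgets": {"state": {}, "version": "1.1.2"},
        "kernelspec": {"name": "python3"},
    },
    "nbformat": 4,
    "nbformat_minor": 5,
}
print(json.dumps(remove_widgets_metadata(notebook)["metadata"]))
```

Work on a copy of the notebook file so that the original remains available if the export still fails.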

Manual installation of some tensor libraries is not supported

Some TensorFlow libraries are preinstalled, but if you try to install additional TensorFlow libraries yourself, you get an error.

Connection to notebook kernel is taking longer than expected after running a code cell

If you try to reconnect to the kernel and immediately run a code cell (or if the kernel reconnection happened during code execution), the notebook doesn't reconnect to the kernel and no output is displayed for the code cell. You need to manually reconnect to the kernel by clicking Kernel > Reconnect. When the kernel is ready, you can try running the code cell again.

Using the predefined sqlContext object in multiple notebooks causes an error

You might receive an Apache Spark error if you use the predefined sqlContext object in multiple notebooks. Create a new sqlContext object for each notebook. See this Stack Overflow explanation.

Connection failed message

If your kernel stops, your notebook is no longer saved automatically. To save it manually, click File > Save. A Notebook saved message should appear in the kernel information area, before the Spark version. If you get a message that the kernel failed, click Kernel > Reconnect to reconnect your notebook to the kernel. If the kernel does not restart and you can't save the notebook, download it to preserve your changes by clicking File > Download as > Notebook (.ipynb). Then create a new notebook based on the downloaded notebook file.

Hyperlinks to notebook sections don't work in preview mode

If your notebook contains sections that you link to from, for example, an introductory section at the top of the notebook, the links to these sections do not work when the notebook is opened in view-only mode in Firefox. The links work when you open the notebook in edit mode.

Can't connect to notebook kernel

If you try to run a notebook and you see the message Connecting to Kernel, followed by Connection failed. Reconnecting and finally by a connection failed error message, the reason might be that your firewall is blocking the notebook from running.

If Watson Studio is installed behind a firewall, you must add the WebSocket connection wss://dataplatform.cloud.ibm.com to the firewall settings. Enabling this WebSocket connection is required when you're using notebooks and RStudio.

Insufficient resources available error when opening or editing a notebook

If you see the following message when opening or editing a notebook, the environment runtime associated with your notebook has resource issues:

Insufficient resources available
A runtime instance with the requested configuration can't be started at this time because the required hardware resources aren't available.
Try again later or adjust the requested sizes.

To find the cause, try checking the status page for IBM Cloud incidents affecting Watson Studio. Additionally, you can open a support case at the IBM Cloud Support portal.

Machine learning issues

You might encounter some of these issues when working with machine learning tools.

Region requirements

You can only associate a Watson Machine Learning service instance with your project when the Watson Machine Learning service instance and the Watson Studio instance are located in the same region.

Accessing links if you create a service instance while associating a service with a project

While you are associating a Watson Machine Learning service to a project, you have the option of creating a new service instance. If you choose to create a new service, the links on the service page might not work. To access the service terms, APIs, and documentation, right click the links to open them in new windows.

Federated Learning assets cannot be searched in All assets, search results, or filter results in the new projects UI

You cannot search Federated Learning assets from the All assets view, the search results, or the filter results of your project.

Workaround: Click the Federated Learning asset to open the tool.

Deployment issues

A deployment that is inactive (no scores) for a set time (24 hours for the free plan or 120 hours for a paid plan) is automatically hibernated. When a new scoring request is submitted, the deployment is reactivated and the score request is served. Expect a brief delay of 1 to 60 seconds for the first score request after activation, depending on the model framework.
For some frameworks, such as SPSS Modeler, the first score request for a deployed model after hibernation might result in a 504 error. If this happens, submit the request again; subsequent requests should succeed.
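Client code can resubmit the request automatically when the first score request after hibernation returns a 504 error. The following is a minimal sketch: send_request is a hypothetical callable that returns a (status_code, body) pair, and the attempt count and delay are illustrative values, not documented settings.

```python
import time


def score_with_retry(send_request, max_attempts=3, delay_seconds=1.0):
    """Call send_request() and retry while it returns HTTP status 504.

    Returns the last (status_code, body) pair that was received.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send_request()
        if status != 504 or attempt == max_attempts:
            return status, body
        time.sleep(delay_seconds)


# Simulated deployment: the first call times out, the second one succeeds.
responses = iter([(504, None), (200, {"predictions": []})])
print(score_with_retry(lambda: next(responses), delay_seconds=0))
```

In a real client, send_request would wrap the HTTP call to the scoring endpoint of the deployment.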

Previewing masked data assets is blocked in deployment space

A data asset preview may fail with this message: This asset contains masked data and is not supported for preview in the Deployment Space

Deployment spaces currently don't support masking data, so the preview for masked assets is blocked to prevent data leaks.

Batch deployment jobs that use large inline payload might get stuck in starting or running state

If you provide a large asynchronous payload for your inline batch deployment, the runtime manager process can run out of heap memory.

In the following example, 92 MB of payload was passed inline to the batch deployment, which caused the heap to run out of memory.

Uncaught error from thread [scoring-runtime-manager-akka.scoring-jobs-dispatcher-35] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[scoring-runtime-manager]
java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
	at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172)
	at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:538)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:174)
   ...

As a result, concurrent jobs can get stuck in the starting or running state. The starting state can be cleared only by deleting the deployment and creating a new deployment. The running state can be cleared without deleting the deployment.

As a workaround, use data references instead of inline data for large payloads that are provided to batch deployments.
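A client can measure the serialized payload before it submits a job and switch to a data reference when the payload is large. The sketch below assumes a 10 MB threshold purely for illustration; no official limit is stated here, only that a 92 MB inline payload exhausted the heap in one case. The function names are hypothetical.

```python
import json

# Hypothetical threshold chosen for illustration; not a documented limit.
INLINE_LIMIT_BYTES = 10 * 1024 * 1024


def payload_size_bytes(payload) -> int:
    """Return the size of the payload when serialized as UTF-8 encoded JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def should_use_data_reference(payload, limit=INLINE_LIMIT_BYTES) -> bool:
    """Return True when the serialized payload is larger than the limit."""
    return payload_size_bytes(payload) > limit


small_payload = {"values": [[1, 2]]}
print(payload_size_bytes(small_payload), should_use_data_reference(small_payload))
```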

Watson Machine Learning limitations

AutoAI known limitations

Currently, AutoAI experiments do not support double-byte character sets. AutoAI only supports CSV files with ASCII characters. Users must convert any non-ASCII characters in the file name or content, and provide input data as a CSV as defined in this CSV standard.
To interact programmatically with an AutoAI model, use the REST API instead of the Python client. The APIs for the Python client required to support AutoAI are not generally available at this time.
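Before you start an experiment, you can check a file name and the file content for non-ASCII characters. The following is a minimal sketch with a hypothetical helper; it only locates the offending lines and does not convert them.

```python
def non_ascii_lines(text: str):
    """Return (line_number, line) pairs for lines that contain non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()
    ]


# Hypothetical file name and content.
file_name = "customers.csv"
content = "name,city\nAna,Zürich\nBob,Boston\n"

print("file name is ASCII:", file_name.isascii())
for number, line in non_ascii_lines(content):
    print(f"line {number}: {line}")
```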

Data module not found in IBM Federated Learning

The data handler for IBM Federated Learning is trying to extract a data module from the FL library but is unable to find it. You might see the following error message:

ModuleNotFoundError: No module named 'ibmfl.util.datasets'

The issue might result from using an outdated DataHandler. Review and update your DataHandler to conform to the latest spec. See the most recent MNIST data handler, or ensure that your sample versions are up to date.

Setting environment variables in a conda yaml file does not work for deployments

Setting environment variables in a conda yaml file does not work for deployments. This means that you cannot override existing environment variables, for example LD_LIBRARY_PATH, when deploying assets in Watson Machine Learning.

As a workaround, if you're using a Python function, consider setting default parameters. For details, see Deploying Python functions.
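The default-parameter workaround can be sketched as follows. The outer function carries the configuration value as a default parameter and returns the inner scoring function. The parameter name library_path, its value, and the echo logic are hypothetical; the payload layout with input_data and predictions is an assumption about the scoring format, so check Deploying Python functions for the exact requirements.

```python
def my_deployable_function(library_path="/opt/custom/lib"):
    """Outer function: the default parameter holds the configuration value."""

    def score(payload):
        # Illustration only: return the configured value once per input row.
        rows = payload["input_data"][0]["values"]
        return {"predictions": [{"values": [[library_path] for _ in rows]}]}

    return score


# Local check before deployment.
score = my_deployable_function()
print(score({"input_data": [{"values": [[1], [2]]}]}))
```

The value is fixed when the function is created, so it does not depend on environment variables in the deployment runtime.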

Cognos Dashboard Embedded issues

You might encounter some of these issues when working with Cognos Dashboard Embedded.

CSV files containing duplicate column names are not supported

Cognos Dashboard Embedded does not support CSV files that contain duplicate column names. Duplicates are case-insensitive. For example, BRANCH_NAME, branch_name, and Branch_Name are considered duplicate column names.
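Because the comparison is case-insensitive, a check for duplicates has to fold the names to one case first. The following is a minimal sketch with a hypothetical helper that groups column names by their lowercase form.

```python
from collections import defaultdict


def case_insensitive_duplicates(column_names):
    """Return groups of column names that differ only by letter case."""
    groups = defaultdict(list)
    for name in column_names:
        groups[name.lower()].append(name)
    return [names for names in groups.values() if len(names) > 1]


print(case_insensitive_duplicates(["BRANCH_NAME", "branch_name", "Branch_Name", "CITY"]))
```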

Cognos dashboards can only use data connections created with username and password credentials

Cognos Dashboard Embedded requires that database connections and connected data assets added as data sources to a dashboard must include username and password credentials.

If these credentials are not specified in the connection and a token or API key is used instead, then Cognos Dashboard Embedded cannot use that connection or connected data asset as a data source.

Incorrect data type shown for refined data assets

After you import a CSV file, if you click on the imported file in the data asset overview page, types of some columns might not show up correctly. For example, a dataset of a company report with a column called Revenue that contains the revenue of the company might show up as type String, instead of a number-oriented data type that is more logical.

Unsupported special characters in CSV files

The source CSV file name can contain non-alphanumeric characters. However, the CSV file name can't contain the special characters / : & < . \ ". If the file name contains these characters, they are removed from the table name.

Important: Table column names in the source CSV file can't contain any of the unsupported special characters. Those characters can't be removed because the name in the data module must match the name of the column in the source file. In this case, remove the special characters in your column names to enable using your data in a dashboard.
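You can check names for the unsupported characters before you upload a file. The sketch below uses hypothetical helper names and covers only the seven characters listed above. It is not the logic that Cognos Dashboard Embedded uses to derive table names.

```python
# The unsupported characters listed above: / : & < . \ "
UNSUPPORTED_CHARACTERS = '/:&<.\\"'


def remove_unsupported_characters(name: str) -> str:
    """Return the name without any of the unsupported characters."""
    return "".join(ch for ch in name if ch not in UNSUPPORTED_CHARACTERS)


def columns_with_unsupported_characters(column_names):
    """Return the column names that must be renamed in the source file."""
    return [
        name
        for name in column_names
        if any(ch in UNSUPPORTED_CHARACTERS for ch in name)
    ]


print(remove_unsupported_characters("sales:2023/Q1"))
print(columns_with_unsupported_characters(["Revenue", "Cost.USD", 'Size "L"']))
```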

String values in CSV files are limited to 128 characters

String values in a column in your source CSV file can be only 128 characters long. If your CSV file has string columns with values that are longer, an error message is displayed.
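To find the values that exceed the limit before you add the file to a dashboard, you can scan the CSV content. This is a minimal sketch: the helper name is hypothetical, and it counts characters with Python's len, which is assumed to match how the limit is measured.

```python
import csv
import io

MAX_STRING_LENGTH = 128  # limit stated above


def find_long_values(csv_text: str):
    """Return (data_row_number, column_name, length) for values over the limit."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            if isinstance(value, str) and len(value) > MAX_STRING_LENGTH:
                hits.append((row_number, column, len(value)))
    return hits


# Hypothetical sample: the second data row has a 129-character comment.
sample = "id,comment\n1,ok\n2," + "x" * 129 + "\n"
print(find_long_values(sample))
```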

Date format limitations in CSV files

There are date format limitations for CSV files used in visualizations. For details, see Resolving problems when using data from CSV files in Cognos Dashboard Embedded.

Can't replace a data table in a visualization

When you add a visualization to a dashboard, you cannot add a data table to the visualization if you previously added (and then removed) data fields from another data table. This restriction applies to Db2, CSV tables, and other data sources.

Cognos Analytics features that are not supported

The following functionality from IBM Cognos Analytics is not supported in dashboards:

Data grouping
Custom color palettes
Custom visualizations
Assistant
Forecasting
Insights in visualization
Jupyter notebook visualization
Advanced data analytics

Watson OpenScale issues

You might encounter the following issues in Watson OpenScale:

Drift configuration is started but never finishes

Drift configuration is started but never finishes and continues to show the spinner icon. If you see the spinner run for more than 10 minutes, it is possible that the system is left in an inconsistent state. There is a workaround to this behavior: Edit the drift configuration. Then, save it. The system might come out of this state and complete configuration. If drift reconfiguration does not rectify the situation, contact IBM Support.

SPSS Modeler issues

You might encounter some of these issues when working in SPSS Modeler.

SPSS Modeler runtime restrictions

Watson Studio does not include SPSS functionality in Peru, Ecuador, Colombia and Venezuela.

Merge node and unicode characters

The Merge node treats the following very similar Japanese characters as the same character.

Connection issues

You might encounter this issue when working with connections.

Cloudera Impala connection does not work with LDAP authentication

If you create a connection to a Cloudera Impala data source and the Cloudera Impala server is set up for LDAP authentication, the username and password authentication method in Cloud Pak for Data as a Service will not work.

Workaround: Disable the Enable LDAP Authentication option on the Impala server. See Configuring LDAP Authentication in the Cloudera documentation.

Watson Pipelines known issues

These issues pertain to Watson Pipelines.

Nesting loops more than 2 levels can result in pipeline error

Nesting loops more than 2 levels can result in an error when you run the pipeline, such as Error retrieving the run. Reviewing the logs can show an error such as text in text not resolved: neither pipeline_input nor node_output. If you are looping with output from a Bash script, the log might list an error like this: PipelineLoop can't be run; it has an invalid spec: non-existent variable in $(params.run-bash-script-standard-output). To resolve the problem, do not nest loops more than 2 levels.

Asset browser does not always reflect count for total numbers of asset type

When selecting an asset from the asset browser, such as choosing a source for a Copy node, you see that some of the assets list the total number of that asset type available, but notebooks do not. That is a current limitation.

Cannot delete pipeline versions

Currently, you cannot delete saved versions of pipelines that you no longer need.

Deleting an AutoAI experiment fails under some conditions

Using a Delete AutoAI experiment node to delete an AutoAI experiment that was created from the Projects UI does not delete the AutoAI asset. However, the rest of the flow can complete successfully.

Cache appears enabled but is not enabled

If the Copy assets Pipelines node's Copy mode is set to Overwrite, cache is displayed as enabled but remains disabled.

Watson Pipelines limitations

These limitations apply to Watson Pipelines.

Single pipeline limits
Limitations by configuration size
Input and output size limits
Batch input limited to data assets

Single pipeline limits

These limitations apply to a single pipeline, regardless of configuration.

Any single pipeline cannot contain more than 120 standard nodes
Any pipeline with a loop cannot contain more than 600 nodes across all iterations (for example, 60 iterations - 10 nodes each)
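The loop limit counts nodes across all iterations, so the total is the number of iterations multiplied by the nodes in each iteration. The following sketch shows the arithmetic with hypothetical helper names; the two constants are the limits stated above.

```python
MAX_STANDARD_NODES = 120  # per single pipeline
MAX_LOOP_NODES = 600      # across all iterations of a loop


def loop_node_total(iterations: int, nodes_per_iteration: int) -> int:
    """Return the number of nodes that a loop runs across all iterations."""
    return iterations * nodes_per_iteration


def loop_within_limit(iterations: int, nodes_per_iteration: int) -> bool:
    """Return True when the loop stays within the single-pipeline loop limit."""
    return loop_node_total(iterations, nodes_per_iteration) <= MAX_LOOP_NODES


print(loop_node_total(60, 10), loop_within_limit(60, 10))
print(loop_node_total(61, 10), loop_within_limit(61, 10))
```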

Limitations by configuration size

Small configuration

A SMALL configuration supports 600 standard nodes (across all active pipelines) or 300 nodes run in a loop. For example:

30 standard pipelines with 20 nodes run in parallel = 600 standard nodes
3 pipelines containing a loop with 10 iterations and 10 nodes in each iteration = 300 nodes in a loop

Medium configuration

A MEDIUM configuration supports 1200 standard nodes (across all active pipelines) or 600 nodes run in a loop. For example:

30 standard pipelines with 40 nodes run in parallel = 1200 standard nodes
6 pipelines containing a loop with 10 iterations and 10 nodes in each iteration = 600 nodes in a loop

Large configuration

A LARGE configuration supports 4800 standard nodes (across all active pipelines) or 2400 nodes run in a loop. For example:

80 standard pipelines with 60 nodes run in parallel = 4800 standard nodes
24 pipelines containing a loop with 10 iterations and 10 nodes in each iteration = 2400 nodes in a loop

Input and output size limits

Input and output values, which include pipeline parameters, user variables, and generic node inputs and outputs, cannot exceed 10 KB of data.
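A value can be checked against this limit before it is passed to a pipeline. The sketch below assumes that 10 KB means 10 x 1024 bytes and that size is measured on the UTF-8 encoded value; neither detail is stated above, and the helper name is hypothetical.

```python
# Assumption: 10 KB is interpreted as 10 * 1024 bytes of UTF-8 encoded text.
MAX_VALUE_BYTES = 10 * 1024


def exceeds_value_limit(value: str) -> bool:
    """Return True when the encoded value is larger than the assumed limit."""
    return len(value.encode("utf-8")) > MAX_VALUE_BYTES


print(exceeds_value_limit("a" * 10240))
print(exceeds_value_limit("a" * 10241))
```

Values that exceed the limit are better stored as a data asset and passed by reference.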

Batch input limited to data assets

Currently, input for batch deployment jobs is limited to data assets. This means that certain types of deployments, which require JSON input or multiple files as input, are not supported. For example, SPSS models and Decision Optimization solutions that require multiple files as input are not supported.

Issues with Cloud Object Storage

These issues apply to working with Cloud Object Storage.

Issues with Cloud Object Storage when Key Protect is enabled

Key Protect in conjunction with Cloud Object Storage is not supported for working with Watson Machine Learning assets. If you are using Key Protect, you might encounter these issues when you are working with assets in Watson Studio.

Training or saving these Watson Machine Learning assets might fail:
- AutoAI
- Federated Learning
- Watson Pipelines
You might be unable to save an SPSS model or a notebook model to a project

Issues with watsonx.governance

Delay showing prompt template deployment data in a factsheet

When a deployment is created for a prompt template, the facts for the deployment are not added to the factsheet immediately. You must first evaluate the deployment or view the lifecycle tracking page to add the facts to the factsheet.

Redundant attachment links in factsheet

A factsheet tracks all of the events for an asset over all phases of the lifecycle. Attachments show up in each stage, creating some redundancy in the factsheet.
