Known issues and limitations
The following limitations and known issues apply to Cloud Pak for Data as a Service.
- Regional limitations
- Watson Knowledge Catalog
- Masking flow
- Data Refinery
- Watson Query
- Watson Studio
- Watson Machine Learning
- Cognos Dashboard Embedded
- Watson OpenScale
- SPSS Modeler
- Watson Pipelines
- Cloud Object Storage issues
List of Watson Knowledge Catalog issues
- Synchronize the data policy service (DPS) category caches
- Masked data is not supported in data visualizations
- Data is not masked in some project tools
- Predefined governance artifacts might not be available
- Add collaborators with lowercase email addresses
- Object Storage connection restrictions
- Multiple concurrent connection operations might fail
- Can't enable data protection rule enforcement after catalog creation
- Assets are blocked if evaluation fails
- Only the browser back button takes you back to the metadata enrichment asset from the Default settings page
- Only the data class filter in metadata enrichment results is case-sensitive
- Filter options in metadata enrichment results might not be updated immediately
- Enrichment details for an asset might not reflect the settings applied on latest enrichment run
- Can't access individual pages in a metadata enrichment asset directly
- Incomplete details for an assigned data class in a column's enrichment results
- In some cases, you might not see the full log of a metadata enrichment job run in the UI
- Schema information might be missing when you filter enrichment results
- Issues with search on the Assets tab of a metadata enrichment asset
- Rules run on columns of type time in data assets from Amazon Redshift data source do not return proper results
List of Masking flow issues
List of Data Refinery issues
List of Watson Query issues
List of notebooks issues
- Duplicating a notebook doesn't create a unique name in the new projects UI
- Can't create assets in older accounts
- Error during login
- 500 internal server error received when launching Watson Studio
- Manual installation of some tensor libraries is not supported
- Connection to notebook kernel is taking longer than expected after running a code cell
- Using the predefined sqlContext object in multiple notebooks causes an error
- Connection failed message
- Hyperlinks to notebook sections don't work in preview mode
- Can't connect to notebook kernel
- Insufficient resources available error when opening or editing a notebook
List of machine learning issues
- Region requirements
- Accessing links if you create a service instance while associating a service with a project
- Deployment issues
- AutoAI known limitations
- Federated Learning assets cannot be searched in All assets, search results, or filter results in the new projects UI
- Data module not found in IBM Federated Learning
- Previewing masked data assets is blocked in deployment space
List of Cognos Dashboard Embedded issues
- CSV files containing duplicate column names are not supported
- Cognos dashboards can only use data connections created with username and password credentials
- Incorrect data type shown for refined data assets
- Unsupported special characters in CSV files
- String values in CSV files are limited to 128 characters
- Date format limitations for CSV files
- Can't replace a data table in a visualization
- Cognos Analytics features that are not supported
List of Watson OpenScale issues
List of SPSS Modeler issues
- SPSS Modeler runtime restrictions
- Error when trying to stop a running flow
- Imported Data Asset Export nodes sometimes fail to run
- Data preview may fail when table metadata changes
- Unable to view output after running an Extension Output node
- Unable to preview Excel data from IBM Cloud Object Storage connections
- Numbers interpreted as a string
- SuperNode containing Import nodes
- Exporting to a SAV file
- Migrating Import nodes
- Text Analytics settings aren't saved
- Merge node unicode characters
Issues with Cloud Object Storage
- List of machine learning issues
- Error with assets using Watson Machine Learning in projects specifying Cloud Object Storage with Key Protect enabled.
- AutoAI
- Federated Learning
- Watson Pipelines
- List of SPSS Modeler issues
- Unable to save model to project specifying Cloud Object Storage with Key Protect enabled.
- List of notebooks issues
- Unable to save model to project specifying Cloud Object Storage with Key Protect enabled.
Watson Knowledge Catalog
If you use the Watson Knowledge Catalog, you might encounter these known issues and restrictions when you use catalogs.
Synchronize the data policy service (DPS) category caches
For performance reasons, the data policy service (DPS) keeps a copy of glossary categories in caches. When categories are created, updated, or deleted, the glossary service publishes RabbitMQ events to reflect these changes. The DPS listens to these events and updates the caches. However, on rare occasions, a message might be lost when the RabbitMQ service is down or too busy. The DPS provides a REST API utility to update the cache.
Run the following REST API utility during a downtime window in which no category changes are made. Doing so helps avoid unexpected enforcement results during the run and inaccurate cache updates:
curl -v -k -X GET --header "Content-Type: application/json" \
  --header "Accept: application/json" \
  --header "Authorization: Bearer ${token}" \
  "${uri}/v3/enforcement/governed_items/sync/category"
This REST API is available in the Watson Knowledge Catalog service.
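For readers who script this call from Python rather than curl, the request above can be sketched with the standard library. This is a minimal sketch, not an official client: the function name `build_category_sync_request` is hypothetical, and `uri` and `token` stand for the same values as `${uri}` and `${token}` in the curl command.

```python
import urllib.request


def build_category_sync_request(uri, token):
    """Build the GET request for the DPS category cache sync utility.

    Hypothetical helper: it only constructs the request and does not send it.
    """
    return urllib.request.Request(
        url=f"{uri.rstrip('/')}/v3/enforcement/governed_items/sync/category",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="GET",
    )


# Placeholder values for illustration only.
request = build_category_sync_request("https://example.test", "my-token")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`. Unlike the curl command, which uses `-k`, this sketch leaves TLS certificate verification at the Python default.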
Masked data is not supported in data visualizations
Masked data is not supported in data visualizations. If you attempt to work with masked data while generating a chart in the Visualizations tab of a data asset in a project, you receive the following error message: Bad Request: Failed to retrieve data from server. Masked data is not supported.
Data is not masked in some project tools
When you add a connected data asset that contains masked columns from a catalog to a project, the columns remain masked when you view the data and when you refine the data in the Data Refinery tool. However, other tools in projects do not preserve masking when they access data through a connection. For example, when you load connected data in a Notebook, a DataStage flow, a dashboard, or other project tools, you access the data through a direct connection and bypass masking.
Predefined governance artifacts might not be available
If you don't see any predefined classifications or data classes, reinitialize your tenant by using the following API call:
curl -X POST "https://api.dataplatform.cloud.ibm.com/v3/glossary_terms/admin/initialize_content" -H "Authorization: Bearer $BEARER_TOKEN" -k
Add collaborators with lowercase email addresses
When you add collaborators to the catalog, enter email addresses with all lowercase letters. Mixed-case email addresses are not supported.
Object Storage connection restrictions
When you look at a Cloud Object Storage (S3 API) or Cloudant connection, the folder itself is listed as a child asset.
Multiple concurrent connection operations might fail
An error might be encountered when multiple users are running connection operations concurrently. The error message can vary.
Can't enable data protection rule enforcement after catalog creation
You cannot enable the enforcement of data protection rules after you create a catalog. To apply data protection rules to the assets in a catalog, you must enable enforcement during catalog creation.
Assets are blocked if evaluation fails
The following restrictions apply to data assets in a catalog with policies enforced: In file-based data assets that have a header, column names must be unique and can't contain a period (.) or a single quotation mark (').
If evaluation fails, the asset is blocked to all users except the asset owner. All other users see an error message that the data asset cannot be viewed because evaluation failed and the asset is blocked.
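Before adding a file-based asset to a catalog with policies enforced, you could check the header row against the column name restrictions above. The following is a minimal sketch: the function name `find_blocking_column_names` is hypothetical, the first record is assumed to be the header, and duplicates are compared exactly because the text does not say whether the comparison is case-sensitive.

```python
import csv
import io


def find_blocking_column_names(csv_text):
    """Group header names by the restriction they break."""
    # Read only the first record, which is assumed to be the header row.
    header = next(csv.reader(io.StringIO(csv_text)), [])
    problems = {"duplicate": [], "period": [], "single_quote": []}
    seen = set()
    for name in header:
        if name in seen:
            problems["duplicate"].append(name)
        seen.add(name)
        if "." in name:
            problems["period"].append(name)
        if "'" in name:
            problems["single_quote"].append(name)
    return problems


sample = "id,name,id,unit.price,owner's\n1,a,2,3.5,b\n"
print(find_blocking_column_names(sample))
```

An empty list under every key means that the header passes these three checks. It does not guarantee that policy evaluation succeeds.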
Only the data class filter in metadata enrichment results is case-sensitive
When you filter metadata enrichment results on the Column tab, only the Data class entries are case-sensitive. The entries in the Business terms, Schemas, and Assets filters are all lowercase regardless of the actual casing of the value.
Filter options in metadata enrichment results might not be updated immediately
When you add assets, assign new data classes or business terms, or remove business terms, the respective filters aren't immediately updated. As a workaround, refresh your browser to see the updated filter lists.
Enrichment details for an asset might not reflect the settings applied on latest enrichment run
After you edit the enrichment options for a metadata enrichment that was run at least once, the asset details might show the updated options instead of the options applied in the latest enrichment run.
Can't access individual pages in a metadata enrichment asset directly
If the number of assets or columns in a metadata enrichment asset spans several pages, you can't go to a specific page directly. The page number drop-down list is disabled. Use the Next page and Previous page buttons instead.
Incomplete details for an assigned data class in a column's enrichment results
When you click the assigned data class on the Governance tab of the column details in the metadata enrichment results, a preview of the data class details is shown. However, the details are incomplete.
In some cases, you might not see the full log of a metadata enrichment job run in the UI
If the list of errors in a metadata enrichment run is exceptionally long, only part of the job log might be displayed in the UI.
Workaround: Download the entire log and analyze it in an external editor.
Schema information might be missing when you filter enrichment results
When you filter assets or columns in the enrichment results on source information, schema information might not be available.
Workaround: Rerun the enrichment job and apply the Source filter again.
Issues with search on the Assets tab of a metadata enrichment asset
When you search for an asset on the Assets tab of a metadata enrichment asset, no results might be returned. Consider these limitations:
- Search is case-sensitive.
- The result contains only records that match the exact search phrase or start with the phrase.
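The two limitations above amount to a case-sensitive prefix match. The following sketch only illustrates that matching rule. It is not the service's implementation, and the function name `matches_asset_search` is hypothetical.

```python
def matches_asset_search(asset_name, phrase):
    """Return True if the name equals the phrase or starts with it.

    The comparison is case-sensitive. An exact match is the special case
    of a prefix match where the name and the phrase are the same string.
    """
    return asset_name.startswith(phrase)


print(matches_asset_search("CUSTOMER_ORDERS", "CUSTOMER"))
print(matches_asset_search("CUSTOMER_ORDERS", "customer"))
print(matches_asset_search("CUSTOMER_ORDERS", "ORDERS"))
```

Under this rule, only the first call matches: the second differs in case, and the third phrase appears in the middle of the name rather than at the start.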
Rules run on columns of type time in data assets from Amazon Redshift data source do not return proper results
For data assets from Amazon Redshift data sources, columns of type time are imported with type timestamp. You can't apply time-specific data quality rules to such columns.
Masking flow
If you use Masking flow, you might encounter these known issues and restrictions when you are privatizing data.
Masking flow jobs might fail
During a masking flow job, Spark might attempt to read all of a data source into memory. Errors might occur when there isn't enough memory to support the job. The largest volume of data that can fit into the largest deployed Spark processing node is approximately 12 GB.
Cannot save changes to the Masking character
When you create a data protection rule or edit an existing one to redact columns with a data class, you might have trouble changing the masking character from the default of X to any other character.
If you click Create (or Update) to save your rule before the new character is shown under the After column in the Example data section, your change to the Masking character does not save.
Workaround: Wait about 3 seconds until the data under the After column in the Example data section is redacted with the new masking character, and then click Create (or Update) to save the rule with the new character.
Data Refinery
If you use Data Refinery, you might encounter these known issues and restrictions when you refine data.
Personal credentials are not supported for connected data assets in Data Refinery
If you create a connected data asset with personal credentials, other users must use the following workaround in order to use the connected data asset in Data Refinery.
Workaround:
- Go to the project page, and click the link for the connected data asset to open the preview.
- Enter credentials.
- Open Data Refinery and use the authenticated connected data asset for a source or target.
Cannot view jobs in Data Refinery flows in the new projects UI
If you're working in the new projects UI, you do not have the option to view jobs from the options menu in Data Refinery flows.
Workaround: To view jobs in Data Refinery flows, open a Data Refinery flow, click the Jobs icon, and select Save and view jobs. You can view a list of all jobs in your project on the Jobs tab.
Notebook issues
You might encounter some of these issues when getting started with and using notebooks.
Duplicating a notebook doesn't create a unique name in the new projects UI
When you duplicate a notebook in the new projects UI, the duplicate notebook is not created with a unique name.
Can't create assets in older accounts
If you're working in an instance of Watson Studio that was activated before November 2017, you might not be able to create operational assets, like notebooks. If the Create button stays gray and disabled, you must add the Watson Studio service to your account from the Services catalog.
500 internal server error received when launching Watson Studio
Rarely, you may receive an HTTP internal server error (500) when launching Watson Studio. This might be caused by an expired cookie stored for the browser. To confirm the error was caused by a stale cookie, try launching Watson Studio in a private browsing session (incognito) or by using a different browser. If you can successfully launch in the new browser, the error was caused by an expired cookie. You have a choice of resolutions:
- Exit the browser application completely to reset the cookie. You must close and restart the application, not just close the browser window. Restart the browser application and launch Watson Studio to reset the session cookie.
- Clear the IBM cookies from the browsing data and launch Watson Studio. Look in the browsing data or security options in the browser to clear cookies. Note that clearing all IBM cookies may affect other IBM applications.
If the 500 error persists after performing one of these resolutions, check the status page for IBM Cloud incidents affecting Watson Studio. Additionally, you may open a support case at the IBM Cloud support portal.
Error during login
You might get this error message while trying to log in to Watson Studio: "Access Manager WebSEAL could not complete your request due to an unexpected error." Try to log in again. Usually the second login attempt works.
Manual installation of some tensor libraries is not supported
Some TensorFlow libraries are preinstalled, but if you try to install additional TensorFlow libraries yourself, you get an error.
Connection to notebook kernel is taking longer than expected after running a code cell
If you try to reconnect to the kernel and immediately run a code cell (or if the kernel reconnection happened during code execution), the notebook doesn't reconnect to the kernel and no output is displayed for the code cell. You need to manually reconnect to the kernel by clicking Kernel > Reconnect. When the kernel is ready, you can try running the code cell again.
Using the predefined sqlContext object in multiple notebooks causes an error
You might receive an Apache Spark error if you use the predefined sqlContext object in multiple notebooks. Create a new sqlContext object for each notebook. See this Stack Overflow explanation.
Connection failed message
If your kernel stops, your notebook is no longer automatically saved. To save it manually, click File > Save. You should get a Notebook saved message in the kernel information area, which appears before the Spark version. If you get a message that the kernel failed, click Kernel > Reconnect to reconnect your notebook to the kernel. If nothing you do restarts the kernel and you can't save the notebook, you can download it to save your changes by clicking File > Download as > Notebook (.ipynb). Then create a new notebook based on your downloaded notebook file.
Hyperlinks to notebook sections don't work in preview mode
If your notebook contains sections that you link to from elsewhere in the notebook, for example from an introductory section at the top, the links to these sections do not work if the notebook is opened in view-only mode in Firefox. However, if you open the notebook in edit mode, these links work.
Can't connect to notebook kernel
If you try to run a notebook and you see the message Connecting to Kernel, followed by Connection failed. Reconnecting and finally by a connection failed error message, the reason might be that your firewall is blocking the notebook from running.
If Watson Studio is installed behind a firewall, you must add the WebSocket connection wss://dataplatform.cloud.ibm.com to the firewall settings. Enabling this WebSocket connection is required when you're using notebooks and RStudio.
Machine learning issues
You might encounter some of these issues when working with machine learning tools.
Region requirements
You can only associate a Watson Machine Learning service instance with your project when the Watson Machine Learning service instance and the Watson Studio instance are located in the same region.
Accessing links if you create a service instance while associating a service with a project
While you are associating a Watson Machine Learning service with a project, you have the option of creating a new service instance. If you choose to create a new service, the links on the service page might not work. To access the service terms, APIs, and documentation, right-click the links to open them in new windows.
Federated Learning assets cannot be searched in All assets, search results, or filter results in the new projects UI
You cannot search Federated Learning assets from the All assets view, the search results, or the filter results of your project.
Workaround: Click the Federated Learning asset to open the tool.
Deployment issues
- A deployment that is inactive (no scores) for a set time (24 hours for the free plan or 120 hours for a paid plan) is automatically hibernated. When a new scoring request is submitted, the deployment is reactivated and the score request is served. Expect a brief delay of 1 to 60 seconds for the first score request after activation, depending on the model framework.
- For some frameworks, such as SPSS Modeler, the first score request for a deployed model after hibernation might result in a 504 error. If this happens, submit the request again; subsequent requests should succeed.
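The advice to resubmit a score request after a 504 error can be automated on the client side. The following is a minimal sketch under stated assumptions: `send_request` is a hypothetical callable that performs one scoring call and returns a status code and a response body, and the attempt count and wait time are illustrative defaults, not documented values.

```python
import time


def score_with_retry(send_request, max_attempts=3, wait_seconds=5.0):
    """Call send_request again while it returns HTTP 504.

    send_request is a hypothetical callable that returns (status, body).
    The last response is returned, whether or not it succeeded.
    """
    status, body = send_request()
    for _ in range(max_attempts - 1):
        if status != 504:
            break
        # Give the hibernated deployment time to reactivate.
        time.sleep(wait_seconds)
        status, body = send_request()
    return status, body


# Simulated deployment: the first call times out, the second succeeds.
responses = iter([(504, "gateway timeout"), (200, "scored")])
print(score_with_retry(lambda: next(responses), wait_seconds=0))
```

Only the 504 status is retried here, because that is the error that the text associates with hibernation. Other errors are returned to the caller unchanged.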
AutoAI known limitations
- Currently, AutoAI experiments do not support double-byte character sets. AutoAI only supports CSV files with ASCII characters. Users must convert any non-ASCII characters in the file name or content, and provide input data as a CSV as defined in this CSV standard.
- To interact programmatically with an AutoAI model, use the REST API instead of the Python client. The APIs for the Python client required to support AutoAI are not generally available at this time.
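The ASCII restriction in the first item above can be checked locally before you upload a file to an AutoAI experiment. The following is a minimal sketch. The function names are hypothetical, and the check covers only the character set, not the rest of the CSV standard.

```python
def is_ascii_only(file_name, content):
    """Return True if the file name (str) and the content (bytes) are pure ASCII."""
    return file_name.isascii() and content.isascii()


def first_non_ascii_line(content):
    """Return the 1-based number of the first line with a non-ASCII byte, or None."""
    for line_number, line in enumerate(content.splitlines(), start=1):
        if not line.isascii():
            return line_number
    return None


data = "id,name\n1,José\n".encode("utf-8")
print(is_ascii_only("customers.csv", data))
print(first_non_ascii_line(data))
```

In the sample, the accented character on the second line makes the first check fail, and the second function reports where to start converting.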
Data module not found in IBM Federated Learning
The data handler for IBM Federated Learning is trying to extract a data module from the FL library but is unable to find it. You might see the following error message:
ModuleNotFoundError: No module named 'ibmfl.util.datasets'
The issue possibly results from using an outdated DataHandler. Review and update your DataHandler to conform to the latest spec. Refer to the most recent MNIST data handler, or ensure that your sample versions are up to date.
Previewing masked data assets is blocked in deployment space
A data asset preview may fail with this message:
This asset contains masked data and is not supported for preview in the Deployment Space
Deployment spaces currently don't support masking data so the preview for masked assets has been blocked to prevent data leaks.
Cognos Dashboard Embedded issues
You might encounter some of these issues when working with Cognos Dashboard Embedded.
CSV files containing duplicate column names are not supported
Cognos Dashboard Embedded does not support CSV files that contain duplicate column names. Duplicates are case-insensitive. For example, BRANCH_NAME, branch_name, and Branch_Name are considered duplicate column names.
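Because the comparison is case-insensitive, an exact-match check can miss duplicates such as the three names above. The following sketch groups column names by their lowercase form. The function name `find_case_insensitive_duplicates` is hypothetical, and lowercasing is an assumption about how the product compares names.

```python
from collections import defaultdict


def find_case_insensitive_duplicates(column_names):
    """Return groups of column names that are equal when case is ignored."""
    groups = defaultdict(list)
    for name in column_names:
        groups[name.lower()].append(name)
    # Keep only the groups that contain more than one spelling.
    return [names for names in groups.values() if len(names) > 1]


columns = ["BRANCH_NAME", "branch_name", "Branch_Name", "REVENUE"]
print(find_case_insensitive_duplicates(columns))
```

Renaming all but one name in each returned group removes the conflict.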
Cognos dashboards can only use data connections created with username and password credentials
Cognos Dashboard Embedded requires that database connections and connected data assets added as data sources to a dashboard include username and password credentials.
If these credentials are not specified in the connection and a token or API key is used instead, then Cognos Dashboard Embedded cannot use that connection or connected data asset as a data source.
Incorrect data type shown for refined data assets
After you import a CSV file, if you click on the imported file in the data asset overview page, types of some columns might not show up correctly. For example, a dataset of a company report with a column called Revenue that contains the revenue of the company might show up as type String, instead of a number-oriented data type that is more logical.
Unsupported special characters in CSV files
The source CSV file name can contain non-alphanumeric characters. However, the CSV file name can't contain the special characters / : & < . \ ". If the file name contains these characters, they are removed from the table name.
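To anticipate how a name changes when these characters are removed, you could strip them yourself. This is a minimal sketch with a hypothetical function name. The text does not say how the file extension is handled, so the function works on whatever string you pass in and makes no claim about the final table name.

```python
# Characters listed in the text above as unsupported in CSV file names.
UNSUPPORTED_CHARACTERS = '/:&<.\\"'


def strip_unsupported_characters(name):
    """Remove every unsupported character from name."""
    return "".join(ch for ch in name if ch not in UNSUPPORTED_CHARACTERS)


print(strip_unsupported_characters("q1:sales&returns"))
```

Two different file names can collapse to the same stripped name, so it can be safer to rename files before upload than to rely on the removal.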
String values in CSV files are limited to 128 characters
String values in a column in your source CSV file can be only 128 characters long. If your CSV file has string columns with values that are longer, an error message is displayed.
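You can scan a CSV file for values that exceed the 128-character limit before using it in a dashboard. The following is a minimal sketch: the function name `find_long_values` is hypothetical, the first record is assumed to be the header, and length is counted in Python characters, which might differ from how the product counts.

```python
import csv
import io

MAX_STRING_LENGTH = 128


def find_long_values(csv_text, limit=MAX_STRING_LENGTH):
    """Return (record_number, column_name, length) for each value over the limit."""
    reader = csv.DictReader(io.StringIO(csv_text))
    too_long = []
    # Record 1 is the header, so data records are numbered from 2.
    for record_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if isinstance(value, str) and len(value) > limit:
                too_long.append((record_number, column, len(value)))
    return too_long


sample = "id,comment\n1,short\n2," + "x" * 129 + "\n"
print(find_long_values(sample))
```

The `isinstance` check skips missing values and any surplus fields that `csv.DictReader` collects in a list.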
Date format limitations in CSV files
There are date format limitations for CSV files used in visualizations. For details, see Resolving problems when using data from CSV files in Cognos Dashboard Embedded.
Can't replace a data table in a visualization
When you add a visualization to a dashboard, you cannot add a data table to the visualization if you previously added (and then removed) data fields from another data table. This restriction applies to Db2, CSV tables, and other data sources.
Cognos Analytics features that are not supported
The following functionality from IBM Cognos Analytics is not supported in dashboards:
- Data grouping
- Custom color palettes
- Custom visualizations
- Assistant
- Forecasting
- Insights in visualization
- Jupyter notebook visualization
- Advanced data analytics
Watson OpenScale issues
You might encounter the following issues in Watson OpenScale:
Drift configuration is started but never finishes
Drift configuration is started but never finishes and continues to show the spinner icon. If the spinner runs for more than 10 minutes, the system might be left in an inconsistent state. Workaround: Edit the drift configuration, and then save it. The system might come out of this state and complete the configuration. If drift reconfiguration does not rectify the situation, contact IBM Support.
SPSS Modeler issues
You might encounter some of these issues when working in SPSS Modeler.
SPSS Modeler runtime restrictions
Watson Studio does not include SPSS functionality in Peru, Ecuador, Colombia and Venezuela.
Error when trying to stop a running flow
When running an SPSS Modeler flow, you might encounter an error if you try to stop the flow from the Environments page under your project's Manage tab. To completely stop the SPSS Modeler runtime and CUH consumption, close the browser tabs where you have the flow open.
Imported Data Asset Export nodes sometimes fail to run
When you create a new flow by importing an SPSS Modeler stream (.str file), migrate the export node, and then run the resulting Data Asset Export node, the run may fail. To work around this issue: rerun the node, change the output name and change the If the data set already exists option in the node properties, then run the node again.
Data preview may fail if table metadata has changed
In some cases, when using the Data Asset import node to import data from a connection, data preview may return an error if the underlying table metadata (data model) has changed. Recreate the Data Asset node to resolve the issue.
Unable to view output after running an Extension Output node
When running an Extension Output node with the Output to file option selected, the resulting output file returns an error when you try to open it from the Outputs panel.
Unable to preview Excel data from IBM Cloud Object Storage connections
Currently, you can't preview .xls or .xlsx data from an IBM Cloud Object Storage connection.
Numbers interpreted as a string
Any number with a precision greater than or equal to 32 and a scale equal to 0 is interpreted as a string. To change this behavior, use a Filler node to cast the field to a real number with the expression to_real(@FIELD).
SuperNode containing Import nodes
If your flow has a SuperNode that contains an Import node, the input schema may not be set correctly when you save the model with the Scoring branch option. To work around this issue, expand the SuperNode before saving.
Exporting to a SAV file
When using the Data Asset Export node to export to an SPSS Statistics SAV file (.sav), the Replace data asset option won't work if the input schema doesn't match the output schema. The schema of the existing file you want to replace must match.
Migrating Import nodes
If you import a stream (.str) to your flow that was created in SPSS Modeler desktop and contains one or more unsupported Import nodes, you'll be prompted to migrate the Import nodes to data assets. If the stream contains multiple Import nodes that use the same data file, then you must first add that file to your project as a data asset before migrating because the migration can't upload the same file to more than one Import node. After adding the data asset to your project, reopen the flow and proceed with the migration using the new data asset.
Text Analytics settings aren't saved
After closing the Text Analytics Workbench, any filter settings or category build settings you modified aren't saved to the node as they should be.
Merge node and unicode characters
The Merge node treats the following very similar Japanese characters as the same character.
Watson Pipelines known issues
These issues pertain to Watson Pipelines.
Nesting loops more than 2 levels can result in pipeline error
Nesting loops more than 2 levels can result in an error when you run the pipeline, such as Error retrieving the run. Reviewing the logs can show an error such as text in text not resolved: neither pipeline_input nor node_output.
If you are looping with output from a Bash script, the log might list an error like this: PipelineLoop can't be run; it has an invalid spec: non-existent variable in $(params.run-bash-script-standard-output). To resolve the problem, do not nest loops more than 2 levels.
Asset browser does not always reflect count for total numbers of asset type
When selecting an asset from the asset browser, such as choosing a source for a Copy node, you see that some of the assets list the total number of that asset type available, but notebooks do not. That is a current limitation.
Cannot delete pipeline versions
Currently, you cannot delete saved versions of pipelines that you no longer need.
Deleting an AutoAI experiment fails under some conditions
Using a Delete AutoAI experiment node to delete an AutoAI experiment that was created from the Projects UI does not delete the AutoAI asset. However, the rest of the flow can complete successfully.
Cache appears enabled but is not enabled
If the Copy assets Pipelines node's Copy mode is set to Overwrite, cache is displayed as enabled but remains disabled.
Watson Pipelines limitations
These limitations apply to Watson Pipelines.
- Single pipeline limits
- Limitations by configuration size
- Input and output size limits
- Batch input limited to data assets
Single pipeline limits
These limitations apply to a single pipeline, regardless of configuration.
- Any single pipeline cannot contain more than 120 standard nodes
- Any pipeline with a loop cannot contain more than 600 nodes across all iterations (for example, 60 iterations with 10 nodes each)
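The two single-pipeline limits above reduce to simple arithmetic that you can check while designing a pipeline. The following is a minimal sketch. The function name and parameters are hypothetical, and it assumes that the loop total is the number of iterations multiplied by the nodes in each iteration, as in the example.

```python
MAX_STANDARD_NODES = 120
MAX_LOOP_NODES = 600


def check_single_pipeline(standard_nodes, loop_iterations=0, nodes_per_iteration=0):
    """Return a list of limit violations for one pipeline (empty if none)."""
    violations = []
    if standard_nodes > MAX_STANDARD_NODES:
        violations.append(
            f"{standard_nodes} standard nodes exceed the limit of {MAX_STANDARD_NODES}"
        )
    loop_nodes = loop_iterations * nodes_per_iteration
    if loop_nodes > MAX_LOOP_NODES:
        violations.append(
            f"{loop_nodes} loop nodes exceed the limit of {MAX_LOOP_NODES}"
        )
    return violations


print(check_single_pipeline(100, loop_iterations=60, nodes_per_iteration=10))
print(check_single_pipeline(121, loop_iterations=61, nodes_per_iteration=10))
```

The first call is within both limits and returns an empty list. The second call exceeds both and returns two messages.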
Limitations by configuration size
Small configuration
A SMALL configuration supports 600 standard nodes (across all active pipelines) or 300 nodes run in a loop. For example:
- 30 standard pipelines with 20 nodes run in parallel = 600 standard nodes
- 3 pipelines containing a loop with 10 iterations and 10 nodes in each iteration = 300 nodes in a loop
Medium configuration
A MEDIUM configuration supports 1200 standard nodes (across all active pipelines) or 600 nodes run in a loop. For example:
- 30 standard pipelines with 40 nodes run in parallel = 1200 standard nodes
- 6 pipelines containing a loop with 10 iterations and 10 nodes in each iteration = 600 nodes in a loop
Large configuration
A LARGE configuration supports 4800 standard nodes (across all active pipelines) or 2400 nodes run in a loop. For example:
- 80 standard pipelines with 60 nodes run in parallel = 4800 standard nodes
- 24 pipelines containing a loop with 10 iterations and 10 nodes in each iteration = 2400 nodes in a loop
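The per-configuration examples above follow the same arithmetic: pipelines multiplied by nodes, or pipelines multiplied by iterations and nodes in each iteration. The following sketch reproduces those totals and compares them with the stated limits. The names are hypothetical, and treating the standard and loop limits as two independent caps is an assumption, because the text does not say how they combine.

```python
# (standard nodes across all active pipelines, nodes run in a loop)
CONFIGURATION_LIMITS = {
    "SMALL": (600, 300),
    "MEDIUM": (1200, 600),
    "LARGE": (4800, 2400),
}


def standard_node_total(pipelines, nodes_per_pipeline):
    """Total standard nodes across pipelines that run in parallel."""
    return pipelines * nodes_per_pipeline


def loop_node_total(pipelines, iterations, nodes_per_iteration):
    """Total nodes run in loops across pipelines."""
    return pipelines * iterations * nodes_per_iteration


def within_configuration(size, standard_nodes=0, loop_nodes=0):
    """Check both totals against the limits for the named configuration."""
    standard_limit, loop_limit = CONFIGURATION_LIMITS[size.upper()]
    return standard_nodes <= standard_limit and loop_nodes <= loop_limit


print(standard_node_total(30, 20), within_configuration("SMALL", standard_nodes=600))
print(loop_node_total(24, 10, 10), within_configuration("LARGE", loop_nodes=2400))
```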
Input and output size limits
Input and output values, which include pipeline parameters, user variables, and generic node inputs and outputs, cannot exceed 10 KB of data.
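A value can be measured before it is passed to a pipeline. The following is a minimal sketch with hypothetical names. It assumes that 10 KB means 10 x 1024 bytes and that size is measured on the UTF-8 encoding of the value, with non-string values serialized as JSON. The text does not confirm either assumption.

```python
import json

MAX_VALUE_BYTES = 10 * 1024  # assumption: 1 KB = 1024 bytes


def value_size_in_bytes(value):
    """Return the UTF-8 size of a string, or of the JSON form of other values."""
    text = value if isinstance(value, str) else json.dumps(value)
    return len(text.encode("utf-8"))


def exceeds_limit(value):
    """Return True if the value is larger than the assumed 10 KB limit."""
    return value_size_in_bytes(value) > MAX_VALUE_BYTES


print(value_size_in_bytes({"threshold": 0.8, "labels": ["a", "b"]}))
print(exceeds_limit("a" * 20000))
```

Measuring bytes rather than characters matters for non-ASCII text, where one character can take several bytes.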
Batch input limited to data assets
Currently, input for batch deployment jobs is limited to data assets. This means that certain types of deployments, which require JSON input or multiple files as input, are not supported. For example, SPSS models and Decision Optimization solutions that require multiple files as input are not supported.
Issues with Cloud Object Storage
These issues apply to working with Cloud Object Storage.
Issues with Cloud Object Storage when Key Protect is enabled
Key Protect in conjunction with Cloud Object Storage is not supported for working with Watson Machine Learning assets. If you are using Key Protect, you might encounter these issues when you are working with assets in Watson Studio.
- Training or saving these Watson Machine Learning assets might fail:
- AutoAI
- Federated Learning
- Watson Pipelines
- You might be unable to save an SPSS model or a notebook model to a project