Limitations and known issues in Data Virtualization
Last updated: Nov 26, 2024

The following limitations and known issues apply to Data Virtualization.

Governing data

Access control issues when you preview assets with masked or filtered data

When you preview, download, or refine Data Virtualization data assets in Watson services other than Data Virtualization in Cloud Pak for Data (for example, IBM Knowledge Catalog, Watson Studio, and Data Refinery), and data masking or row-level filtering applies, the preview is subject only to the data protection rules and to catalog or project access control. Data Virtualization access controls are not enforced.

In this case, the Data Virtualization internal access controls, which you set by using Manage access in the Data Virtualization UI, do not apply to the preview. To control access in the other Watson services, you must define rules that manage access to the catalogs, projects, data assets, or connections.

Automatic publishing of virtual objects to the catalog is limited to certain objects

Only objects that are created in the user interface are automatically published to the catalog. Objects that are created using SQL are not published automatically and must be published to the catalog manually or by using the API.

Cannot see column business terms for virtual object

You are virtualizing a table in Data Virtualization and you want to see the list of business term assignments on the Virtualize page. However, in the default virtualization mode, you cannot see any column term assignments, and in the strict virtualization mode, you cannot see the table itself (table A in the following example) on the Virtualize page.

You might encounter this issue when column business terms are assigned multiple times in a governed catalog. For example, you add a data asset for table A with its table and column term assignments in a governed catalog. Then, in the same governed catalog, you add a data asset for the same table A with its table and column term assignments. As a result, you might encounter this issue in Data Virtualization.

To avoid this issue in the default virtualization mode, don't assign column business terms multiple times in a governed catalog.

Access to a table is denied by policies

You cannot access a table even though, according to the data policies and authorizations, you are authorized to access it. This issue occurs only if IBM Knowledge Catalog policy enforcement is enabled in Data Virtualization.

To solve this issue, see Access to a table is denied by policies in Data Virtualization.

Do not use duplicate assets for the same table

The policy service cannot decide which of the duplicated assets to use for policy enforcement and does not aggregate the rules. Avoid duplicate assets across governed catalogs because duplicates might lead to issues with policy enforcement behavior in Data Virtualization.

Cannot access assets in the catalog

When you try to access Data Virtualization assets in IBM Knowledge Catalog, the access is denied.

To solve this issue, see Cannot access assets in the catalog in Data Virtualization.

Cannot enforce policies and data protection rules

You enabled policy enforcement but policies and data protection rules are not being enforced in Data Virtualization.

To solve this issue, see Cannot enforce policies and data protection rules in Data Virtualization.

Profiling of data assets in Data Virtualization fails

When you try to profile catalog assets from Data Virtualization in IBM Knowledge Catalog, you might see a SCAPIException:CDICO0103E message. The message indicates that you are not authorized: Connection authorization failure occurred.

Ensure that all prerequisite setup steps are completed to authorize the IBM Knowledge Catalog service to access data in your Data Virtualization instance. See Profiling catalog assets fails with SCAPIException: CDICO0103E message in Data Virtualization.

Cannot publish data to data science notebooks in Watson Studio

Publishing data to data science notebooks in Watson Studio is not supported.

Data sources

Japanese column names are not displayed correctly

When you virtualize JSON files with Japanese data on IBM® Cloud Object Storage, the Japanese column names might be translated to hex values. The allownonalphanumeric option can be used to resolve this issue. However, the allownonalphanumeric option is disabled by default and you must contact IBM® Cloud support to open a ticket to have the option enabled.

Cannot connect to Generic S3 or Microsoft Azure Data Lake Storage

These connection types appear in the user interface when you click Data > Data virtualization > Add connection > New connection. However, these connection types are not supported.

Cannot connect to a data source with a Generic JDBC connection

Connecting to an unsupported data source by creating a Generic JDBC connection is not supported.

Virtualizing data

Tables in a MongoDB data source might be missing when you virtualize

When you create a connection to MongoDB, you only see tables that were created in the MongoDB data source before the connection was added.

For example, if you have 10 tables in your MongoDB data source when you create a connection, you see 10 tables when you start to virtualize the table. If a user adds new tables into the MongoDB data source after the connection is added and before you click Virtualize, Data Virtualization won't display the new tables under the Virtualize tab.

Workaround: To see recently added MongoDB tables, delete the connection to MongoDB and recreate the connection.

Cannot assign a join view to a data request

The data request workflow is not supported.

Cannot create a virtualized table from files on remote data sources

Creating virtualized tables from files such as CSV, TSV, and Excel files on remote data sources by using a remote connector is not supported. You can create a virtualized table from files in IBM Cloud Object Storage. For more information, see Creating a virtualized table from files in Cloud Object Storage in Data Virtualization.

Connections

Accessing Data Virtualization using a URL displays an error

When you attempt to access Data Virtualization using a URL rather than through the Cloud Pak for Data home page, the resulting page displays the error message "The data cannot be displayed."

Workaround: Log in to Cloud Pak for Data and then navigate to Data > Data virtualization.

Personal credentials are not supported in data source connections from Data Virtualization

When you create connections from Data Virtualization to data sources, you can use shared credentials only. Personal credentials are not supported.

Service level connections that are deleted must be manually removed from the Platform connections page

If you add a service level data source connection on the Data virtualization > Data sources page, that connection also appears on the Platform connections page. Later, if you click Remove to delete the service level connection, the connection remains on the Platform connections page. You must manually remove the connection from the Platform connections page to completely remove the data source connection.

Service level connections must be updated from the same place that they were added

If you add a service level data source connection on the Data virtualization > Data sources page, you must update the connection from the same place. Any updates that are made to the connection on the Platform connections page are not reflected in the service level connection.

Query fails due to unexpectedly closed connection to data source

When your instance runs a continuous workload against virtual tables from a particular data source, Data Virtualization does not deactivate the connection pool for that data source. Instead, Data Virtualization waits for a period of complete inactivity before it deactivates the connection pool. During this waiting period, connections in the pool can become stale and get closed by the data source service, which leads to query failures.

Workaround: Check the properties for persistent connection (keep-alive parameter) for your data sources. You can try two workarounds:

  • Consider disabling the keep-alive parameter inside any data sources that receive continuous workload from Data Virtualization.
  • You can also decrease the settings for corresponding Data Virtualization properties, RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC, as shown in the following examples: 

    CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC', '10', '', ?, ?);    -- default 20s, minimum 5s
    CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC', '30', '', ?, ?);    -- default 120s, minimum 5s

    Decreasing the RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC settings might help if there are small gaps of complete inactivity that were previously too short for the Data Virtualization shrink and deactivate timeouts to take effect.
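
If the lower values do not reduce the query failures, you might want to set both properties back. The following sketch reuses the same procedure call and assumes that the defaults noted in the example comments (20 and 120 seconds) still apply to your release. Verify the defaults for your version before you run it.

```sql
-- Sketch only: restores the timeouts to the default values that are
-- stated in the example comments (assumed to be current for your release).
CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC', '20', '', ?, ?);        -- assumed default: 20s
CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC', '120', '', ?, ?);   -- assumed default: 120s
```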

Users and groups

When you add or edit a user in User management, the role might not be granted successfully because of a timeout

When the user logs in to Data Virtualization, the user interface shows a message that indicates that the user is locked. For example, Your user ID "dv_ibmid_270000ead8" is locked. To unlock this account, click unlock or go to User management and click Unlock in the overflow menu. You cannot resolve this issue by unlocking the user in the user interface. A Data Virtualization Manager must grant the role to the user manually. For example, run the following command:

    db2 grant role dv_admin to dv_ibmid_270000ead8
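
After the grant, it can be useful to confirm that the role is now recorded for the user. The following query is a minimal sketch, not a documented Data Virtualization procedure. It assumes that the standard Db2 catalog view SYSCAT.ROLEAUTH is queryable from your Data Virtualization connection and that the user and role names were stored in uppercase, as Db2 does for unquoted identifiers.

```sql
-- Sketch only: lists the role grant for the example user.
-- Assumptions: SYSCAT.ROLEAUTH is accessible to you, and the
-- identifiers from the grant command were folded to uppercase.
SELECT GRANTEE, GRANTEETYPE, ROLENAME
  FROM SYSCAT.ROLEAUTH
 WHERE GRANTEE = 'DV_IBMID_270000EAD8'
   AND ROLENAME = 'DV_ADMIN';
```

If the query returns no rows, the role was likely not granted, and the grant command might need to be run again.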