Governing virtual data in Watson Query

Watson Query can integrate with IBM Knowledge Catalog to govern the virtual data that you publish to governed catalogs. Data governance involves applying business context, data policies, and data protection rules to your virtual data.

Before you begin, review the end-to-end process.

  1. Understand catalogs. See Catalogs.
  2. Choose a method for publishing your assets to a catalog. See Methods for publishing assets to the catalog.
  3. Understand the ownership of an asset that is published to a catalog. See Virtual object owner and data asset owner.
  4. Decide whether to use business terms or data protection rules. See Applying business terms and data protection rules.

Catalogs

IBM Knowledge Catalog is a secure enterprise data catalog management platform. With IBM Knowledge Catalog, you use catalogs to easily find and share your data assets. A catalog is a way to organize, label, and search for data assets. An asset in a catalog consists of metadata about a data asset. Data protection rules are enforced only on data that is published or added to a catalog. For more information about catalogs and data assets, see Catalogs.

Watson Query can use multiple catalogs. There are two publishing methods: the enforced publishing method and the standard publishing method.
  • The enforced publishing method is normally used for control and governance of assets.
  • The standard publishing method is used to share virtualized data for easier collaboration.
If you have the Watson Query Manager or Engineer role, you can publish virtual data to a governed catalog automatically. To automatically publish virtual data, enable Enforce publishing to a governed catalog in the Service settings > Governance page. To publish automatically to a governed catalog, your instance of Watson Query must be provisioned in the same account as your instance of IBM Knowledge Catalog. For more information, see Publishing virtual data to the catalog in Watson Query.

Methods for publishing assets to the catalog

In Watson Query, you can use two methods to publish assets to a catalog. You can choose to enforce publishing of all assets to a primary catalog or you can allow users to choose to publish to any catalog that they have the Manager or Editor role for.

Enforced publishing method

To enforce publishing to a primary catalog, a Watson Query Manager must enable Enforce publishing to a governed catalog in Service settings > Governance and choose a primary catalog. All virtualized objects that are created with the user interface are then published to the primary catalog automatically, and users cannot choose the catalog that they publish to when they virtualize data.

To change a primary catalog, a Watson Query Manager must satisfy the following requirements:

  • They must be a Manager on the current primary catalog.
  • They must be a Manager on the newly selected primary catalog.
Note: If you enforce publishing to a primary catalog, the service ID is added as a collaborator on the catalog in the background. The service ID performs the automatic publishing, so it takes up one catalog collaborator spot from your plan quota.

Do not remove this service ID from the catalog. It is required for automatic publishing to the primary catalog. The service ID will appear as Unavailable user in your selected primary catalog collaborators list and will have the Manager role assigned.
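
The role requirements for changing the primary catalog can be summarized as a small check. The following Python sketch is only a conceptual model of the rules stated in this section; the function name, its parameters, and the role dictionary are hypothetical and are not part of any Watson Query or IBM Knowledge Catalog API.

```python
def can_change_primary_catalog(service_role, catalog_roles, current_primary, new_primary):
    """Model the documented requirements for changing the primary catalog.

    service_role    -- the user's Watson Query role (hypothetical representation)
    catalog_roles   -- dict mapping catalog name to the user's role on that catalog
    current_primary -- name of the current primary catalog
    new_primary     -- name of the newly selected primary catalog
    """
    # Only a Watson Query Manager can change the primary catalog.
    if service_role != "Manager":
        return False
    # The user must be a Manager on both the current and the new primary catalog.
    return all(
        catalog_roles.get(catalog) == "Manager"
        for catalog in (current_primary, new_primary)
    )


roles = {"Enterprise": "Manager", "Sandbox": "Editor", "Finance": "Manager"}
print(can_change_primary_catalog("Manager", roles, "Enterprise", "Finance"))  # True
print(can_change_primary_catalog("Manager", roles, "Enterprise", "Sandbox"))  # False
```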

Standard publishing method

If publishing to a primary catalog is not enforced, a user can choose to publish to any catalog that they have the Manager or Editor role for. The user can choose the catalog from the drop-down list on the Virtualize page.
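
To make the difference between the two publishing methods concrete, the following Python sketch models which catalogs a user can publish to under each method. It is an illustration under stated assumptions, not product code: `GovernanceSettings`, `publish_targets`, and the role dictionary are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Catalog roles that allow publishing, as described in this section.
PUBLISHER_ROLES = {"Manager", "Editor"}


@dataclass
class GovernanceSettings:
    # Hypothetical stand-in for the "Enforce publishing to a governed catalog" setting.
    enforce_publishing: bool = False
    primary_catalog: Optional[str] = None


def publish_targets(settings, user_catalog_roles):
    """Return the catalogs that a user can publish to, as a sorted list of names."""
    if settings.enforce_publishing:
        if settings.primary_catalog is None:
            raise ValueError("Enforced publishing requires a primary catalog")
        # Enforced method: every asset goes to the primary catalog.
        return [settings.primary_catalog]
    # Standard method: any catalog where the user is a Manager or Editor.
    return sorted(
        catalog
        for catalog, role in user_catalog_roles.items()
        if role in PUBLISHER_ROLES
    )


roles = {"Finance": "Editor", "Marketing": "Viewer", "Sales": "Manager"}
print(publish_targets(GovernanceSettings(), roles))                    # ['Finance', 'Sales']
print(publish_targets(GovernanceSettings(True, "Enterprise"), roles))  # ['Enterprise']
```

In this model, the user's own catalog roles matter only for the standard method, because under the enforced method the service ID performs the publishing.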

For more information, see Publishing virtual data to the catalog in Watson Query.

Virtual object owner and data asset owner

When a virtual object is published to a catalog, this object becomes represented by a data asset in IBM Knowledge Catalog. There is a difference between virtual object owners and data asset owners:
Virtual object owner
The user who created the virtual object in Watson Query.
Data asset owner
The user who owns the asset that represents a virtual object in a catalog. This user is not always the virtual object owner:
  • A user might choose not to publish a Watson Query object when it is virtualized, or the object might be created by a method that does not automatically attempt to publish it, such as running SQL to create a view. If the object is then shared with other users and one of them publishes it, that user becomes the asset owner instead of the original object creator.
  • The asset owner can also be changed in the catalog.
Asset owners are exempt from IBM Knowledge Catalog data protection rules and policies.
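
The ownership distinction can be sketched as a small data model. The Python below is a conceptual illustration only: `VirtualObject`, `publish`, and `is_exempt_from_protection_rules` are hypothetical names, and the sketch assumes that the first user to publish an object becomes its asset owner, as in the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualObject:
    name: str
    object_owner: str                  # user who created the virtual object
    asset_owner: Optional[str] = None  # set when the object is published to a catalog


def publish(obj, publishing_user):
    """Publish the object; the publishing user becomes the asset owner.

    Assumption: publishing an already published object does not change its owner.
    """
    if obj.asset_owner is None:
        obj.asset_owner = publishing_user
    return obj


def is_exempt_from_protection_rules(obj, user):
    """Asset owners are exempt from data protection rules and policies."""
    return obj.asset_owner is not None and user == obj.asset_owner


# "alice" creates a view with SQL, so it is not published automatically.
view = VirtualObject(name="SALES_VIEW", object_owner="alice")
# "bob", who has access to the shared object, publishes it.
publish(view, "bob")
print(view.asset_owner)                                # bob
print(is_exempt_from_protection_rules(view, "alice"))  # False
print(is_exempt_from_protection_rules(view, "bob"))    # True
```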

Applying business terms and data protection rules

You can create virtual tables in Watson Query from existing data assets that have business term assignments. Watson Query can use business terms that are assigned to tables in the catalog to rename tables and columns while these tables are being virtualized.

Note: Review the limitations of your data sources that might impact business term assignment. See Supported data sources in Watson Query.

A catalog data asset contains a set of properties that includes business terms and tags. After your virtual data is in a catalog, you can:

  • Assign business terms, data classes, and tags that are authored in IBM Knowledge Catalog to tables and columns, and thus form a logical structure of your virtual data.

    For more information, see Virtualizing data with business terms.

  • Use data protection rules to deny access to your virtual data or mask it. These data protection rules can be based on the assigned tags and business terms. For more information, see Data protection rules.
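
As a rough illustration of how rules that are based on tags and business terms might deny or mask data, the following Python sketch evaluates a simplified rule set against one row. This is not how IBM Knowledge Catalog implements rule enforcement: `ProtectionRule`, `apply_rules`, the label sets, and the redaction with `X` characters are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionRule:
    name: str
    triggers: frozenset  # tags or business terms that activate the rule
    action: str          # "deny" or "mask"


def apply_rules(row, column_labels, rules, is_asset_owner=False):
    """Return the row as a user would see it, or None if access is denied.

    row           -- dict mapping column name to value
    column_labels -- dict mapping column name to a set of assigned tags and terms
    """
    # Asset owners are exempt from data protection rules.
    if is_asset_owner:
        return dict(row)
    all_labels = set().union(*column_labels.values())
    # A matching deny rule blocks access to the whole row in this simplified model.
    for rule in rules:
        if rule.action == "deny" and rule.triggers & all_labels:
            return None
    # A matching mask rule redacts each column that carries a trigger label.
    result = dict(row)
    for rule in rules:
        if rule.action == "mask":
            for column, labels in column_labels.items():
                if rule.triggers & labels and column in result:
                    result[column] = "X" * len(str(result[column]))
    return result


row = {"customer": "Ana", "email": "ana@example.com"}
labels = {"customer": set(), "email": {"PII"}}
mask_pii = ProtectionRule("Mask PII", frozenset({"PII"}), "mask")

print(apply_rules(row, labels, [mask_pii]))                       # email is redacted
print(apply_rules(row, labels, [mask_pii], is_asset_owner=True))  # row is unchanged
```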