Governing virtual data with data protection rules in Data Virtualization
Last updated: Nov 26, 2024
Data protection rules in Data Virtualization

You can govern your virtual data by defining data protection rules.

Before you begin

These instructions assume that you completed the following prerequisites.

About this task

Data protection rules specify which data to control and how to control it, for example, by denying access or masking data. You can add data protection rules to policies and enforce these policies in Data Virtualization.

You can use the following types of data protection rules:

Deny of access
Deny of access prevents users from accessing any of the data in a Data Virtualization asset. For example, if a data steward does not want to expose an asset to a particular user, they can define this rule with a condition that matches that user's username.
Data masking
Data masking hides sensitive data but still allows users to use the asset. There are three types of data masking rules: redact, substitute, and obfuscate. Choose the type based on how the data is used in the upstream application.
  • Redact replaces all or a subset of characters in a data cell.
  • Substitute replaces data with the salted hashes of the original values. This method is the most likely to maintain referential integrity.
  • Obfuscate replaces data with formatted values that are similar to the original data.
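
The three masking types can be illustrated with a minimal Python sketch. It is not how Data Virtualization or IBM Knowledge Catalog implements masking; the function names, the mask character, the SHA-256 hash, and the seeded random replacement are all assumptions chosen for illustration.

```python
import hashlib
import random
import string


def redact(value: str, keep_last: int = 0, mask_char: str = "X") -> str:
    """Replace all characters, or all but the last `keep_last`, with mask_char."""
    if keep_last <= 0:
        return mask_char * len(value)
    head = len(value) - min(keep_last, len(value))
    return mask_char * head + value[head:]


def substitute(value: str, salt: bytes) -> str:
    """Replace the value with a salted hash (SHA-256 is an assumption).

    Equal inputs with the same salt give equal outputs, so joins on the
    masked column still line up.
    """
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def obfuscate(value: str, seed: int = 0) -> str:
    """Replace letters and digits with random ones of the same kind.

    Punctuation and length are preserved, so the result keeps the format
    of the original value.
    """
    rng = random.Random(f"{seed}:{value}")
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)
    return "".join(out)
```

In this sketch, `substitute` is the only function that maps the same input to the same output across tables, provided the salt is shared, which is why substitution is the type most likely to maintain referential integrity.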
Row-level filtering

You can create data protection rules to include or exclude rows in your virtualized data to limit the rows that users can see. For example, you can define a rule so that employees can see customer data that is associated only with their department.

You can apply filter criteria to include or exclude rows.
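
As a rough illustration of include and exclude filtering, the following Python sketch applies a criterion to in-memory rows. The row layout, the `department` column, and the `filter_rows` helper are hypothetical; real rules are defined in IBM Knowledge Catalog, not in application code.

```python
from typing import Callable, Iterable

Row = dict[str, object]


def filter_rows(
    rows: Iterable[Row],
    matches: Callable[[Row], bool],
    include: bool = True,
) -> list[Row]:
    """Keep rows that match the criterion (include) or drop them (exclude)."""
    return [row for row in rows if matches(row) == include]


# Hypothetical customer data and a rule for a user in the "sales" department.
customers = [
    {"name": "Acme", "department": "sales"},
    {"name": "Globex", "department": "support"},
    {"name": "Initech", "department": "sales"},
]

visible = filter_rows(customers, lambda row: row["department"] == "sales")
hidden_from_sales = filter_rows(
    customers, lambda row: row["department"] == "sales", include=False
)
```

Here `visible` holds only the rows for the user's own department, which corresponds to the example of employees seeing customer data for their department only.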

You cannot apply data masking and row filtering rules to views directly. The result set of a view is masked according to the data protection rules that apply to the objects that are referenced by the view. You can filter rows or mask identifying details from tables that are referenced in the view definition.

Access to the tables that are referenced in row-level filter expressions is not evaluated, and data masking rules on those tables are not applied.

Row filtering rules that apply to Data Virtualization assets and reference other assets must reference Data Virtualization assets only. If you query an object and the row filtering rules reference assets that are not Data Virtualization assets, the query fails with the following error:

The statement failed because a Big SQL component encountered an error. Component receiving the error: "SCHEDULER". Component returning the error: "SCHEDULER". Log entry identifier: "[SCL-0-<log_entry_id>]".. SQLCODE=-5105, SQLSTATE=58040

You can confirm the cause of the error by running the following query:

select line from table(syshadoop.log_entry('SCL-0-<log_entry_id_from_error>'))
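
If you handle this error in a script, the following Python sketch extracts the log entry identifier from the error text and builds the query shown above. The helper name `log_entry_query` and the sample identifier are hypothetical, and the assumption that the identifier contains only letters and digits after `SCL-0-` is not confirmed by the documentation.

```python
import re

# Assumption: the identifier after "SCL-0-" consists of letters and digits only.
LOG_ID_PATTERN = re.compile(r"\[(SCL-0-[A-Za-z0-9]+)\]")


def log_entry_query(error_message: str) -> str:
    """Build the syshadoop.log_entry query for the identifier in the error text."""
    match = LOG_ID_PATTERN.search(error_message)
    if match is None:
        raise ValueError("no SCL log entry identifier found in the error message")
    return f"select line from table(syshadoop.log_entry('{match.group(1)}'))"


# Hypothetical error text; a real identifier comes from your own failed query.
sample_error = (
    'Component returning the error: "SCHEDULER". Log entry identifier: '
    '"[SCL-0-1a2b3c4d5]".. SQLCODE=-5105, SQLSTATE=58040'
)
query = log_entry_query(sample_error)
```

Restricting the pattern to letters and digits also keeps unexpected characters out of the generated SQL string.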

For more information, see Filtering rows in data protection rules.

Important:

Data Virtualization access control is not applied when data masking or row-level filtering applies to a preview in Watson services other than Data Virtualization. That is, the internal access controls that you set by using Manage access in the Data Virtualization UI do not govern those previews. To control access in the other Watson services, you must define rules that manage access to the catalogs, projects, data assets, or connections.

When you publish virtualized data assets to a catalog, they are treated like any other data asset and are subject to data protection rules. Data protection rules can deny access to assets or mask their data based on criteria that can include governance artifacts, such as business terms and data classes.

Procedure

To govern your virtual data with data protection rules:

  1. Virtualize your data and publish it to a governed catalog.
    See Publishing virtual data to the catalog in Data Virtualization for details.

    This data is automatically profiled if it is configured in the catalog settings. The profile of a catalog data asset includes generated metadata and statistics about the textual content of the data. Profiling automatically assigns data classes to table columns. See Profiles.

  2. Govern your virtual data in the catalog:
    • Assign data classes, business terms, and tags that are authored in IBM Knowledge Catalog to your virtual tables and columns. See Managing business terms for details on how to manage and author business terms in IBM Knowledge Catalog.
    • Use data protection rules to allow or deny access to a virtual table. Data Virtualization users can see the virtual table but cannot preview the contents of the table or perform any actions on the table or its columns when a deny rule applies to the data asset. See Managing data protection rules for details on how to create data protection rules in IBM Knowledge Catalog.

      A lock icon next to the table name on the Virtualized data page indicates that access to the data in the table is denied by a data protection rule. A lock icon might also appear when an asset in the catalog has not been profiled and is pending data class assignment.

      To access views, these conditions must be met.

      • You have the required permissions to access views.
      • The creator of the view has the required permissions to access the objects referenced by the view.
    • To see an asset preview in a catalog, these conditions must be met.
      • You are not blocked by any data protection rules. If you are the owner of the asset, you can’t be blocked by data protection rules.
      • If the asset has an associated connection, these conditions must also be true:
        • You are not blocked from accessing the connection by any data protection rules.
        • The username in the connection details has access to the object at the data source.

      For more information, see Asset previews.

    • Use a data protection rule to mask data in columns or filter rows of a virtual table. Use data masking rules to disguise the original data. Depending on the method of data masking, data is redacted, substituted, or obfuscated. See Masking data with data protection rules for details.

      A lock icon next to the column name indicates that the data in the column is masked by a data protection rule.

      Data masking has certain limitations in Data Virtualization. See Masking virtual data.

    Note:

    By default, the user who created the asset in the catalog is the asset owner. Catalog asset owners are exempt from data protection rules, but are subject to the Data Virtualization access control. If the user who is accessing a virtual object is also the owner of the corresponding asset in IBM Knowledge Catalog, the data protection rules and policies that are defined for that user in IBM Knowledge Catalog are not enforced in Data Virtualization.

    When a virtual object is added to a catalog and at least one data protection rule exists that masks data, filters rows, or is based on a data class, access to the object is denied until profiling and data class assignment are complete.

    The username that is specified in the Data Virtualization connection must also be authorized to access the object in Data Virtualization, unless a data masking rule applies to the asset and the previewing user.
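
The asset preview conditions listed in step 2 can be read as a simple decision function. The Python sketch below is a simplified model under stated assumptions: the `PreviewRequest` fields and the `can_preview` function are hypothetical names, and the sketch ignores profiling state and the masking exception described in the preceding note.

```python
from dataclasses import dataclass


@dataclass
class PreviewRequest:
    """Hypothetical inputs to the preview decision for one user and one asset."""

    is_asset_owner: bool
    blocked_by_asset_rule: bool
    has_connection: bool = False
    blocked_by_connection_rule: bool = False
    connection_user_has_source_access: bool = True


def can_preview(req: PreviewRequest) -> bool:
    """Return True when every preview condition is met."""
    # Asset owners cannot be blocked by data protection rules on the asset.
    if req.blocked_by_asset_rule and not req.is_asset_owner:
        return False
    if req.has_connection:
        # The user must not be blocked from the connection itself.
        if req.blocked_by_connection_rule:
            return False
        # The username in the connection details needs access at the data source.
        if not req.connection_user_has_source_access:
            return False
    return True
```

The sketch applies the owner exemption only to rules on the asset, because the source states the exemption under that condition and not under the connection conditions.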
