Data protection rules define attribute-based access control of data. You can create data protection rules to define how to protect sensitive data based on the identity of the user and properties or characteristics of the data. A data protection rule is evaluated for enforcement when a user accesses an asset in a governed catalog. Enforcement of the rule can affect the appearance of the data and whether the data asset can be moved out of the catalog for use.
Data protection rules apply to data assets in governed catalogs, and under some conditions, in projects and Data virtualization. Data protection rules are automatically enforced when a catalog member attempts to view or act on a data asset in a governed catalog to prevent unauthorized users from accessing sensitive data. However, if the user who is trying to access the asset in a catalog is the owner of the asset (by default, the user who created the asset), then unrestricted access is always granted.
A data protection rule consists of criteria and an action block. The criteria identify what data to control and can include who is requesting access to the data and the properties of the data asset. The criteria can consist of several predicates that are combined in a Boolean expression. The predicates can include user attributes and asset properties, such as the data classes, classifications, tags, or business terms that are assigned to the asset. The action block specifies how to control the data. The action block can consist of binary actions, such as denying access to data, and data transformative actions, such as masking the data values in a column or filtering rows:
- Deny access to data: Affected users can't preview any data values or use the data asset. Applies to any type of data asset.
- Redact columns: Affected users see values replaced with a string of one repeated character. Applies to data assets with relational data.
- Obfuscate columns: Affected users see data replaced with similar values and in the same format. Applies to data assets with relational data.
- Substitute columns: Affected users see data replaced with a hashed value. Applies to data assets with relational data.
- Filter rows: Affected users see a subset of rows in the data set. Applies to data assets with relational data.
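To make the difference between these actions concrete, the following is a minimal Python sketch of what redacting, substituting, and row filtering could look like on relational data. It is an illustration only, not how the product implements masking: the function names, the sample rows, the choice of `X` as the repeated character, the length-preserving redaction, and the use of SHA-256 as the hash are all assumptions.

```python
import hashlib


def redact(value: str, char: str = "X") -> str:
    """Replace the value with one repeated character (same length, by assumption)."""
    return char * len(value)


def substitute(value: str) -> str:
    """Replace the value with a hash; SHA-256 is an illustrative choice."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def filter_rows(rows, keep):
    """Return only the rows for which the predicate `keep` is true."""
    return [row for row in rows if keep(row)]


# Hypothetical sample data.
rows = [
    {"name": "Ana", "region": "EU", "ssn": "123-45-6789"},
    {"name": "Bo", "region": "US", "ssn": "987-65-4321"},
]

# Redact one column for every row; the other columns stay readable.
redacted = [{**row, "ssn": redact(row["ssn"])} for row in rows]

# Substitute one column with a hashed value.
substituted = [{**row, "ssn": substitute(row["ssn"])} for row in rows]

# Filter rows so that affected users see only a subset of the data set.
eu_only = filter_rows(rows, lambda row: row["region"] == "EU")
```

In this sketch, substitution is deterministic: the same input always produces the same hashed value, while redaction discards the original value entirely.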
For example, you can create a data protection rule to deny access to data in data assets that contain confidential information for all users except Joe Blue. The definition of that rule consists of criteria with two conditions and an action.
Written as a sentence, the rule definition is:
If the user is not [email protected] and if any data is classified as Confidential, then deny access to the data in the asset.
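The logic of this example rule can be sketched in Python as two conditions combined with a Boolean AND, followed by a deny action. This is a hypothetical model, not the product's rule engine: the `User` and `Asset` classes, the `evaluate` function, and the use of the display name "Joe Blue" as the user identifier are assumptions made for illustration. The sketch also models the owner exception, under which the asset owner always gets unrestricted access.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str  # hypothetical identifier; a real rule would match on a user ID


@dataclass
class Asset:
    owner: str
    classifications: set = field(default_factory=set)


def evaluate(user: User, asset: Asset) -> str:
    """Return "deny" or "allow" for the example rule."""
    # The owner of the asset is always granted unrestricted access.
    if user.name == asset.owner:
        return "allow"

    # Criteria: two conditions combined with AND.
    is_not_joe = user.name != "Joe Blue"
    is_confidential = "Confidential" in asset.classifications

    # Action: deny access to the data in the asset.
    if is_not_joe and is_confidential:
        return "deny"
    return "allow"


asset = Asset(owner="Ana", classifications={"Confidential"})
decision_for_joe = evaluate(User("Joe Blue"), asset)
decision_for_sam = evaluate(User("Sam"), asset)
```

Under these assumptions, Joe Blue and the owner are allowed, and every other user is denied because the asset carries the Confidential classification.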
For relational data assets, you can also create rules to mask data in asset columns, based on the assigned governance artifacts or other properties of the column. For example, you can define a rule to mask email addresses so that users can view all the data in an asset except the data values for email addresses, which are replaced with generated values.
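A column-level masking rule of this kind can be sketched as follows. The sketch assumes that each column carries an assigned data class and that the rule targets the class named "Email Address". The function names, the data class label, and the way replacement values are generated (a truncated SHA-256 digest at `example.com`) are illustrative assumptions, not the product's masking method.

```python
import hashlib


def generated_email(value: str) -> str:
    """Build a replacement address that still looks like an email address."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:10]
    return f"user_{digest}@example.com"


def mask_columns(rows, column_data_classes, masked_class="Email Address"):
    """Mask every column whose assigned data class matches `masked_class`."""
    targets = {
        column
        for column, data_class in column_data_classes.items()
        if data_class == masked_class
    }
    return [
        {
            column: generated_email(value) if column in targets else value
            for column, value in row.items()
        }
        for row in rows
    ]


# Hypothetical asset: column names mapped to their assigned data classes.
column_data_classes = {"name": "Person Name", "contact": "Email Address"}
rows = [{"name": "Ana", "contact": "ana@corp.test"}]

masked_rows = mask_columns(rows, column_data_classes)
```

Because the rule matches on the data class rather than on the column name, the `contact` column is masked even though its name does not mention email, while the `name` column is returned unchanged.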
Without data protection rules, access to a data asset in a catalog is restricted by the data asset's privacy setting within that catalog, and limited to the users who are collaborators in the catalog.