Designing data classes (IBM Knowledge Catalog)
When you design a data class, you must decide whether to enable data matching for this data class, which business terms or classifications it should be related to, and whether to define hierarchical relationships between data classes.
- Properties of data classes
- Relationships between data classes
- Relationships with other types of governance artifacts
- Working with data classes
Required permissions
To author a data class, you must have this user permission:
- Access governance artifacts
Additionally, you must have one of these category collaborator roles in the primary category for the data class:
- Admin
- Owner
- Editor
- A custom role with the permission to create data classes.
For more information, see Required permissions.
Properties of data classes
Data classes have these standard properties that are similar to other governance artifacts.
Property or behavior | Supports? | Explanation |
---|---|---|
Must have unique names? | Yes | Data class names must be unique within a category. |
Description? | Yes | Optional. Include a description to help users find this data class. |
Add relationships to other data classes? | Yes | See Relationships between data classes. |
Add relationships to other types of governance artifacts? | Yes | See Relationships with other types of governance artifacts. |
Add relationship to asset? | Yes | See Asset relationships in catalogs. |
Add custom properties? | Yes | See Custom properties and relationships for governance artifacts and catalog assets. |
Add custom relationships? | Yes | See Custom properties and relationships for governance artifacts and catalog assets. |
Organize in categories? | Yes | The primary category for the artifact determines who can view or modify the artifact. See Categories. |
Import from a file? | Yes | See Importing governance artifacts. |
Import from a Knowledge Accelerator? | No | |
Export to a file? | Yes | See Exporting governance artifacts. |
Managed by workflows? | Yes | See Workflows. |
Specify effective start and end dates? | Yes | See Effective dates. |
Assign a Steward? | Yes | See Stewards. |
Add tags as properties? | Yes | See Tags. |
Assign to an asset? | No | |
Assign to a column in a data asset? | Yes | A data class can be added to a column in a data asset both manually and automatically. |
Automated assignment to assets during profiling or enrichment? | Yes | See Managing metadata enrichment. |
Predefined artifacts? | Yes | See Predefined data classes. |
Add regular expression (regex) patterns? | Limited | Some custom data classes with regular expression patterns might fail to run masking flows or might not show a preview of the masked data. For example, you cannot use capture groups such as ([abc]), but you can use non-capture groups such as (?:[abc]). |
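
The difference between capture and non-capture groups can be checked before a pattern is added to a data class. The following is a minimal sketch that uses Python's `re` module as a stand-in; the regex engine that Knowledge Catalog uses might differ in details, and the helper name `has_capture_groups` and the sample patterns are assumptions made for illustration.

```python
import re


def has_capture_groups(pattern: str) -> bool:
    """Return True if the pattern contains at least one capturing group."""
    # Pattern.groups holds the number of capturing groups in the pattern.
    return re.compile(pattern).groups > 0


# Hypothetical patterns: one letter from a, b, or c followed by three digits.
capturing = r"([abc])\d{3}"        # uses a capture group
non_capturing = r"(?:[abc])\d{3}"  # same match behavior, no capture group

samples = ["a123", "c999", "d123"]
results_capturing = [bool(re.fullmatch(capturing, s)) for s in samples]
results_non_capturing = [bool(re.fullmatch(non_capturing, s)) for s in samples]
```

Both patterns accept exactly the same values, so a capture group can usually be rewritten as a non-capture group without changing what the data class matches.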
Relationships between data classes
You can use hierarchies to create relationships between data classes.
For the data class that you are editing, you can define the following relationships to other data classes within the same category:
- Parent data class
- Dependent data classes
The parent data class organizes data classes in parent/child relationships. It also acts as a pre-filter when an automatic data matching method is used: if a parent data class has a data matching method, the data matching methods of its child data classes are evaluated only if the parent's method returns a positive match. Defining a parent data class therefore affects the criteria that the data classification process uses to decide whether the data class is assigned to an analyzed data field.
Example:
- US License - parent data class
- Georgia State Driver's License - dependent data class
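The pre-filter behavior of a parent data class can be sketched as follows. This is an illustration of the evaluation order only, not the product's implementation: the `DataClass` type, the `classify` function, and both regular expressions are hypothetical and do not reflect real license formats.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DataClass:
    name: str
    # None means the data class has no data matching method.
    matcher: Optional[Callable[[str], bool]] = None
    children: List["DataClass"] = field(default_factory=list)


def classify(value: str, data_class: DataClass) -> List[str]:
    """Return the names of matching data classes, parent first."""
    has_matcher = data_class.matcher is not None
    if has_matcher and not data_class.matcher(value):
        return []  # the parent did not match, so children are not evaluated
    matched = [data_class.name] if has_matcher else []
    for child in data_class.children:
        matched.extend(classify(value, child))
    return matched


# Hypothetical patterns chosen only to make the example runnable.
georgia = DataClass(
    "Georgia State Driver's License",
    lambda v: re.fullmatch(r"\d{9}", v) is not None,
)
us_license = DataClass(
    "US License",
    lambda v: re.fullmatch(r"[A-Z0-9]{5,12}", v) is not None,
    [georgia],
)

both = classify("123456789", us_license)
parent_only = classify("AB12345", us_license)
neither = classify("12-34", us_license)
```

In this sketch, the value `12-34` is rejected by the parent, so the child matcher is never called for it.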
Relationships with other types of governance artifacts
You can add the following related artifacts:
- Classifications
- Business terms
The classifications and business terms that you add are suggestions for columns to which the data class is assigned.
When you add relationships between data classes and business terms, those business terms are automatically assigned to assets when their related data classes are assigned during metadata enrichment. For example, a data class Email address can be related to a business term Contact method. When the metadata enrichment process detects a column that matches the data class Email address, both the data class Email address and the business term Contact method are assigned. See Automatic term assignment.
However, a data class is not automatically assigned when one of its related business terms is assigned to a column.
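The one-way nature of this assignment can be sketched as follows. The column representation, the `RELATED_TERMS` mapping, and the function names are assumptions made for illustration and are not part of any Knowledge Catalog API.

```python
from typing import Dict, List, Set

# Hypothetical relationship table: data class name -> related business terms.
RELATED_TERMS: Dict[str, List[str]] = {
    "Email address": ["Contact method"],
}


def new_column() -> Dict[str, Set[str]]:
    """Create an empty column record with no assignments."""
    return {"data_classes": set(), "terms": set()}


def assign_data_class(column: Dict[str, Set[str]], data_class: str) -> None:
    """Assign a data class and, with it, its related business terms."""
    column["data_classes"].add(data_class)
    column["terms"].update(RELATED_TERMS.get(data_class, []))


def assign_term(column: Dict[str, Set[str]], term: str) -> None:
    """Assign a business term only; related data classes are not added."""
    column["terms"].add(term)


enriched = new_column()
assign_data_class(enriched, "Email address")

manual = new_column()
assign_term(manual, "Contact method")
```

After these calls, `enriched` carries both the data class and the term, while `manual` carries only the term.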
You can include data classes in data protection rules to identify the type of data to control.
Working with data classes
To create a data class:
- Open Governance > Data classes.
- Click New data class to create a new data class and provide the required information. Data classes can have the same name if they are in different categories.
- Click Save as draft. The data class in a draft state is now ready for refining as listed in the following section.
- When ready, click Publish or Send for approval depending on your workflow definition.
To edit an existing data class:
- Open a data class and click the edit icon, or Edit, next to the field you want to change.
- Click Save as draft. The data class in a draft state is now ready for refining.
- Click Publish or Send for approval depending on your workflow definition.
You can provide the following information to define your data class:
- Add an example for the data class in the Example property. If you specify a data class named City-New, the example could be London.
- Assign this data class to a primary category and optionally to secondary categories.
- Edit custom properties that provide additional information in the Details section. Custom properties can be created as described in Custom properties and relationships for governance artifacts and catalog assets. If any custom relationship types are defined, they are also shown here. Inverse relationships show up in the other artifact after you publish the artifact where you created the relationship.
- Use data matching to organize database columns and data file fields for review and subsequent column analysis work. For example, database columns with numeric data typically include numbers within a range of valid values.
- Enable or disable a data class for auto-assignment. To enable a data class, you must enable data matching. A data class with a data matching method enabled is treated as an enabled data class, and a data class with the data matching method disabled is treated as a disabled data class.
- Choose the matching priority of a data class to determine which data class candidate should become the inferred data class of a field. Only data classes with a confidence above the threshold are considered. See Priority.
- Specify related artifacts. You can select only the business terms and classifications that have been published. The classifications and business terms that you add here are suggestions for columns to which the data class is assigned. You can assign one or more classifications at a column level.
- Add other related content.
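The role of the matching priority can be sketched as a two-step selection: discard candidates whose confidence is not above the threshold, then pick among the rest by priority. The threshold value of 0.75, the rule that a larger number means a higher priority, and the tie-break on confidence are all assumptions made for illustration; see Priority for the actual rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    confidence: float  # share of values that matched, between 0 and 1
    priority: int      # assumption: a larger number means a higher priority


def infer_data_class(
    candidates: List[Candidate], threshold: float = 0.75
) -> Optional[str]:
    """Pick the inferred data class among candidates above the threshold."""
    eligible = [c for c in candidates if c.confidence > threshold]
    if not eligible:
        return None
    # Assumption: highest priority wins; confidence breaks ties.
    best = max(eligible, key=lambda c: (c.priority, c.confidence))
    return best.name


candidates = [
    Candidate("City", confidence=0.95, priority=1),
    Candidate("City-New", confidence=0.80, priority=5),
    Candidate("Postal code", confidence=0.60, priority=9),
]
inferred = infer_data_class(candidates)
none_eligible = infer_data_class([Candidate("City", 0.50, 1)])
```

In this sketch, Postal code is ignored despite its priority because its confidence is below the threshold, and City-New is chosen over City because of its higher priority.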
Depending on the effective dates that are set for the data class, it is active or inactive. Active data classes can be used to specify actions, for example, classifying data automatically. Inactive data classes do not contribute to any action until they become active.
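The active or inactive state can be sketched as a date-range check. This minimal example assumes that both dates are inclusive and that a missing date leaves that side of the range open; the function name `is_active` is hypothetical.

```python
from datetime import date
from typing import Optional


def is_active(
    on: date, start: Optional[date] = None, end: Optional[date] = None
) -> bool:
    """Return True if the given day falls within the effective date range."""
    if start is not None and on < start:
        return False  # the effective start date is still in the future
    if end is not None and on > end:
        return False  # the effective end date has passed
    return True


within = is_active(date(2024, 6, 1), start=date(2024, 1, 1), end=date(2024, 12, 31))
expired = is_active(date(2025, 1, 1), start=date(2024, 1, 1), end=date(2024, 12, 31))
unbounded = is_active(date(2024, 6, 1))
```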
You cannot use draft data classes to specify data matching or for any other action. By default, the data class is published if you send it for approval.
You can also create additional data classes based on one of the reference data sets available in Knowledge Accelerators by using the data matching method. See Reference data sets in Knowledge Accelerators.