Reference data
Reference data sets provide logical groupings of code values (reference data values), such as product codes and country codes. These codes are typically sets of allowed values that are associated with data fields and can be assigned to business terms. You create reference data sets in Watson Knowledge Catalog so that enterprise standards can be accessed centrally by users or by consuming applications through APIs. Reference data sets can also be used to provide the matching pattern for data classes, allowing data fields to be automatically classified through data profiling and discovery. These data classes can then be used in data quality analysis to evaluate the quality and consistency of the values in data columns.
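To make the data class idea concrete, here is a minimal sketch of how a reference data set might serve as a matching pattern: a column is treated as belonging to a class when enough of its non-empty values appear in the set. The function names, the sample values, and the 0.8 threshold are assumptions for illustration only; they do not describe how Watson Knowledge Catalog performs profiling internally.

```python
def match_ratio(column_values, reference_values):
    """Fraction of non-empty column values found in the reference set."""
    # Normalize case and whitespace so "in" and " IN " compare equal.
    non_empty = [v.strip().upper() for v in column_values if v and v.strip()]
    if not non_empty:
        return 0.0
    allowed = {v.strip().upper() for v in reference_values}
    hits = sum(1 for v in non_empty if v in allowed)
    return hits / len(non_empty)


def matches_data_class(column_values, reference_values, threshold=0.8):
    """Hypothetical rule: classify the column when the ratio meets the threshold."""
    return match_ratio(column_values, reference_values) >= threshold


# Illustrative reference data set and column sample (not real catalog content).
country_codes = {"US", "IN", "DE"}
sample_column = ["US", "in", "DE", "??", ""]

ratio = match_ratio(sample_column, country_codes)
```

The same ratio can double as a simple quality score for the column: the values that miss the reference set are the ones to review.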
Reference data helps you define a standard set of values for certain fields. For example, you can create a standard definition of country codes and use this reference data to ensure that country code fields comply. Different designations such as “US”, “USA”, “United States”, and “America” can all be resolved to the same reference data value. As a result, your data becomes more consistent.
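The country code example can be sketched as a small lookup that resolves different designations to one reference data value. The codes, the designations listed for India, and the function names are hypothetical; a real reference data set would be managed in the catalog rather than hard-coded.

```python
# Hypothetical reference data set: canonical code -> accepted designations.
COUNTRY_CODES = {
    "US": {"US", "USA", "United States", "America"},
    "IN": {"IN", "IND", "India"},
}


def build_lookup(reference_set):
    """Invert the set into a normalized designation -> canonical code map."""
    lookup = {}
    for code, designations in reference_set.items():
        for name in designations:
            lookup[name.strip().casefold()] = code
    return lookup


LOOKUP = build_lookup(COUNTRY_CODES)


def resolve_country(raw):
    """Return the canonical code, or None when the value is not recognized."""
    return LOOKUP.get(raw.strip().casefold())


resolved = [resolve_country(v) for v in ["usa", " United States ", "America", "Atlantis"]]
```

Returning None for unrecognized values, instead of guessing, keeps non-compliant fields visible so that they can be corrected or added to the reference data set.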
Predefined reference data sets are also provided. They include the physical location and sovereign location values for data assets so that you can control data access based on location with data location rules.
You can create hierarchies for reference data sets. Hierarchies make searches for reference data sets easier and faster. For example, if no relationship information is available, you must remember the context of each data set and search the data sets one at a time. With hierarchy information, you can start with a specific data set and navigate through only the sets that are related to it.
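The navigation benefit can be illustrated with a minimal sketch that models a hierarchy as a parent-to-children map and walks it from one starting set. The set names and the dictionary representation are assumptions made for this example, not the catalog's storage model.

```python
from collections import deque

# Hypothetical hierarchy: parent reference data set -> child sets.
HIERARCHY = {
    "Geography": ["Countries", "Regions"],
    "Countries": ["Country subdivisions"],
    "Regions": [],
    "Country subdivisions": [],
    "Products": ["Product codes"],
    "Product codes": [],
}


def related_sets(start, hierarchy):
    """Breadth-first walk that returns the descendants of start in visit order."""
    visited = []
    queue = deque(hierarchy.get(start, []))
    while queue:
        name = queue.popleft()
        if name in visited:
            continue  # guard against a set being reachable twice
        visited.append(name)
        queue.extend(hierarchy.get(name, []))
    return visited


geography_sets = related_sets("Geography", HIERARCHY)
```

Starting from "Geography", the walk never touches the unrelated "Products" branch, which is the point of navigating by hierarchy rather than searching every set.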
You can also create relationships between values in a reference data set and values in one or more different reference data sets. These relationships are known as value mappings or cross walks.
Setting up relationships this way helps you understand how values interconnect across reference data sets and reduces the time that you might otherwise spend searching for these values manually. For example, in the following image you can see that the value United States of America maps to two values in a different reference data set (soybean farming and agriculture), and another country value, India, maps to a currency value in yet another reference data set.
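The mappings in this example can be sketched as a simple crosswalk table keyed by the source set and value. The set names ("Countries", "Industries", "Currencies"), the currency value "Indian rupee", and the helper functions are illustrative assumptions; the source names only the country and industry values.

```python
from collections import defaultdict

# Crosswalk: (source set, source value) -> list of (target set, target value).
value_mappings = defaultdict(list)


def add_mapping(src_set, src_value, tgt_set, tgt_value):
    """Record a one-directional mapping between values in two sets."""
    value_mappings[(src_set, src_value)].append((tgt_set, tgt_value))


def mapped_values(src_set, src_value, tgt_set=None):
    """Return mapped values, optionally restricted to one target set."""
    # .get avoids creating empty entries for values that have no mappings.
    pairs = value_mappings.get((src_set, src_value), [])
    return [value for target, value in pairs if tgt_set is None or target == tgt_set]


# Hypothetical set names; the currency value is assumed for illustration.
add_mapping("Countries", "United States of America", "Industries", "Soybean farming")
add_mapping("Countries", "United States of America", "Industries", "Agriculture")
add_mapping("Countries", "India", "Currencies", "Indian rupee")

us_industries = mapped_values("Countries", "United States of America", "Industries")
```

A single source value can map to several values in one target set, as the United States of America entry does here, so each key holds a list rather than a single target.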
For information on setting up related values, see Importing files for reference data sets.