Import methods for governance artifacts
Last updated: Mar 18, 2024

You can import governance artifacts with a file. You can import one type of governance artifact at a time, or import all governance artifacts from another IBM Knowledge Catalog instance.

Compatibility between deployment environments

You can export and then import governance artifacts between IBM Knowledge Catalog instances on the following deployment environments:

  • Cloud Pak for Data 3.5
  • Cloud Pak for Data 4.x
  • Cloud Pak for Data as a Service

The values of Stewards are not compatible between IBM Knowledge Catalog instances on Cloud Pak for Data as a Service and instances on Cloud Pak for Data 3.5 or 4.x.

You can import governance artifacts from IBM InfoSphere Information Governance Catalog to IBM Knowledge Catalog instances on Cloud Pak for Data 3.5 and 4.x. To import governance artifacts from IBM InfoSphere Information Governance Catalog to IBM Knowledge Catalog instances on Cloud Pak for Data as a Service, you must edit each CSV file to conform to the format of the IBM Knowledge Catalog artifact CSV files. For example, you might need to make the following types of edits:

  • Remove unsupported columns
  • Separate different artifact types into multiple CSV files
  • Modify supported columns
  • Add required columns
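
As a rough illustration of the first and last kinds of edit, the following Python sketch drops unsupported columns and adds a required column with a default value. The column names `Legacy Status` and `Artifact Type` are hypothetical placeholders, not columns confirmed by this topic; check the CSV file format for your artifact type before you adapt it.

```python
import csv
import io

# Hypothetical column names, for illustration only.
UNSUPPORTED_COLUMNS = {"Legacy Status"}
REQUIRED_DEFAULTS = {"Artifact Type": "glossary_term"}


def conform_csv(source_text: str) -> str:
    """Drop unsupported columns and add required ones with a default value."""
    reader = csv.DictReader(io.StringIO(source_text))
    kept = [name for name in (reader.fieldnames or [])
            if name not in UNSUPPORTED_COLUMNS]
    added = [name for name in REQUIRED_DEFAULTS if name not in kept]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept + added, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        new_row = {name: row[name] for name in kept}
        for name in added:
            new_row[name] = REQUIRED_DEFAULTS[name]
        writer.writerow(new_row)
    return out.getvalue()


exported = "Name,Legacy Status,Description\nrelease,ACCEPTED,example term\n"
print(conform_csv(exported))
```

Separating artifact types into multiple files and modifying supported columns depend on the source data, so they are left out of this sketch.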

Comparison of import methods

Choose the appropriate import method for your goals and circumstances.

Import a single type of artifact

You can import a single type of governance artifact at a time with a CSV file.

This method is useful in the following types of circumstances:

  • You want the imported artifacts to be subject to workflow.
  • You want to add values for a property to one type of governance artifact. Export that artifact type as a CSV file, edit the CSV file, and then import it. For example, you can use this method to add a custom attribute to your business terms.
  • You want to define artifacts in another program. Create CSV files for each artifact type. For example, you can use this method to define artifacts in a spreadsheet program and then import them.

See Importing governance artifacts by type with CSV files and CSV file format for importing governance artifacts.

Import multiple types of artifacts

You can import multiple types of governance artifacts with a ZIP file that you created by exporting multiple types of existing governance artifacts from an IBM Knowledge Catalog instance. The ZIP file contains CSV files for categories and every exported artifact type. The CSV files match the format for the CSV import file, with these differences:

  • They include an extra Artifact ID column, which identifies artifacts by ID instead of by name and category path.
  • They define related artifacts with artifact IDs instead of context and name.
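
To check an exported archive before you import it, you could list the header row of each CSV file and confirm that the Artifact ID column is present. The following Python sketch uses only the standard library. It assumes the CSV files are UTF-8 encoded and sit anywhere in the archive, and the file names in the usage example are made up.

```python
import csv
import io
import zipfile


def csv_headers(zip_file) -> dict:
    """Map each CSV member of the archive to its header row."""
    headers = {}
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as member:
                # utf-8-sig tolerates an optional byte order mark (assumed encoding).
                text = io.TextIOWrapper(member, encoding="utf-8-sig", newline="")
                headers[name] = next(csv.reader(text), [])
    return headers


def files_missing_artifact_id(zip_file) -> list:
    """Return the CSV members whose header lacks the Artifact ID column."""
    return sorted(name for name, header in csv_headers(zip_file).items()
                  if "Artifact ID" not in header)


# Usage with an in-memory archive; the member names are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("glossary_term.csv", "Name,Artifact ID\nrelease,abc\n")
    archive.writestr("category.csv", "Name\nmarketing\n")
buffer.seek(0)
print(files_missing_artifact_id(buffer))
```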

This method is useful in the following types of circumstances:

  • You want to move all governance artifacts from one IBM Knowledge Catalog instance to another.

See Importing multiple types of governance artifacts from an instance with a ZIP file.

The following table summarizes the differences between importing artifacts with CSV files or a ZIP file.

| Characteristics | CSV file | ZIP file |
|---|---|---|
| File creation | Export one type of existing artifacts; create a file in a spreadsheet program; or export artifacts from IBM InfoSphere Information Governance Catalog and adjust the format | Export multiple types of artifacts from an instance |
| Number of artifact types | Categories or one artifact type per file. | Multiple types of artifacts, with categories and each type of artifact in a separate CSV file. |
| Import methods | Through the UI; API request | API request |
| Workflow | All artifacts are imported as draft and are subject to workflow. Categories are published immediately because they are not subject to workflow. | All artifacts and categories are published immediately. |
| Required permissions | Permissions to create or edit categories. You must be at least an Editor in the category you are importing to. For details see Required permissions. | The Manage glossary permission |

Governance artifacts that you can import

With both import methods, you can import categories and the following types of governance artifacts:

Restrictions:

  • You can import values for all properties of these types of governance artifacts, including relationships with other artifacts. However, relationships are imported only when the related artifact exists or is defined in the same import process. To add relationships that the import process skipped, first publish all imported draft artifacts and then run the import process again.
  • You can't use CSV files to move governance artifacts and their relationships between Cloud Pak for Data instances. For example, if you export data classes that use the Match to reference data matching method to a CSV file and then import that file into another Cloud Pak for Data instance, the import fails because CSV imports and exports do not include the Artifact ID. Use a ZIP import instead.
  • When importing a reference data set from a CSV file, the reference data values from that set are not imported. You must use a separate CSV to import the values into the data set. Alternatively, you can use a ZIP import to import both the reference data set and its reference data values. For more information, see Importing files for reference data sets.
  • You can't import data protection rules or data location rules.

Methods for merging imported and existing artifacts

Whether you import artifacts with CSV files or a ZIP file, you must choose what happens when you import governance artifacts that already exist and the values of the properties are different. The following table summarizes the three merge methods.

| Merge method | API | Effect on original values | Effect on imported values |
|---|---|---|---|
| Replace all values | merge_option=all | Discard all original values. | Accept all imported values, even empty values. |
| Replace with defined values | merge_option=specified | Retain original values if imported values are empty. | Accept all imported values, except empty values. |
| Replace empty values | merge_option=empty | Retain original values, except empty values. | Accept only imported values that replace empty values. |

For new artifacts, each of these methods produces the same results.
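
The three merge methods can be modeled as a per-property rule. The following Python sketch is not the product's implementation. It represents an artifact as a hypothetical mapping of property names to values, treats an empty string as an empty value, and assumes that a property missing from either side counts as empty.

```python
def merge_values(original: dict, imported: dict, merge_option: str) -> dict:
    """Model the documented merge methods on a property-to-value mapping."""
    if merge_option not in {"all", "specified", "empty"}:
        raise ValueError(f"unknown merge_option: {merge_option!r}")

    result = {}
    for prop in sorted(original.keys() | imported.keys()):
        old = original.get(prop, "")
        new = imported.get(prop, "")
        if merge_option == "all":
            # Accept every imported value, even an empty one.
            result[prop] = new
        elif merge_option == "specified":
            # Accept imported values unless they are empty.
            result[prop] = new if new else old
        else:
            # "empty": keep original values, fill only the empty ones.
            result[prop] = old if old else new
    return result


original = {"Description": "example term", "Tags": "",
            "Classifications": "Confidential"}
imported = {"Description": "example term edited", "Tags": "beta",
            "Classifications": ""}

for option in ("all", "specified", "empty"):
    print(option, merge_values(original, imported, option))
```

With an empty `original` mapping, which corresponds to a new artifact, all three options return the imported values.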

Replace all values

All the original values of the artifact are discarded and replaced by the values of the imported artifact. If the value of a property for the imported artifact is empty, any original values for that property are removed.

For example, suppose you have a published business term that is named release and you import a CSV file to modify it. The following table shows the effect of the Replace all values option:

| Property | Original values | Values in the CSV file | Resulting values |
|---|---|---|---|
| Name | release | release | release |
| Artifact type | glossary_term | glossary_term | glossary_term |
| Category | marketing | marketing | marketing |
| Description | example term | example term edited | example term edited |
| Tags | (empty) | beta | beta |
| Related terms | marketing>>version | marketing>>date | marketing>>date |
| Classifications | Confidential | (empty) | (empty) |

The resulting draft artifact has these changes to the original values:

  • The original description is replaced by a new description.
  • The original empty value for tags is replaced by a value.
  • The original related term is replaced by a new related term.
  • The original classification value is replaced by an empty value.
Note:

When you use the Replace all values option (merge_option=all), make sure that the CSV content describes relationships between artifacts consistently. For example, if the ZIP import file contains a term and a data class that are connected by a relationship, that relationship must be present in both the data classes CSV file and the terms CSV file. Otherwise, the import behavior is unpredictable: the relationship might or might not be imported.

When importing ZIP files that contain reference data values, you must always use merge_option=all in the API call.

Replace with defined values

Original and empty values of the artifact are replaced by the supplied values of the imported artifact. If the value of a property for the imported artifact is empty, any original values for that property are retained.

For example, suppose you have a published business term that is named release and you import a CSV file to modify it. The following table shows the effect of the Replace with defined values option:

| Property | Original values | Values in the CSV file | Resulting values |
|---|---|---|---|
| Name | release | release | release |
| Artifact type | glossary_term | glossary_term | glossary_term |
| Category | marketing | marketing | marketing |
| Description | example term | example term edited | example term edited |
| Tags | (empty) | beta | beta |
| Related terms | marketing>>version | marketing>>date | marketing>>date |
| Classifications | Confidential | (empty) | Confidential |

The resulting draft artifact has these changes to the original values:

  • The original description is replaced by a new description.
  • The original empty value for tags is replaced by a value.
  • The original related term is replaced by a new related term.

Replace empty values

Empty values of the original artifact are replaced by the supplied values of the imported artifact.

For example, suppose you have a published business term that is named release and you import a CSV file to modify it. The following table shows the effect of the Replace empty values option:

| Property | Original values | Values in the CSV file | Resulting values |
|---|---|---|---|
| Name | release | release | release |
| Artifact type | glossary_term | glossary_term | glossary_term |
| Category | marketing | marketing | marketing |
| Description | example term | example term edited | example term |
| Tags | (empty) | beta | beta |
| Related terms | marketing>>version | marketing>>date | marketing>>version |
| Classifications | Confidential | (empty) | Confidential |

The resulting draft artifact has this change to the original values:

  • The original empty value for tags is replaced by a value.

Security considerations

Governance data that is exported to CSV files is sanitized against known CSV injection attacks so that it is safe to open in spreadsheet programs that automatically interpret CSV data. Any text value that starts with one of the following characters is prefixed with a single quote character ('):

  • equals (=)
  • plus (+)
  • minus (-)
  • at (@)

For consistency, the import process removes that single quote character (') when it parses CSV files. The same sanitizing applies when you import and export governance artifacts with ZIP files, because they contain CSV files.
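
The escaping rule can be sketched in a few lines of Python. This is an illustration of the described behavior, not the service's code. In particular, it assumes that the import step removes a leading single quote only when a formula character follows it; this topic does not spell out that detail.

```python
FORMULA_PREFIXES = ("=", "+", "-", "@")


def escape_on_export(value: str) -> str:
    """Prefix values that a spreadsheet might interpret as a formula."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def unescape_on_import(value: str) -> str:
    """Remove the single quote that the export step added (assumed rule)."""
    if value.startswith("'") and value[1:].startswith(FORMULA_PREFIXES):
        return value[1:]
    return value


for text in ("=SUM(A1:A3)", "@handle", "release"):
    escaped = escape_on_export(text)
    print(repr(text), "->", repr(escaped), "->", repr(unescape_on_import(escaped)))
```

Under these assumptions, a value that passes through export and then import comes back unchanged, as long as the original did not itself start with a single quote followed by a formula character.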

To disable this functionality:

  1. Edit the IBM Knowledge Catalog Glossary Service deployment:

    oc edit deployment wkc-glossary-service
    
  2. Set the environment variable ESCAPE_FORMULAS_IN_CSV_FILES to false.

For more information, see CSV Injection.
