Creating metadata imports
Last updated: Dec 13, 2024

You can import technical metadata and lineage metadata to add data assets to a project or a catalog. In a project, you can prepare and analyze the data before you publish it to a catalog.

Import metadata into a project as data assets when you want to prepare and analyze the data before you publish it to a catalog. By running metadata enrichment, you can profile these data assets, analyze data quality, and assign terms to provide business context. For a deeper quality analysis, run data quality rules on the data assets. If the data is ready to be shared without further preparation, you can also add data assets to a catalog directly. Import lineage metadata to see where your data comes from, how it changes, and where it flows.

Supported connections

See the Metadata import column in Supported connectors. You can use APIs instead of the user interface to retrieve the list of supported connections or to create a metadata import asset. The links to these APIs are listed in the Learn more section.

Required permissions

To create, manage, and run a metadata import, you must have these roles and permissions:

  • The Admin or the Editor role in the project.
  • The Admin or the Editor role in the catalog to which you want to import or publish the assets.
  • Access to the connections for the data sources that contain the data assets to be imported, and the SELECT or a similar permission on the corresponding databases.
  • The Manage data lineage permission, if you want to import lineage metadata.
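
As a rough illustration of how these requirements combine, the following Python sketch checks a user against the list above. It is not a product API: the `UserAccess` class, its fields, and the `missing_permissions` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field

# Hypothetical model of a user's access. These names are illustrative
# and do not correspond to any product API.
@dataclass
class UserAccess:
    project_role: str                                   # for example "Admin", "Editor", "Viewer"
    catalog_roles: dict = field(default_factory=dict)   # catalog name -> role
    connection_access: bool = False                     # can use the connection and SELECT from the source
    manage_data_lineage: bool = False                   # has the Manage data lineage permission

ALLOWED_ROLES = {"Admin", "Editor"}

def missing_permissions(user, target_catalog=None, import_lineage=False):
    """Return the unmet requirements; an empty list means the user qualifies."""
    gaps = []
    if user.project_role not in ALLOWED_ROLES:
        gaps.append("Admin or Editor role in the project")
    # The catalog role matters only when importing or publishing to a catalog.
    if target_catalog is not None and user.catalog_roles.get(target_catalog) not in ALLOWED_ROLES:
        gaps.append(f"Admin or Editor role in catalog {target_catalog!r}")
    if not user.connection_access:
        gaps.append("access to the connection and SELECT (or similar) on the data source")
    # The lineage permission matters only when lineage metadata is imported.
    if import_lineage and not user.manage_data_lineage:
        gaps.append("Manage data lineage permission")
    return gaps
```

The catalog role and the lineage permission are checked only when they apply, which mirrors the conditional wording of the list.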

Prerequisites

Before you start creating a metadata import for a new data source, complete the following steps:

  1. Create a data source definition.
  2. Create a connection to the data source in a project.

Each data source requires various connection details. You can find this information in each connection topic in the Connectors section. For more information about data source definition and connections in the context of metadata import, see Designing metadata imports: Data source.

Creating a metadata import asset and importing metadata

To create a metadata import asset and a job for importing metadata into a project or a catalog:

  1. Open a project, go to the project's Asset page and click New asset > Import metadata for data assets.

  2. Specify a name for the metadata import. Optionally, you can provide a description.

  3. Select tags to be assigned to the metadata import asset to simplify searching. You can create new tags by entering the tag name.

  4. Select the import goal. You can select one or both goals. See Import goal.

  5. If you selected the Import asset metadata goal, select the import target. You can import metadata into the project that you're working in or into any catalog for which you have the Editor or Admin role. See Import target.

  6. Provide details for the data source for your metadata import. A data source definition is required when you import lineage metadata. When you import asset metadata, select either a data source definition or a connection. Depending on the data source, you might need to select a scanner as well. See Data source.

  7. Define a scope for the metadata import. Depending on the size and contents of your data source, you might want to import only a selected subset of assets rather than all of them. You can include complete schemas or folders, or drill down to individual tables or files. When you select a schema or a folder, you can immediately see how many items it contains, so you can decide whether to include the whole set or only a subset. See Scope of import.

    When you import lineage metadata, you can change the scope of data in the following ways:

    • Select specific objects in the data source, for example schemas or reports.
    • Add external inputs in a .zip file with more data that is relevant to lineage.
    • Add metadata from a file system or a Git repository.

    Optionally, you can define placeholder replacements for external inputs for a better lineage analysis. Click Configure and define details. See Placeholder replacements.

  8. Define whether you want to run scheduled import jobs. If you don't set a schedule, the import runs when you save the metadata import asset. You can rerun the import manually at any time. See Scheduling options.

  9. If you import lineage metadata, you can decide which lineage phases to run. See Lineage import phases.

  10. Customize the import behavior. You can choose to prevent specific properties from being updated and to delete existing assets that are not included in the reimport. See Advanced import options.

  11. Review the metadata import configuration. To make changes, click the Edit icon on the tile and update the settings.

  12. Click Create. The metadata import asset is added to the project, and a metadata import job is created. If you didn't configure a schedule, the import is run immediately. If you configured a schedule, the import runs on the defined schedule.

    Important: Assets from the same connection that were already imported through a different metadata import are not imported again but are updated. Such assets no longer show up in the initial metadata import. Only the most recently run metadata import contains the assets.
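
The dependencies between the import goal, the import target, the data source, and the schedule in the steps above can be summarized in a small Python sketch. This is only a model of the rules as described in this topic, not the product's API or its actual validation logic. The `ImportConfig` class, its field names, and both functions are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical configuration model; the field names are illustrative only.
@dataclass
class ImportConfig:
    import_asset_metadata: bool = False
    import_lineage_metadata: bool = False
    target: Optional[str] = None                  # project or catalog name
    data_source_definition: Optional[str] = None
    connection: Optional[str] = None
    schedule: Optional[str] = None                # None means that no schedule is set

def validate(config):
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if not (config.import_asset_metadata or config.import_lineage_metadata):
        problems.append("select at least one import goal")
    # Step 5: a target is selected only for the asset metadata goal.
    if config.import_asset_metadata and config.target is None:
        problems.append("select an import target for asset metadata")
    # Step 6: lineage imports always need a data source definition.
    if config.import_lineage_metadata and config.data_source_definition is None:
        problems.append("lineage import requires a data source definition")
    # Step 6: asset-only imports accept a data source definition or a connection.
    if (config.import_asset_metadata and not config.import_lineage_metadata
            and config.data_source_definition is None and config.connection is None):
        problems.append("select a data source definition or a connection")
    return problems

def first_run(config):
    """Describe when the first import run happens after you click Create."""
    return "on schedule" if config.schedule else "immediately"
```

For example, a configuration with only the lineage goal and a connection but no data source definition yields a single problem, because a connection alone is sufficient only for asset metadata.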

Depending on the outcome of the metadata import job run, a completion message or an error notification is displayed.

A completion message is displayed when the job run completed successfully, completed with warnings, or completed with errors. An error notification is displayed if the entire job run failed. Either type of notification contains a link to the job run log that provides details about the specific job run.
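
The mapping from job run outcome to notification type can be expressed as a short Python sketch. The `JobOutcome` enum and the `notification_type` function are hypothetical names used only to restate the rule above; they are not part of any product API.

```python
from enum import Enum

# Hypothetical outcome values that mirror the four cases named in the text.
class JobOutcome(Enum):
    COMPLETED = "completed successfully"
    COMPLETED_WITH_WARNINGS = "completed with warnings"
    COMPLETED_WITH_ERRORS = "completed with errors"
    FAILED = "failed"

def notification_type(outcome):
    """Only a run that failed entirely produces an error notification."""
    if outcome is JobOutcome.FAILED:
        return "error notification"
    return "completion message"
```

Note that a run that completed with errors still produces a completion message, so the job run log is the place to check for partial failures.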

When the import is complete, you can see the list of assets with the following information:

  • The asset name, which provides a link to the asset in the project or catalog.
  • The asset type, such as Data or Report. For data assets, the format, such as Relational table, is also shown. For other asset types, the format column shows a dash (—).
  • The asset context, such as the parent or file path.
  • The date and time that the asset was last imported.
  • The import status, which can be Imported for successfully imported data, In progress, or Removed if the asset couldn't be reimported.
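
Which assets appear in this list also depends on the rule in the Important note in step 12: an asset from a given connection is listed only in the metadata import that most recently imported it. The following Python sketch models that rule with a plain dictionary. The function names, the import names, and the sample assets are all hypothetical and serve only to illustrate the behavior.

```python
def run_import(owner_by_asset, import_name, assets):
    """Record that `import_name` just imported `assets`.

    Each asset is keyed by (connection, path). An asset that is already known
    is updated in place rather than duplicated, and it moves to the import
    that ran most recently.
    """
    for asset in assets:
        owner_by_asset[asset] = import_name

def assets_listed_in(owner_by_asset, import_name):
    """Return the assets that the given metadata import currently contains."""
    return sorted(a for a, owner in owner_by_asset.items() if owner == import_name)

owners = {}
run_import(owners, "import_A", [("conn1", "SALES.ORDERS"), ("conn1", "SALES.CUSTOMERS")])
run_import(owners, "import_B", [("conn1", "SALES.ORDERS")])

# SALES.ORDERS now belongs to import_B only; import_A keeps SALES.CUSTOMERS.
print(assets_listed_in(owners, "import_A"))
print(assets_listed_in(owners, "import_B"))
```

If the first import is rerun afterward, the shared asset moves back to it under the same rule.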

You can work with most imported data assets in the same way as with connected data assets. If applicable, a tag that reflects the asset's parent is automatically assigned to each imported asset.

To profile, analyze, and provide business context to imported data assets, create a metadata enrichment asset and include the metadata import asset in the data scope.

Learn more

Next steps

