Configure and run metadata enrichment to add several layers of metadata to your data assets.
You can create a data profile to classify a data asset and compile statistics about the values that it contains. Augment your assets with AI-generated alternative column names and descriptions for data assets and the columns that they contain. Use predefined data quality checks for an initial quality assessment of your data. Enrich assets with business vocabulary that describes the semantic meaning of the data for your organization. Identify relationships between data assets.
You can also create metadata enrichments with APIs instead of the user interface. The links to these APIs are listed in the Learn more section.
To create a metadata enrichment asset and a job for enriching data:
- Open a project and click New asset > Enrich data assets with metadata. After you create the first metadata enrichment in this way, you can add new metadata enrichment assets from the project's Asset page.
- Define details:
  - Specify a name for the metadata enrichment.
  - Optional: Provide a description.
  - Optional: Select or create tags to be assigned to the metadata enrichment asset to simplify searching. You can create new tags by entering the tag name and pressing Enter.
- Set the initial data scope.
  Select the data assets that you want to enrich from Data assets. See Initial data scope. Review the selected scope before you proceed. You can delete individual assets from the data scope directly, or you can rework the entire scope by clicking Edit data scope.
  You can skip this step to create an empty metadata enrichment asset and set the scope later.
- Define the objective of this metadata enrichment asset. You can add several layers of metadata to a data asset:
  - Profile the data to classify it and compile statistics about the values.
  - Add alternative names and AI-generated descriptions.
  - Enrich assets with business vocabulary that describes the semantic meaning of the data for your organization.
  - Run predefined data quality checks for an initial quality assessment.
  - Identify primary keys and key relationships.
- Select categories to determine the business vocabulary that can be applied during the enrichment. See Category selection.
- Select a sampling type. See Sampling.
- Define when the enrichment job is run. You can run the enrichment manually at any time. See Run definition.
- Select the data scope for the reruns of the enrichment, whether scheduled or run manually. See Scope of reruns of the enrichment.
- Review the metadata enrichment configuration. To make changes, click the Edit icon on the tile and update the settings.
- Click Create. The metadata enrichment asset is added to the project and a metadata enrichment job is created. For more information, see Managing enrichment jobs.
Depending on the run definition, the enrichment might run immediately after you create the metadata enrichment asset.
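The settings collected in the preceding steps can be pictured as one configuration record. The following Python sketch is only an illustration for planning an enrichment before you open the UI. The class, field, and objective names are hypothetical and are not taken from the product or its APIs. The only rules it encodes come from the steps above: a name is required, the description and tags are optional, and the data scope can be left empty and set later.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Objective(Enum):
    """Hypothetical labels for the enrichment objectives listed in the steps."""
    PROFILE = "profile data"
    NAMES_AND_DESCRIPTIONS = "alternative names and descriptions"
    BUSINESS_VOCABULARY = "business vocabulary"
    QUALITY_CHECKS = "predefined data quality checks"
    KEY_RELATIONSHIPS = "primary keys and key relationships"


@dataclass
class EnrichmentDraft:
    """Planning record only; this is not the product's data model."""
    name: str
    description: str = ""                                  # optional
    tags: list[str] = field(default_factory=list)          # optional
    data_scope: list[str] = field(default_factory=list)    # can stay empty
    objectives: set[Objective] = field(default_factory=set)
    categories: list[str] = field(default_factory=list)
    sampling: Optional[str] = None

    @property
    def has_empty_scope(self) -> bool:
        # An empty scope is allowed: the scope can be set after creation.
        return not self.data_scope

    def problems(self) -> list[str]:
        # Only the name is mandatory according to the steps above.
        issues = []
        if not self.name.strip():
            issues.append("name is required")
        return issues


draft = EnrichmentDraft(
    name="Customer tables",
    objectives={Objective.PROFILE, Objective.QUALITY_CHECKS},
)
print(draft.problems())        # no problems: the name is set
print(draft.has_empty_scope)   # True: scope will be set later
```

A record like this can serve as a checklist while you work through the wizard. It does not create anything in the project.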
After the enrichment is complete, you can access a high-level overview of the enrichment results by viewing the metadata enrichment asset. From there, you can drill down into and work with the results for each asset. See Working with the enrichment results.
Metadata enrichment is run on assets that are available in the project. Thus, the list of enriched assets might not correspond to the configured scope of included metadata import assets in these cases:
- Metadata import was not yet complete when the enrichment started.
- Metadata import failed for a set of assets or failed completely.
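To check whether the enriched assets match the configured scope, you can compare the two lists of asset names. The following Python sketch assumes that you have already collected both lists yourself, for example by copying them from the UI. The function name and the sample asset names are hypothetical.

```python
def missing_from_enrichment(configured, enriched):
    """Return the configured asset names that were not enriched, sorted."""
    return sorted(set(configured) - set(enriched))


# Sample names for illustration only.
configured_scope = ["CUSTOMERS", "ITEMS", "ORDERS"]
enriched_assets = ["ITEMS", "ORDERS"]

print(missing_from_enrichment(configured_scope, enriched_assets))
# prints ['CUSTOMERS']
```

Any names that this check reports are candidates for a completed metadata import followed by a rerun of the enrichment.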
When metadata enrichment is run on a large number of data assets, processing might fail for a subset of those assets. For each asset that couldn't be enriched, an error message is written to the log of the metadata enrichment job so that you can identify the affected assets. You can then rerun enrichment on the assets for which processing failed.
For information about how to update, rerun, or delete a metadata enrichment, see Managing an existing metadata enrichment.
Learn more