Watson Knowledge Catalog on Cloud Pak for Data as a Service
Watson Knowledge Catalog, a core service of Cloud Pak for Data as a Service, includes a secure enterprise catalog management platform that provides high-quality data assets that are easy to find. The platform is supported by a data governance framework that you use to enrich assets with metadata.
Within Cloud Pak for Data as a Service, Watson Knowledge Catalog provides the data governance and privacy capabilities of the data fabric architecture.
You develop a knowledge core by curating data assets and enriching them with governance artifacts that describe their properties and meaning. Data stewards and data engineers curate data by importing metadata, preparing the data assets, enriching the data assets by assigning governance artifacts, and publishing the assets into catalogs. Some governance artifacts are predefined and are automatically assigned to data assets. Data stewards can create or import a business vocabulary to further enrich data assets during data curation. Knowledge Accelerators provide sets of ready-to-use business vocabulary for specific industries. You use categories to control who can create and use governance artifacts, and for what purpose.
You can create data protection rules that define how to protect data. Data protection rules are enforced automatically and uniformly in governed catalogs. You can configure data protection rules to mask sensitive data based on the content, format, or meaning of the data, or the identity of the users who access the data. When you mask data, you make the data available to users who are not authorized to view sensitive values, and you avoid the need to maintain multiple copies of the data.
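To make the masking idea concrete, here is a minimal conceptual sketch in Python of a rule that masks a column based on its classification and the identity of the user. It does not use the Watson Knowledge Catalog API. The names `ProtectionRule`, `apply_rules`, and `redact`, as well as the sample classification and group names, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


def redact(value: str) -> str:
    """Replace every character with 'X' (one simple masking method)."""
    return "X" * len(value)


@dataclass
class ProtectionRule:
    # Hypothetical rule: mask columns that carry this classification
    # unless the user belongs to one of the exempt groups.
    classification: str
    exempt_groups: Set[str]
    mask: Callable[[str], str] = redact


def apply_rules(
    row: Dict[str, str],
    column_classes: Dict[str, str],
    user_groups: Set[str],
    rules: List[ProtectionRule],
) -> Dict[str, str]:
    """Return a copy of the row with protected columns masked for this user."""
    masked = dict(row)  # the stored row itself is never modified
    for column, value in row.items():
        for rule in rules:
            matches = column_classes.get(column) == rule.classification
            exempt = bool(user_groups & rule.exempt_groups)
            if matches and not exempt:
                masked[column] = rule.mask(value)
                break  # the first matching rule wins in this sketch
    return masked


# Usage: one stored row, two different views of it.
rules = [ProtectionRule("email", {"privacy_officers"})]
row = {"name": "Ada", "contact": "ada@example.com"}
classes = {"contact": "email"}

analyst_view = apply_rules(row, classes, {"analysts"}, rules)
officer_view = apply_rules(row, classes, {"privacy_officers"}, rules)
```

In this sketch, both users read from the same stored row, and only the returned view differs. That mirrors the point above about not maintaining multiple copies of the data.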
You provide a self-service way to find and share assets across your enterprise with catalogs:
- Collaborators in a catalog have access to data assets without needing separate credentials or being able to see the credentials. Collaborators have roles that control what activities they can perform in the catalog.
- Data assets contain information about how to access the data, data classifications, assigned business terms and other governance artifacts, relationships with other assets, and ratings and reviews. Data assets can be relational data or unstructured data, such as PDF or Microsoft Office documents.
- Other types of assets in catalogs include operational assets, which data scientists create with tools to work with data, such as models, notebooks, and dashboards.
- Semantic search based on data asset metadata and properties and AI-powered recommendations help users find the data that they need.
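The collaborator model in the list above can be sketched as follows. This is a conceptual illustration only, not the service's implementation or API: the `Catalog` class, the `ROLE_ACTIVITIES` mapping, the activity names, and the `load_sales` loader are all assumptions made for the example, and the actual roles and permissions in the product may differ.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical mapping of roles to the activities they allow.
ROLE_ACTIVITIES: Dict[str, Set[str]] = {
    "viewer": {"find", "read"},
    "editor": {"find", "read", "add_asset"},
    "admin": {"find", "read", "add_asset", "manage_collaborators"},
}

Loader = Callable[[str], List[tuple]]


class Catalog:
    def __init__(self) -> None:
        self._roles: Dict[str, str] = {}
        # asset name -> (credential, loader); the credential stays inside the catalog
        self._assets: Dict[str, Tuple[str, Loader]] = {}

    def add_collaborator(self, user: str, role: str) -> None:
        if role not in ROLE_ACTIVITIES:
            raise ValueError(f"unknown role: {role}")
        self._roles[user] = role

    def can(self, user: str, activity: str) -> bool:
        role = self._roles.get(user)
        return role is not None and activity in ROLE_ACTIVITIES[role]

    def add_asset(self, user: str, name: str, credential: str, loader: Loader) -> None:
        if not self.can(user, "add_asset"):
            raise PermissionError(f"{user} cannot add assets")
        self._assets[name] = (credential, loader)

    def read(self, user: str, name: str) -> List[tuple]:
        """Fetch data on the user's behalf; the credential is never returned."""
        if not self.can(user, "read"):
            raise PermissionError(f"{user} cannot read assets")
        credential, loader = self._assets[name]
        return loader(credential)


def load_sales(credential: str) -> List[tuple]:
    # Stand-in for a real data source connection.
    if credential != "s3cr3t":
        raise PermissionError("bad credential")
    return [("2024-01", 100)]


catalog = Catalog()
catalog.add_collaborator("alice", "editor")
catalog.add_collaborator("bob", "viewer")
catalog.add_asset("alice", "sales", "s3cr3t", load_sales)
rows = catalog.read("bob", "sales")  # bob gets the data without seeing the credential
```

The design choice illustrated here is that the catalog, not the collaborator, holds the connection credential and checks the collaborator's role before each activity.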
Data scientists find assets in catalogs and then copy the assets into projects where they analyze data and build models with Watson Studio and Watson Machine Learning tools.
|Service|Description|
|---|---|
|Watson Query|Integrate data sources across multiple types and locations into one logical data view.|
|IBM Match 360 with Watson (Beta)|Get a consolidated, central view of your organization's key business facts, and manage master data throughout its lifecycle.|
|Watson Studio|Prepare, analyze, and model data in a collaborative environment with tools for data scientists, developers, and domain experts.|
|DataStage®|Use built-in search, automatic metadata propagation, and simultaneous highlighting of compilation errors to create, edit, load, and run jobs that transform and tailor information for your enterprise.|
Compatible data sources
See Connection types for a list of data source services that are compatible.