Watson Knowledge Catalog on Cloud Pak for Data as a Service


Watson Knowledge Catalog, a core service of Cloud Pak for Data as a Service, includes a secure enterprise catalog management platform that provides high-quality data assets that are easy to find. The platform is supported by a data governance framework that you use to ensure that data access complies with your business rules and standards by preventing unauthorized users from accessing sensitive information.

Watson Knowledge Catalog is part of Cloud Pak for Data as a Service and provides the data governance and privacy capabilities of the data fabric architecture.

[Figure: How Watson Knowledge Catalog fits into the service architecture for Cloud Pak for Data as a Service.]

You develop a knowledge core by curating data assets and enriching them with governance artifacts that describe their properties and meaning. Data stewards and data engineers curate data by importing metadata, preparing the data assets, enriching the data assets by assigning governance artifacts, and publishing the assets into catalogs. Some governance artifacts are predefined and are automatically assigned to data assets. Data stewards can create or import a business vocabulary to further enrich data assets during data curation. Knowledge Accelerators provide sets of ready-to-use business vocabulary for specific industries. You use categories to control who can create and use governance artifacts, and for what purpose.

You can create data protection rules that protect data across the platform. Data protection rules are applied automatically and uniformly. You can configure data protection rules to mask sensitive data based on the content, format, or meaning of the data, or the identity of the users who access the data. When you mask data, you unlock the data for users who are not authorized to view sensitive data and avoid the need to maintain multiple copies of the data.
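The idea behind such a rule can be illustrated with a small, self-contained Python sketch. This is not the Watson Knowledge Catalog API: the names `User`, `ProtectionRule`, `apply_rules`, the `steward` role, and the masking functions are all hypothetical and exist only to show how one rule set can serve differently masked views of a single copy of the data, depending on who is asking.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class User:
    """Hypothetical user with a set of role names."""
    name: str
    roles: frozenset


@dataclass(frozen=True)
class ProtectionRule:
    """Hypothetical rule: mask one column when the user matches a condition."""
    column: str
    applies_to: Callable[[User], bool]
    mask: Callable[[str], str]


def redact(value: str) -> str:
    """Replace every character with X, keeping only the length."""
    return "X" * len(value)


def obfuscate_email(value: str) -> str:
    """Hide the local part of an email address but keep the domain."""
    local, _, domain = value.partition("@")
    return "*" * len(local) + "@" + domain


def apply_rules(row: Dict[str, str], user: User,
                rules: List[ProtectionRule]) -> Dict[str, str]:
    """Return a masked copy of the row; the stored row is never modified."""
    masked = dict(row)
    for rule in rules:
        if rule.column in masked and rule.applies_to(user):
            masked[rule.column] = rule.mask(masked[rule.column])
    return masked


def not_steward(user: User) -> bool:
    """Assumed condition: anyone without the 'steward' role sees masked data."""
    return "steward" not in user.roles


RULES = [
    ProtectionRule("ssn", not_steward, redact),
    ProtectionRule("email", not_steward, obfuscate_email),
]

ROW = {"name": "Ada", "email": "ada@example.com", "ssn": "123-45-6789"}
analyst = User("ana", frozenset({"analyst"}))
steward = User("sam", frozenset({"steward"}))

analyst_view = apply_rules(ROW, analyst, RULES)
steward_view = apply_rules(ROW, steward, RULES)
```

In this sketch the analyst receives a row with the `ssn` and `email` values masked, while the steward receives the values unchanged. Both views are derived from the same stored row, which is the reason masking avoids maintaining multiple copies of the data.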

With catalogs, you provide a self-service way to find and share assets across your enterprise:

  • Collaborators in a catalog have access to data assets without needing separate credentials or being able to see the credentials. Collaborators have roles that control what activities they can perform in the catalog.
  • Data assets contain information about how to access the data, data classifications, assigned business terms and other governance artifacts, relationships with other assets, and ratings and reviews. Data assets can be relational data or unstructured data, such as PDF or Microsoft Office documents.
  • Other types of assets in catalogs include operational assets, which data scientists create with tools to work with data, such as models, notebooks, and dashboards.
  • Semantic search based on data asset metadata and properties and AI-powered recommendations help users find the data that they need.

Data scientists find assets in catalogs and then copy the assets into projects where they analyze data and build models with Watson Studio and Watson Machine Learning tools.

Integrated services

Table 1. Supplemental services. You can extend the functionality of this service with the following supplemental services, each of which requires that this service be installed.

Service                            Capability
Watson Query                       Integrate data sources across multiple types and locations into one logical data view.
IBM Match 360 with Watson (Beta)   Get a consolidated, central view of your organization's key business facts, and manage master data throughout its lifecycle.

Table 2. Related services. The following related services are often used with this service and provide complementary features, but they are not required.

Service         Capability
Watson Studio   Prepare, analyze, and model data in a collaborative environment with tools for data scientists, developers, and domain experts.

Compatible data sources

See Connection types for a list of data source services that are compatible.