Data fabric solution overview

When you implement the data fabric solution on Cloud Pak for Data, you can solve the challenges of data access, data quality, data governance, and data and AI lifecycle management.

The data fabric solution on Cloud Pak for Data provides these main capabilities for managing and automating your data and AI lifecycles:

Data access
Access your data across multiple clouds and on-premises in your existing data architecture.
Self-service consumption
Share and use data and other assets from across the enterprise in catalogs.
Accumulated knowledge
Understand your data through a common business vocabulary. Trust your data through history, lineage, and quality analysis.
Collaborative innovation
Collaborate with others to discover insights. Prepare data, analyze data, and build models with a set of integrated tools for all levels of experience.
Governance and compliance
Define rules to enforce data privacy. Track and document the detailed history of AI models to ensure compliance.
Unified lifecycle
Automate the building, testing, deploying, and monitoring of data pipelines and AI models.

The following illustration shows how the data fabric supports use cases on the Cloud Pak for Data platform by integrating access to hybrid data sources with capabilities in a single UI experience.

Image showing a data fabric with levels of use cases, capabilities, data sources, and platform

The value of assets

With the data fabric, you can transform data into assets that accumulate meaning and value. Assets are more than just data. When you first create a connection to a data source, you have basic information about how to access the data, the tables, schemas, and data values. You start adding value while you ingest data by virtualizing, transforming, or replicating it in workspaces called projects.

When you curate the data, you add metadata to your data assets. You profile the data to classify it and compile statistics about the values. You enrich assets with business vocabulary that describes the semantic meaning of the data for your organization. You analyze data quality. The metadata that you add during curation is considered active metadata, because it is generated automatically through machine learning processes. When you rerun curation after your data changes, the metadata is updated based on automated data analysis.

As users use the assets in projects, they create the third level of meaning that describes the history of how the asset is used and the relationships between assets. Users can analyze the data in notebooks or dashboards, or train machine learning models.

Users can also add information to assets, such as ratings and reviews, visualizations of the data, tags, and other relationships.
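The levels of meaning described above can be pictured as a small data structure. The following Python sketch is illustrative only: the Asset class, its fields, and the sample values are hypothetical and are not part of any Cloud Pak for Data API. It assumes three levels (connection information, curation metadata, and usage history) plus user-added information such as tags and ratings.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical model of an asset; not a Cloud Pak for Data API."""
    name: str
    # Level 1: basic access information that comes with the connection.
    connection: dict
    # Level 2: metadata added during curation (profiling, vocabulary, quality).
    curation: dict = field(default_factory=dict)
    # Level 3: history of how the asset is used and what it relates to.
    usage: list = field(default_factory=list)
    # Information that users add directly.
    tags: set = field(default_factory=set)
    ratings: list = field(default_factory=list)

    def curate(self, classification, business_terms, quality_score):
        # Rerunning curation replaces the earlier curation metadata.
        self.curation = {
            "classification": classification,
            "business_terms": list(business_terms),
            "quality_score": quality_score,
        }

    def record_use(self, tool, related_asset=None):
        # Each use appends to the history instead of overwriting it.
        self.usage.append({"tool": tool, "related_asset": related_asset})

    def levels_of_meaning(self):
        # Connection information is always present; the other levels are optional.
        return 1 + int(bool(self.curation)) + int(bool(self.usage))


# Sample values are invented for illustration.
orders = Asset("ORDERS", {"source": "example-db", "schema": "SALES", "table": "ORDERS"})
orders.curate("Customer data", ["Order", "Customer ID"], 0.92)
orders.record_use("notebook", related_asset="churn-model")
orders.tags.add("sales")
orders.ratings.append(5)
print(orders.levels_of_meaning())  # 3
```

In this sketch, curation overwrites its metadata on each run, which mirrors how metadata is updated when you rerun curation, while usage history only grows.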

The following image shows how data assets accumulate value in a data fabric.

Image showing how a data asset accumulates value

Models are also assets. You can track deployments and input data for the model, comparisons between models, compliance with corporate protocols, and other performance metrics.

Use cases

Cloud Pak for Data as a Service provides four use cases as parts of the data fabric solution. You implement the data fabric as represented in each use case by creating one or more service instances that provide features and tools. Some services are included in multiple use cases.

Image showing the four data fabric use cases

Use cases represent ways to implement part of the data fabric solution so that your teams can start working while you build out other parts. You can start with any use case and add the others as you need them:

  • If you have a more mature data governance model, start by establishing your business vocabulary, as described in the Data governance use case.
  • If you want quicker time-to-value, start with data virtualization or data science, as described in the Data integration and the Data Science and MLOps use cases.
  • If you need to ensure that your models are compliant with your organization's goals and regulations, start tracking your models, as described in the AI governance use case.

Explore each use case to learn about what you can accomplish and the tools you can use.

Data governance

Implement governance based on metadata that provides business knowledge and defines data protection. Provide high-quality data assets in self-service catalogs. Automate enforcement of data governance for regulatory compliance.

Services for this use case: Watson Knowledge Catalog and IBM Match 360 with Watson.

See Data governance use case.

Data integration

Simplify and automate access to all your data, without moving it. Orchestrate data across a distributed landscape to create a network of instantly available information for data consumers.

Services for this use case: Watson Query, DataStage, and Watson Knowledge Catalog.

See Data integration use case.

Data Science and MLOps

Operationalize data analysis and model creation with an automated workflow that prepares data, builds, deploys, monitors, and retrains models.

Services for this use case: Watson Studio, Watson Machine Learning, Watson OpenScale, and Watson Knowledge Catalog.

See Data Science and MLOps use case.

AI governance

Operationalize AI governance with an automated workflow that enforces fairness, quality, and explainability in your models.

Services for this use case: Watson Studio, Watson Machine Learning, Watson OpenScale, and Watson Knowledge Catalog.

See AI governance use case.
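Because some services are included in multiple use cases, adding a use case might require fewer new service instances than its list suggests. The following Python sketch records the service lists from the four use cases above and computes the overlap. The function names shared_services and services_to_add are hypothetical helpers written for this illustration, not part of any product API.

```python
from collections import Counter

# Use case -> services, copied from the use case descriptions above.
USE_CASE_SERVICES = {
    "Data governance": {
        "Watson Knowledge Catalog",
        "IBM Match 360 with Watson",
    },
    "Data integration": {
        "Watson Query",
        "DataStage",
        "Watson Knowledge Catalog",
    },
    "Data Science and MLOps": {
        "Watson Studio",
        "Watson Machine Learning",
        "Watson OpenScale",
        "Watson Knowledge Catalog",
    },
    "AI governance": {
        "Watson Studio",
        "Watson Machine Learning",
        "Watson OpenScale",
        "Watson Knowledge Catalog",
    },
}


def shared_services():
    """Return the services that appear in more than one use case, sorted by name."""
    counts = Counter(
        service
        for services in USE_CASE_SERVICES.values()
        for service in services
    )
    return sorted(service for service, n in counts.items() if n > 1)


def services_to_add(implemented, new_use_case):
    """Return the services still missing when a use case is added (hypothetical helper)."""
    have = set()
    for use_case in implemented:
        have |= USE_CASE_SERVICES[use_case]
    return sorted(USE_CASE_SERVICES[new_use_case] - have)


print(shared_services())
print(services_to_add(["Data governance"], "Data integration"))
print(services_to_add(["Data Science and MLOps"], "AI governance"))
```

In this model, a team that already implements the Data Science and MLOps use case needs no additional services to start on AI governance, because both use cases list the same four services.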
