Data fabric solution overview
When you implement the data fabric solution on Cloud Pak for Data, you can address challenges in data access, data quality, data governance, and the management of your data and AI lifecycles.
The data fabric solution on Cloud Pak for Data provides these main capabilities for managing and automating your data and AI lifecycles:
- Access your data across multiple clouds and on-premises in your existing data architecture.
- Share and use data and other assets from across the enterprise in catalogs.
- Understand your data through a common business vocabulary. Trust your data through history, lineage, and quality analysis.
- Collaborate with others to discover insights. Prepare data, analyze data, and build models with a set of integrated tools for all levels of experience.
- Data governance, security, and compliance: Automatically enforce uniform data privacy across the platform.
- Automate the building, testing, deploying, and monitoring of data pipelines and AI models.
The value of assets
With the data fabric, you can transform data into assets that accumulate meaning and value. Assets are more than just data. When you first create a connection to a data source, you have basic information about how to access the data, the tables, schemas, and data values. You start adding value while you ingest data by virtualizing, transforming, or replicating it in workspaces called projects.
When you curate the data, you add metadata to your data assets. You profile the data to classify it and compile statistics about the values. You enrich assets with business terms that describe the semantic meaning of the data for your organization. You analyze data quality. When you publish the assets into a catalog to share them with your organization, the assets are automatically protected by the rules that you create to control who can access which data.
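The curation steps above (profiling, enrichment with business terms, and rule-based protection) can be sketched in plain Python. This is only a conceptual illustration, not the Cloud Pak for Data API: `DataAsset`, `profile_asset`, and `apply_protection` are hypothetical names, and the masking rule is an assumed, simplified stand-in for a data protection rule.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class DataAsset:
    """Hypothetical stand-in for a curated data asset (not a product API)."""
    name: str
    rows: list[dict]
    profile: dict = field(default_factory=dict)         # column -> statistics
    business_terms: dict = field(default_factory=dict)  # column -> business term


def profile_asset(asset: DataAsset) -> None:
    """Compile simple per-column statistics and assign a naive data class."""
    columns = asset.rows[0].keys() if asset.rows else []
    for col in columns:
        values = [row[col] for row in asset.rows if row[col] is not None]
        numeric = all(isinstance(v, (int, float)) for v in values)
        stats = {
            "count": len(values),
            "distinct": len(set(values)),
            "data_class": "numeric" if numeric else "text",
        }
        if numeric and values:
            stats["mean"] = mean(values)
        asset.profile[col] = stats


def apply_protection(asset, role, protected_terms, allowed_roles):
    """Return rows with protected columns masked unless the role is allowed.

    Assumed rule: any column tagged with a protected business term is masked.
    """
    if role in allowed_roles:
        return [dict(row) for row in asset.rows]
    masked = {c for c, t in asset.business_terms.items() if t in protected_terms}
    return [
        {c: ("****" if c in masked else v) for c, v in row.items()}
        for row in asset.rows
    ]


customers = DataAsset(
    "customers",
    [{"email": "a@example.com", "age": 30}, {"email": "b@example.com", "age": 40}],
)
profile_asset(customers)                               # profile and classify
customers.business_terms["email"] = "Email address"    # enrich with a term
visible = apply_protection(
    customers, "analyst", {"Email address"}, {"data steward"}
)
```

In this sketch the rule keys on the business term rather than the column name, which mirrors the idea that protection follows the meaning of the data and so applies wherever the asset is used.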
As users find data assets in catalogs and use the assets in projects, they add a third level of meaning: the history of how each asset is used, the lineage of the data, and the relationships between assets. Users can synthesize data into a 360-degree view of customers, analyze the data in notebooks or dashboards, or train machine learning models.
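One way to picture the usage history and lineage described above is as a small directed graph in which each recorded activity links a source asset to the asset derived from it. The following is a minimal sketch under that assumption; `LineageGraph` and the asset names are hypothetical and do not reflect how the platform stores lineage.

```python
from collections import defaultdict


class LineageGraph:
    """Hypothetical in-memory lineage store (illustration only)."""

    def __init__(self):
        self._parents = defaultdict(set)  # asset -> direct source assets
        self.history = []                 # ordered (source, target, activity)

    def record(self, source, target, activity):
        """Record that `target` was derived from `source` by `activity`."""
        self._parents[target].add(source)
        self.history.append((source, target, activity))

    def upstream(self, asset):
        """Return every asset that `asset` depends on, directly or indirectly."""
        seen = set()
        stack = [asset]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


lineage = LineageGraph()
lineage.record("crm_db.customers", "customers_clean", "transform")
lineage.record("customers_clean", "churn_model", "train")
sources_of_model = lineage.upstream("churn_model")
```

Walking the graph upstream from a model answers the question "which data did this model ultimately come from?", which is the kind of relationship the third level of meaning captures.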
The following image shows how data assets increase in value through the data fabric.
Models are also assets. You can track deployments and input data for the model, comparisons between models, and compliance with corporate protocols.
Cloud Pak for Data as a Service provides four use cases as parts of the data fabric solution. You implement the data fabric as represented in each use case by creating one or more service instances that provide features and tools. Some services are included in multiple use cases.
Use cases represent ways to implement part of the data fabric solution so that your teams can start working while you build out other parts. You can start with any use case and add the others as you need them:
- If you have a more mature governance model, start by building a governance foundation, as described in the data governance and privacy use case.
- If you want quicker time-to-value, start with data virtualization or data science, as described in the multicloud data integration and the MLOps and trustworthy AI use cases.
- If you are focused on a customer-centric transformation, start consolidating your customer data, as described in the customer 360 use case.
Explore each use case to learn about what you can accomplish and the tools you can use.
Multicloud data integration
Simplify and automate access to all your data, without moving it. Orchestrate data across a distributed landscape to create a network of instantly available information for data consumers.
Services for this use case: Watson Query, DataStage, and Watson Knowledge Catalog.
Data governance and privacy
Implement governance based on metadata that provides business knowledge and defines data protection. Provide high-quality data assets in self-service catalogs. Automate enforcement of data governance for regulatory compliance.
Service for this use case: Watson Knowledge Catalog.
Customer 360
Create a comprehensive view of your customers that is augmented by AI-driven insights to enable smarter customer interactions.
Services for this use case: IBM Match 360 with Watson and Watson Knowledge Catalog.
See Customer 360.
MLOps and trustworthy AI
Build and operationalize AI with an automated and governed workflow that enforces fairness, quality, and explainability in your models.
Services for this use case: Watson Studio, Watson Machine Learning, Watson OpenScale, and Watson Knowledge Catalog.
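Because some services appear in more than one use case, it can help to work out which service instances a given combination of use cases requires. The sketch below encodes the service lists stated above as a plain Python mapping; the helper names `services_for` and `shared_services` are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Service lists as stated for each use case in this topic.
USE_CASE_SERVICES = {
    "Multicloud data integration": {
        "Watson Query", "DataStage", "Watson Knowledge Catalog",
    },
    "Data governance and privacy": {"Watson Knowledge Catalog"},
    "Customer 360": {"IBM Match 360 with Watson", "Watson Knowledge Catalog"},
    "MLOps and trustworthy AI": {
        "Watson Studio", "Watson Machine Learning",
        "Watson OpenScale", "Watson Knowledge Catalog",
    },
}


def services_for(use_cases):
    """Return the distinct services needed for the chosen use cases."""
    needed = set()
    for use_case in use_cases:
        needed |= USE_CASE_SERVICES[use_case]
    return needed


def shared_services():
    """Return the services that are listed for more than one use case."""
    counts = Counter(
        service
        for services in USE_CASE_SERVICES.values()
        for service in services
    )
    return {service for service, n in counts.items() if n > 1}


# Example: start with governance, then add Customer 360.
plan = services_for(["Data governance and privacy", "Customer 360"])
```

With the lists above, Watson Knowledge Catalog is the only service common to several use cases, so adding a second use case after governance requires only the services that are unique to it.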
- Multicloud data integration use case
- Data governance and privacy use case
- Customer 360 use case
- MLOps and trustworthy AI use case
- Data fabric tutorials
- What is a data fabric?