Use cases overview
You can use Cloud Pak for Data with different services to implement use cases that help you build a trusted data foundation for your AI operations.
Data fabric solution overview
When you implement the data fabric solution on Cloud Pak for Data, you can solve the challenges of data access, data quality, data governance, and managing your data lifecycles.
The data fabric solution on Cloud Pak for Data provides these main capabilities for managing and automating your data lifecycles:
- Data access: Access your data across multiple clouds and on-premises in your existing data architecture.
- Self-service consumption: Share and use data and other assets from across the enterprise in catalogs.
- Accumulated knowledge: Understand your data through a common business vocabulary. Trust your data through history, lineage, and quality analysis.
- Collaborative innovation: Collaborate with others to discover insights. Prepare and analyze data with a set of integrated tools for all levels of experience.
- Governance and compliance: Define rules to enforce data privacy.
- Unified lifecycle: Automate the building, testing, deploying, and monitoring of data pipelines.
The following illustration shows how the data fabric supports use cases on the Cloud Pak for Data platform (both cloud and on premises) by integrating access to hybrid data sources (such as a data lakehouse, data warehouse, data lake, database, or business application) with capabilities in a single UI experience.
The value of assets
With the data fabric, you can transform data into assets that accumulate meaning and value. Assets are more than just data. When you first create a connection to a data source, you have basic information about how to access the data, the tables, schemas, and data values. You start adding value while you ingest data by virtualizing, transforming, or replicating it in workspaces called projects.
When you curate the data, you add metadata to your data assets. You profile the data to classify it and compile statistics about the values. You enrich assets with business vocabulary that describes the semantic meaning of the data for your organization. You analyze data quality. The metadata that you add during curation is considered active metadata because it is generated automatically through machine learning processes. When you rerun curation after your data changes, the metadata is updated based on automated data analysis.
As users work with the assets in projects, they create a third level of meaning: the history of how each asset is used and the relationships between assets. Users can analyze the data in notebooks or dashboards, or train machine learning models.
Users can also add information to assets, such as ratings and reviews, visualizations of the data, tags, and other relationships.
The following image shows how data assets accumulate value in a data fabric by adding descriptive information (data profile, data quality, and business vocabulary), usage information (actions on the data and relationships), and user-added information (ratings and reviews, visualizations, and relationships) to the basic information (type of data, format and schema, and where the data resides) about the data asset.
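The layers of information described above can be pictured as a simple data structure. The following Python sketch is only an illustration: the `DataAsset` class, its field names, and the `layers_present` method are hypothetical and do not correspond to any Cloud Pak for Data API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataAsset:
    """Hypothetical model of an asset; not a Cloud Pak for Data API."""

    # Basic information, available when the connection is created.
    name: str
    data_type: str
    schema: dict
    location: str
    # Descriptive information, added during curation.
    profile: dict = field(default_factory=dict)
    quality_score: Optional[float] = None
    business_terms: list = field(default_factory=list)
    # Usage information, recorded as the asset is used in projects.
    usage_history: list = field(default_factory=list)
    # User-added information.
    ratings: list = field(default_factory=list)
    tags: list = field(default_factory=list)

    def layers_present(self) -> list:
        """Return the names of the information layers that hold any data."""
        layers = ["basic"]
        if self.profile or self.quality_score is not None or self.business_terms:
            layers.append("descriptive")
        if self.usage_history:
            layers.append("usage")
        if self.ratings or self.tags:
            layers.append("user-added")
        return layers


# Example values are invented for illustration.
asset = DataAsset(
    name="customers",
    data_type="table",
    schema={"id": "integer", "email": "varchar"},
    location="warehouse/sales",
)
asset.business_terms.append("Customer")        # curation
asset.usage_history.append("read by notebook")  # usage in a project
asset.tags.append("crm")                        # user-added

print(asset.layers_present())
# ['basic', 'descriptive', 'usage', 'user-added']
```

A newly connected asset would report only the `basic` layer; each later activity adds another layer without changing the underlying data.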
Data fabric use cases
Cloud Pak for Data provides several use cases as part of the data fabric solution. You implement the data fabric as represented in each use case by installing one or more services that provide features and tools. Some services are included in multiple use cases.
Use cases represent ways to implement part of the data fabric solution so that your team can start working while you build out other parts. You can start with any use case and add the others as you need them:
- If you have a more mature data governance model, start by establishing your business vocabulary, as described in the Data governance use case.
- If you want quicker time-to-value, start with data virtualization or data replication, as described in the Data integration use case.
- If you need to ensure that your users and systems have a total, trusted, and unified view of your customer data, start by matching and consolidating your record data into discrete entities, as described in the Master data management use case.
Explore each use case to learn about what you can accomplish and the tools you can use.
Data governance
Implement governance based on metadata that provides business knowledge and defines data protection. Provide high-quality data assets in self-service catalogs. Automate enforcement of data governance for regulatory compliance.
Service for this use case: IBM Knowledge Catalog.
Data integration
Simplify and automate access to all your data, without moving it. Orchestrate data across a distributed landscape to create a network of instantly available information for data consumers.
Services for this use case: Watson Query, DataStage, and IBM Knowledge Catalog.
Master data management
Build a consolidated view of customers and record data by connecting data across domains and matching it to create master data entities.
Service for this use case: IBM Match 360 with Watson.
Build and govern AI use cases
When you implement the build and govern AI use cases, you can solve the challenges of building models, governing AI, and managing your AI lifecycles.
The build and govern AI use cases on Cloud Pak for Data provide these main capabilities for managing and automating your AI lifecycles:
- Collaborative innovation: Collaborate with others to discover insights. Prepare data, analyze data, and build models with a set of integrated tools for all levels of experience.
- Governance and compliance: Track and document the detailed history of AI models to help ensure compliance.
- Unified lifecycle: Automate the building, testing, deploying, and monitoring of AI models.
Cloud Pak for Data as a Service provides two build and govern AI use cases. You implement each use case by creating one or more service instances that provide features and tools. Some services are included in multiple use cases.
You can start with either use case and add the other one as you need it:
- If you want quicker time-to-value, start with data science, as described in the Data Science and MLOps use case.
- If you need to ensure that your models are compliant with your organization's goals and regulations, start tracking your models, as described in the AI governance use case.
Explore each use case to learn about what you can accomplish and the tools you can use.
Data Science and MLOps
Operationalize data analysis and model creation with an automated workflow that prepares data and then builds, deploys, monitors, and retrains models.
Services for this use case: Watson Studio, Watson Machine Learning, Watson OpenScale, and IBM Knowledge Catalog.
AI governance
Operationalize AI governance with an automated workflow that enforces fairness, quality, and explainability in your models.
Services for this use case: Watson Studio, Watson Machine Learning, Watson OpenScale, and IBM Knowledge Catalog.
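Because some services are included in multiple use cases, it can help to see the overlap at a glance when you plan which use case to start with. The following Python sketch records the use-case-to-service lists given on this page and reports which services appear in more than one use case. The `USE_CASE_SERVICES` name and the `shared_services` function are hypothetical helpers for illustration, not part of any product API.

```python
from collections import defaultdict

# Services per use case, as listed on this page.
USE_CASE_SERVICES = {
    "Data governance": ["IBM Knowledge Catalog"],
    "Data integration": ["Watson Query", "DataStage", "IBM Knowledge Catalog"],
    "Master data management": ["IBM Match 360 with Watson"],
    "Data Science and MLOps": [
        "Watson Studio",
        "Watson Machine Learning",
        "Watson OpenScale",
        "IBM Knowledge Catalog",
    ],
    "AI governance": [
        "Watson Studio",
        "Watson Machine Learning",
        "Watson OpenScale",
        "IBM Knowledge Catalog",
    ],
}


def shared_services(mapping):
    """Return {service: [use cases]} for services used by more than one use case."""
    use_cases_by_service = defaultdict(list)
    for use_case, services in mapping.items():
        for service in services:
            use_cases_by_service[service].append(use_case)
    return {
        service: use_cases
        for service, use_cases in use_cases_by_service.items()
        if len(use_cases) > 1
    }


for service, use_cases in sorted(shared_services(USE_CASE_SERVICES).items()):
    print(f"{service}: {', '.join(use_cases)}")
```

In this listing, IBM Knowledge Catalog appears in four of the five use cases, so setting it up for one use case also covers part of the others.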
Learn more
- What is a data fabric?
- Signing up for the data fabric trials
- Data governance use case
- Data integration use case
- Master data management use case
- AI governance use case
- Data Science and MLOps use case
- What is data observability?
- Use case tutorials