Data preparation | IBM Cloud Pak for Data as a Service

Jupyter notebook editor

Prepare data

Visualize data

Build models

Deploy assets

Create a notebook in which you run Python, R, or Scala code to prepare, visualize, and analyze data, or build a model.

AutoAI

Build models

Automatically analyze your tabular data and generate candidate model pipelines customized for your predictive modeling problem.

SPSS Modeler

Prepare data

Visualize data

Build models

Create a visual flow that uses modeling algorithms to prepare data and build and train a model, using a guided approach to machine learning that doesn’t require coding.

Decision Optimization

Build models

Visualize data

Deploy assets

Create and manage scenarios to find the best solution to your optimization problem by comparing different combinations of your model, data, and solutions.

Data Refinery

Prepare data

Visualize data

Create a flow of ordered operations to cleanse and shape data. Visualize data to identify problems and discover insights.
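
The idea of a flow of ordered operations can be sketched in plain Python. This is only an illustration of the concept, not the Data Refinery API: the helper names (`drop_missing`, `trim_text`, `to_upper`, `run_flow`) and the sample rows are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Operation = Callable[[List[Row]], List[Row]]


def drop_missing(column: str) -> Operation:
    """Build an operation that removes rows where `column` is None or empty."""
    def op(rows: List[Row]) -> List[Row]:
        return [r for r in rows if r.get(column) not in (None, "")]
    return op


def trim_text(column: str) -> Operation:
    """Build an operation that strips surrounding whitespace in `column`."""
    def op(rows: List[Row]) -> List[Row]:
        return [{**r, column: str(r[column]).strip()} for r in rows]
    return op


def to_upper(column: str) -> Operation:
    """Build an operation that upper-cases the text in `column`."""
    def op(rows: List[Row]) -> List[Row]:
        return [{**r, column: str(r[column]).upper()} for r in rows]
    return op


def run_flow(rows: List[Row], operations: List[Operation]) -> List[Row]:
    """Apply each operation in order; each step sees the previous step's output."""
    for operation in operations:
        rows = operation(rows)
    return rows


# Hypothetical sample data with typical problems: padding and a missing name.
raw = [
    {"name": "  alice ", "country": "us"},
    {"name": "", "country": "de"},
    {"name": "bob", "country": " fr "},
]

flow = [
    drop_missing("name"),
    trim_text("name"),
    trim_text("country"),
    to_upper("country"),
]

cleaned = run_flow(raw, flow)
print(cleaned)
```

Because every operation returns new rows instead of changing its input, the order of the list is the only thing that determines the result, and the raw data stays untouched.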

Orchestration Pipelines

Prepare data

Build models

Deploy assets

Automate the model lifecycle, including preparing data, training models, and creating deployments.

RStudio

Prepare data

Build models

Deploy assets

Work with R notebooks and scripts in an integrated development environment.

Federated learning

Build models

Create a federated learning experiment to train a common model on a set of remote data sources. Share training results without sharing data.
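
To make "share training results without sharing data" concrete, here is a minimal sketch that assumes weighted averaging of locally trained weights (often called federated averaging). It is one common aggregation scheme, not necessarily the one the service uses, and the function names, the one-parameter model `y = w * x`, and the data are all hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pairs that never leave the party


def local_update(w: float, data: List[Sample],
                 lr: float = 0.01, epochs: int = 5) -> Tuple[float, int]:
    """Train y = w * x on one party's private data with gradient descent.

    Only the updated weight and the sample count are returned.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w, len(data)


def aggregate(updates: List[Tuple[float, int]]) -> float:
    """Combine party weights, weighting each by its number of samples."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Two hypothetical parties whose data follows y = 2 * x.
parties = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]

global_w = 0.0
for _ in range(20):
    # Each party trains locally, starting from the current common model.
    updates = [local_update(global_w, data) for data in parties]
    # The aggregator sees weights and counts only, never the samples.
    global_w = aggregate(updates)

print(round(global_w, 4))
```

The aggregator only ever handles `(weight, sample_count)` pairs, which is the property the description refers to.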

Deployments

Deploy assets

Monitor models

Deploy and run your data science and AI solutions in a test or production environment.

Catalogs

Catalog data

Governance

Find and share your data and other assets.

Metadata import

Prepare data

Catalog data

Governance

Import asset metadata from a connection into a project or a catalog.

Metadata enrichment

Prepare data

Catalog data

Governance

Enrich imported asset metadata with business context, data profiling, and quality assessment.

Data quality rules

Prepare data

Governance

Measure and monitor the quality of your data.
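
As a rough illustration of what measuring quality can mean, a rule can be treated as a predicate over a row, and the score as the fraction of rows that pass. This is a sketch under that assumption only; the rule names, the scoring convention, and the sample records are hypothetical and do not reflect the product's rule syntax.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Rule = Callable[[Row], bool]


def quality_score(rows: List[Row], rule: Rule) -> float:
    """Return the fraction of rows that satisfy the rule (1.0 for no rows)."""
    if not rows:
        return 1.0
    passed = sum(1 for row in rows if rule(row))
    return passed / len(rows)


def age_in_range(row: Row) -> bool:
    """Hypothetical rule: age must be an integer between 0 and 120."""
    age = row.get("age")
    return isinstance(age, int) and 0 <= age <= 120


def email_has_at_sign(row: Row) -> bool:
    """Hypothetical rule: email must be a string that contains '@'."""
    email = row.get("email")
    return isinstance(email, str) and "@" in email


records = [
    {"age": 34, "email": "ana@example.com"},
    {"age": 150, "email": "li@example.com"},
    {"age": 27, "email": "no-at-sign"},
    {"age": 61, "email": None},
]

scores = {
    "age_in_range": quality_score(records, age_in_range),
    "email_has_at_sign": quality_score(records, email_has_at_sign),
}
print(scores)
```

Running the same rules on a schedule and comparing the scores over time is one simple way to monitor quality rather than measure it once.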

Masking flow

Prepare data

Create and run masking flows to prepare copies of data assets that are masked by advanced data protection rules.
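
The following sketch shows the general idea of producing a masked copy of a data asset. It assumes a simple salted-hash substitution, which is only one possible masking method and is not taken from the product; `mask_value`, `masked_copy`, the salt, and the sample rows are hypothetical.

```python
import hashlib
from typing import Dict, List

Row = Dict[str, object]


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a value with a short, repeatable token derived from a hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def masked_copy(rows: List[Row], columns_to_mask: List[str]) -> List[Row]:
    """Return a copy of the rows with the listed columns masked.

    The original rows are left unchanged.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # shallow copy so the source row is not modified
        for column in columns_to_mask:
            if new_row.get(column) is not None:
                new_row[column] = mask_value(str(new_row[column]))
        result.append(new_row)
    return result


customers = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": "li@example.com"},
]

masked = masked_copy(customers, ["email"])
print(masked)
```

Because the same input always maps to the same token, masked columns can still be joined or grouped, while the original values stay in the source asset only.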

Governance

Governance

Create your business vocabulary to enrich assets and rules to protect data.

Data lineage

Governance

Track data movement and usage to provide transparency and help determine data accuracy.

AI factsheet

Governance

Monitor models

Track AI models from request to production.

DataStage flow

Prepare data

Create a flow with a set of connectors and stages to transform and integrate data. Provide enriched and tailored information for your enterprise.

Data virtualization

Prepare data

Create a virtual table to segment or combine data from one or more tables.
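
A SQL view is a close analogy for a virtual table: it combines or segments existing tables without copying their rows. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in, not the Data virtualization service itself; the table names, view names, and values are hypothetical.

```python
import sqlite3

# In-memory database standing in for two separate source tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders_eu (id INTEGER, amount REAL);
    CREATE TABLE orders_us (id INTEGER, amount REAL);
    INSERT INTO orders_eu VALUES (1, 10.0), (2, 250.0);
    INSERT INTO orders_us VALUES (3, 75.0), (4, 300.0);

    -- Combine: one virtual table over both source tables.
    CREATE VIEW all_orders AS
        SELECT id, amount, 'EU' AS region FROM orders_eu
        UNION ALL
        SELECT id, amount, 'US' AS region FROM orders_us;

    -- Segment: a virtual table that exposes only part of the data.
    CREATE VIEW large_orders AS
        SELECT id, region FROM all_orders WHERE amount >= 100;
    """
)

combined_count = conn.execute("SELECT COUNT(*) FROM all_orders").fetchone()[0]
large = conn.execute(
    "SELECT id, region FROM large_orders ORDER BY id"
).fetchall()
print(combined_count, large)
```

Neither view stores data of its own, so any change to `orders_eu` or `orders_us` is visible the next time the view is queried.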

OpenScale

Monitor models

Measure outcomes from your AI models and help ensure the fairness, explainability, and compliance of all your models.

Data replication

Prepare data

Replicate data to target systems with low latency, transactional integrity, and optimized data capture.

Master data

Prepare data

Consolidate data from the disparate sources that fuel your business and establish a single, trusted, 360-degree view of your customers.