DataStage on Cloud Pak for Data as a Service



IBM DataStage is a data integration tool for designing, developing, and running jobs that move and transform data.

DataStage is one of the data integration components of Cloud Pak for Data. The DataStage service is fully integrated into Cloud Pak for Data as a Service as part of the data fabric. It provides a graphical framework for developing the jobs that move data from source systems to target systems. The transformed data can be delivered to data warehouses, data marts, operational data stores, real-time web services, messaging systems, and other enterprise applications. DataStage supports both extract, transform, and load (ETL) and extract, load, and transform (ELT) patterns. DataStage uses parallel processing and enterprise connectivity to scale as data volumes grow.
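
To make the difference between the ETL and ELT patterns concrete, the following minimal Python sketch uses the standard-library sqlite3 module as a stand-in target system. It is not DataStage code (DataStage flows are designed in the graphical framework), and the table names, sample rows, and function names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical source rows: (customer name, amount as text)
SOURCE_ROWS = [(" alice ", "10.50"), ("BOB", "3"), ("carol", "7.25")]


def run_etl(conn, rows):
    """ETL: transform outside the target, then load the cleaned rows."""
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    # Transform step runs in the integration layer (here, plain Python).
    cleaned = [(name.strip().lower(), float(amount)) for name, amount in rows]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)
    return conn.execute(
        "SELECT customer, amount FROM sales_etl ORDER BY customer"
    ).fetchall()


def run_elt(conn, rows):
    """ELT: load the raw rows first, then transform inside the target."""
    conn.execute("CREATE TABLE sales_raw (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", rows)
    # Transform step is pushed down to the target as SQL.
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT lower(trim(customer)) AS customer, "
        "CAST(amount AS REAL) AS amount FROM sales_raw"
    )
    return conn.execute(
        "SELECT customer, amount FROM sales_elt ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Both patterns should produce the same cleaned rows.
    print(run_etl(conn, SOURCE_ROWS) == run_elt(conn, SOURCE_ROWS))
```

In this sketch both functions yield the same cleaned rows. The only difference is where the transformation runs: before the load in ETL, and inside the target after the load in ELT.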

DataStage is part of Cloud Pak for Data as a Service and provides the data integration capabilities of the data fabric architecture.

A diagram depicting how DataStage fits into the service architecture for Cloud Pak for Data as a Service.

With the DataStage parallel engine (PX) remote runtime as-a-service, you can run jobs in IBM Cloud and on prebuilt remote locations that are managed by IBM. By using a remote location as your environment, you can fully or partially eliminate the need to move or copy data from other public clouds. By bringing workloads to the data’s location, you improve performance, satisfy data residency requirements, and incur lower data transfer costs.

With DataStage, your company can accomplish these goals:

  • Design data flows that extract information from multiple source systems, transform the data as required, and deliver the data to target databases or applications.
  • Connect directly to enterprise applications as sources or targets to ensure that the data is relevant, complete, and accurate.
  • Reduce development time and improve the consistency of design and deployment by using prebuilt functions.
  • Minimize the project delivery cycle by working with a common set of tools across Watson Studio.

This service adds a tool in projects.

Integrated services

Table 1. Related services. The following related services are often used with this service and provide complementary features, but they are not required.

Service | Capability
IBM® Knowledge Catalog | Create catalogs of curated assets with this secure enterprise catalog management platform that is supported by a data governance framework.
Watson™ Studio | Prepare, analyze, and model data in a collaborative environment with tools for data scientists, developers, and domain experts.