Combine Records stage in DataStage

The Combine Records stage is a restructure stage. This stage combines records (that is, rows), in which particular key-column values are identical, into vectors of subrecords.

The Combine Records stage can have a single input link and a single output link.

As input, the stage takes a data set in which one or more columns are chosen as keys. All adjacent records whose key columns contain the same value are gathered into the same output record in the form of subrecords.
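The grouping behavior described above can be illustrated with a minimal Python sketch. This is not DataStage code: the function name combine_records, the output field name subrec, and the sample column names are hypothetical, and the output layout (one field holding the vector) is a simplifying assumption. It only shows that adjacent rows with equal key values end up in one output record as a vector of subrecords.

```python
from itertools import groupby


def combine_records(rows, key_columns, subrec_name="subrec"):
    """Gather ADJACENT rows with identical key values into one output row.

    Each output row holds a single field (subrec_name, a hypothetical name)
    containing the vector (list) of the original rows as subrecords.
    """
    def key_of(row):
        return tuple(row[k] for k in key_columns)

    combined = []
    # groupby only merges consecutive items with equal keys, which mirrors
    # the "adjacent records" wording in the stage description.
    for _, group in groupby(rows, key=key_of):
        combined.append({subrec_name: [dict(r) for r in group]})
    return combined


rows = [
    {"keycol": 1, "value": "a"},
    {"keycol": 1, "value": "b"},
    {"keycol": 2, "value": "c"},
    {"keycol": 1, "value": "d"},  # same key as the first rows, but not adjacent
]

result = combine_records(rows, ["keycol"])
for record in result:
    print(record)
```

In this sketch the last row is not merged with the first two, because it is not adjacent to them. This is why the input needs to be sorted on the key columns before records are combined.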

Figure: Columns being combined into a vector of subrecords.

The data set input to the Combine Records stage must be key partitioned and sorted. This ensures that rows with the same key column values are located in the same partition, are adjacent within that partition, and are processed by the same node. Choosing the (auto) partitioning method ensures that partitioning and sorting are done. If sorting and partitioning are carried out in separate stages before the Combine Records stage, DataStage® in auto mode detects this and does not repartition (alternatively, you can explicitly specify the Same partitioning method).
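As a rough illustration of why key partitioning and sorting matter, the following Python sketch hash-partitions rows on their key columns and then sorts each partition. The function name partition_and_sort and the use of Python's built-in hash are assumptions made for illustration; this is not how DataStage implements partitioning internally.

```python
def partition_and_sort(rows, key_columns, num_partitions):
    """Hash-partition rows on the key columns, then sort each partition.

    After this step, every row with a given key lives in exactly one
    partition, and rows with equal keys are adjacent within it.
    """
    def key_of(row):
        return tuple(row[k] for k in key_columns)

    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Equal keys always hash to the same partition index.
        partitions[hash(key_of(row)) % num_partitions].append(row)

    for part in partitions:
        part.sort(key=key_of)
    return partitions


sample = [{"keycol": i % 5, "value": i} for i in range(20)]
parts = partition_and_sort(sample, ["keycol"], num_partitions=3)
for index, part in enumerate(parts):
    print(index, [r["keycol"] for r in part])
```

Each partition can then be processed independently, since no key is split across partitions.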

The stage editor has three tabs:

Stage tab. This tab is always present and is used to specify general information about the stage.
Input tab. This tab is where you specify the details about the single input data set whose records are to be combined.
Output tab. This tab is where you specify details about the processed data being output from the stage.