Remove Duplicates stage: Stage tab (DataStage)

You can specify aspects of the Remove Duplicates stage by double-clicking the stage and updating settings on the Stage tab.

Double-click the Remove Duplicates stage to open the stage editor. On the Stage tab, the Properties section lets you specify what the stage does. The Advanced section allows you to specify how the stage executes.

Properties

You can specify the following properties:
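Table 1. Properties

| Category/Property                           | Values       | Default | Mandatory? | Repeats? | Dependent of |
|---------------------------------------------|--------------|---------|------------|----------|--------------|
| Keys that Define Duplicates/Key             | Input Column | N/A     | Y          | Y        | N/A          |
| Keys that Define Duplicates/Case sensitive  | True/False   | True    | N          | N        | Key          |
| Keys that Define Duplicates/Sort as EBCDIC  | True/False   | False   | N          | N        | Key          |
| Options/Duplicate to retain                 | First/Last   | First   | Y          | N        | N/A          |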

Key

Specifies the key column for the operation. This property can be repeated to specify multiple key columns. You can use the Column Selection dialog box to select several keys at once if required. Key has dependent properties as follows:

  • Case Sensitive

    Specifies whether each key is case sensitive. This is set to True by default, so the values "CASE" and "case" are not judged equivalent.

  • Sort as EBCDIC

    To sort as in the EBCDIC character set, choose True.
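
As a rough illustration of these two dependent properties, the Python sketch below compares key values with and without case sensitivity and orders values by their EBCDIC byte values. The function names `keys_match` and `sort_as_ebcdic` are hypothetical and are not part of DataStage. The code page `cp037` is only one EBCDIC variant, picked for the example; which code page the stage uses is not stated here.

```python
def keys_match(a: str, b: str, case_sensitive: bool = True) -> bool:
    """Return True if two key values count as the same key."""
    if case_sensitive:
        return a == b
    # Case-insensitive comparison: fold both values before comparing.
    return a.casefold() == b.casefold()


def sort_as_ebcdic(values):
    """Sort strings by their byte values in EBCDIC code page 037 (an assumption)."""
    return sorted(values, key=lambda s: s.encode("cp037"))


print(keys_match("CASE", "case"))                        # False (the default)
print(keys_match("CASE", "case", case_sensitive=False))  # True
print(sorted(["1", "A", "a"]))                           # ['1', 'A', 'a'] in ASCII order
print(sort_as_ebcdic(["1", "A", "a"]))                   # ['a', 'A', '1'] in EBCDIC order
```

In EBCDIC, lowercase letters sort before uppercase letters and digits sort last, which is why the two orderings differ.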

Duplicate to retain

Specifies which of the duplicate rows to retain. Choose between First and Last. It is set to First by default.
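
The effect of First versus Last can be sketched in Python as follows. This is a simplified model and not the stage's implementation: the function `remove_duplicates` and the sample column names are hypothetical, rows are plain dictionaries, and rows are grouped by key value regardless of their order in the input.

```python
def remove_duplicates(rows, keys, retain="first"):
    """Keep one row per distinct combination of key column values."""
    if retain not in ("first", "last"):
        raise ValueError("retain must be 'first' or 'last'")
    kept = {}
    for row in rows:
        key = tuple(row[column] for column in keys)
        # "first": keep the row already stored; "last": overwrite with the newer row.
        if retain == "last" or key not in kept:
            kept[key] = row
    return list(kept.values())


rows = [
    {"cust": "A", "region": "EU", "amount": 10},
    {"cust": "B", "region": "EU", "amount": 20},
    {"cust": "A", "region": "EU", "amount": 30},
]

print(remove_duplicates(rows, ["cust", "region"], retain="first"))
# [{'cust': 'A', 'region': 'EU', 'amount': 10}, {'cust': 'B', 'region': 'EU', 'amount': 20}]
print(remove_duplicates(rows, ["cust", "region"], retain="last"))
# [{'cust': 'A', 'region': 'EU', 'amount': 30}, {'cust': 'B', 'region': 'EU', 'amount': 20}]
```

Passing several column names in `keys` mirrors repeating the Key property to define duplicates on more than one column.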

Advanced

This section allows you to specify the following:

  • Execution mode. The stage can execute in parallel mode or sequential mode. In parallel mode, the input data is processed by the available nodes as specified in the configuration file, and by any node constraints specified on the Advanced section. In sequential mode, the entire data set is processed by the conductor node.
  • Combinability mode. This is Auto by default, which allows IBM® DataStage® to combine the operators that underlie parallel stages so that they run in the same process if it is sensible for this type of stage.
  • Preserve partitioning. This is Set by default. You can explicitly select Set or Clear. Select Set to request that the next stage attempt to maintain the partitioning.
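
To show why partitioning matters when duplicates are removed in parallel mode, the following Python sketch compares a sequential run with a simulated parallel run. It is a conceptual model only: the helper names, the CRC-based hash, and the node count of 3 are assumptions for illustration and do not describe how DataStage partitions data internally. The point is that rows with the same key need to reach the same node for the parallel result to match the sequential one.

```python
from zlib import crc32


def partition_by_key(rows, key, nodes):
    """Assign each row to a partition based on a hash of its key value."""
    partitions = [[] for _ in range(nodes)]
    for row in rows:
        index = crc32(str(row[key]).encode("utf-8")) % nodes
        partitions[index].append(row)
    return partitions


def dedup_first(rows, key):
    """Keep the first row seen for each key value."""
    kept = {}
    for row in rows:
        kept.setdefault(row[key], row)
    return list(kept.values())


rows = [
    {"id": "x", "v": 1},
    {"id": "y", "v": 2},
    {"id": "x", "v": 3},
    {"id": "z", "v": 4},
    {"id": "y", "v": 5},
]

# Sequential mode: one process sees the entire data set.
sequential = dedup_first(rows, "id")

# Simulated parallel mode: each partition is de-duplicated independently.
parallel = [
    row
    for part in partition_by_key(rows, "id", nodes=3)
    for row in dedup_first(part, "id")
]

# Both runs retain the same rows; only the output order may differ.
print(sorted(parallel, key=lambda r: r["id"]) == sorted(sequential, key=lambda r: r["id"]))
```

If rows were instead spread across nodes without regard to the key, two rows with the same key could land on different nodes and both would survive.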