Sample stage
The Sample stage samples an input data set.
The Sample stage can have a single input link and any number of output links when operating in Percent mode, or a single input link and a single output link when operating in Period mode. It is one of several stages that IBM DataStage provides to help you sample data. See also:
- Head stage
- Tail stage
- Peek stage
The Sample stage is a debug stage. It operates in two modes. In Percent mode, it extracts rows, selecting them by means of a random number generator, and writes a given percentage of these to each output data set. You specify the number of output data sets, the percentage written to each, and a seed value to start the random number generator. You can reproduce a given distribution by reusing the same number of outputs, the same percentages, and the same seed value.
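The idea behind Percent mode can be illustrated with a minimal Python sketch. This is not the stage's actual implementation. The function name `percent_sample`, its parameters, and the rule that each row goes to at most one output (so the percentages total no more than 100) are assumptions made for illustration only.

```python
import random


def percent_sample(rows, percents, seed):
    """Illustrative only: route each row to at most one output list.

    percents[i] is the approximate share of input rows written to
    output i. Rows that fall outside every share are dropped.
    """
    if sum(percents) > 100:
        raise ValueError("percentages must not total more than 100")

    rng = random.Random(seed)            # seeded, so runs are repeatable
    outputs = [[] for _ in percents]

    for row in rows:
        draw = rng.random() * 100        # uniform value in [0, 100)
        upper = 0.0
        for index, pct in enumerate(percents):
            upper += pct                 # cumulative upper bound of this share
            if draw < upper:
                outputs[index].append(row)
                break                    # a row is written to one output only
    return outputs


first = percent_sample(range(1000), [10, 20], seed=42)
second = percent_sample(range(1000), [10, 20], seed=42)
print(first == second)                   # same outputs, percentages, and seed
```

Because the generator is seeded, calling the function again with the same number of outputs, percentages, and seed yields the same selection, which mirrors the reproducibility described above.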
In Period mode, it extracts every Nth row from each partition, where N is the period, which you supply. In this mode all sampled rows are written to a single data set, so the stage can have only a single output link.
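Period mode can be sketched in Python as follows. This is an illustration under stated assumptions, not the stage's implementation: the function name `period_sample` is hypothetical, partitions are modeled as plain lists, and the sketch assumes the selected rows are the Nth, 2Nth, 3Nth, and so on within each partition (the stage may start counting at a different row).

```python
def period_sample(partitions, period):
    """Illustrative only: take every Nth row of each partition.

    partitions is a list of lists, one inner list per partition.
    All selected rows go to a single output list.
    """
    if period < 1:
        raise ValueError("period must be at least 1")

    output = []
    for partition in partitions:
        # Rows at positions N, 2N, 3N, ... (1-based) within this partition.
        output.extend(partition[period - 1::period])
    return output


partitions = [[1, 2, 3, 4, 5, 6], [10, 20, 30]]
print(period_sample(partitions, 3))      # [3, 6, 30]
```

Note that the period is applied to each partition independently, so the result depends on how the input data is partitioned.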
For both modes you can specify the maximum number of rows that you want to sample from each partition.
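The per-partition limit can be pictured with a short Python sketch. The function name `cap_per_partition` is hypothetical, and the sketch assumes the limit is applied to the rows already selected from each partition, keeping the first ones up to the maximum. The stage's exact behavior may differ.

```python
from itertools import islice


def cap_per_partition(sampled_partitions, max_rows):
    """Illustrative only: keep at most max_rows sampled rows per partition.

    sampled_partitions is a list of iterables, each holding the rows
    already selected from one partition.
    """
    if max_rows < 0:
        raise ValueError("max_rows must not be negative")

    # islice stops reading a partition once max_rows rows have been taken.
    return [list(islice(rows, max_rows)) for rows in sampled_partitions]


print(cap_per_partition([[3, 6, 9], [30]], 2))   # [[3, 6], [30]]
```

The limit is counted separately for each partition, so the total number of rows sampled can be up to the maximum multiplied by the number of partitions.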