Last updated: Nov 07, 2024
Use the Partitioning section in DataStage® stages or connectors that have Input tabs to specify details about how the stage or connector partitions or collects data on the current link before it processes the data or writes it to a data target.
Data partitioning is an approach to parallelism that involves breaking the record set into partitions, or subsets of records. If no resource constraints or data skew issues exist, data partitioning can provide linear increases in application performance. DataStage automatically partitions data based on the type of partition that the stage requires.
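The idea of splitting a record set into partitions can be illustrated with a minimal Python sketch of key-based (hash) partitioning. This is only an illustration of the concept, not how DataStage implements partitioning internally. The function name `hash_partition`, the `customer_id` column, and the partition count of 4 are hypothetical and chosen for the example.

```python
import zlib


def hash_partition(records, key, num_partitions):
    """Assign each record to a partition by hashing its key column.

    Illustrative only: the names and the hashing scheme are assumptions,
    not DataStage internals.
    """
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        # crc32 is deterministic across runs, unlike Python's built-in
        # hash() for strings, so the same key always maps to the same index.
        index = zlib.crc32(str(record[key]).encode("utf-8")) % num_partitions
        partitions[index].append(record)
    return partitions


# Hypothetical sample data: 20 records spread over 5 customer IDs.
records = [{"customer_id": i % 5, "amount": i * 10} for i in range(20)]
partitions = hash_partition(records, "customer_id", 4)

for number, partition in enumerate(partitions):
    print(number, len(partition))
```

Because every record with the same key value lands in the same partition, each partition can be processed independently. If a few key values account for most of the records, some partitions end up much larger than others, which is the kind of data skew that limits the performance gain.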
You can also use the Partitioning section to sort data that is arriving on the input link before the data is processed or written to the data target. The availability of sorting depends on the partitioning or collecting method that is chosen. It is not available with the Auto methods. The Partitioning section provides basic sorting facilities. For a more complex sort operation, use the Sort stage.
The Partitioning section contains the following controls and fields:
- Partitioning
- Choose the partitioning type from the list.
- Collecting
- Choose the collecting type from the list.
- Sorting
- Use these controls to specify how to sort the data. Data is always sorted within data partitions. If the stage is partitioning incoming data, the data is sorted after the partitioning. If the stage is collecting incoming data, the data is sorted before the collection.
- Sort
- Select Perform sort to sort data that comes in on the link.
- Stable
- Select Stable if you want to preserve previously sorted data sets. Stable is set by default.
- Unique
- Select Unique if you want to retain only one record per sorting key value. If multiple records have identical sorting key values, all but one are discarded. If stable sort is also set, the first record with the sorting key value is the record that is retained.
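The combined effect of the Sort, Stable, and Unique options on a single partition can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not DataStage code: the function `sort_partition`, the `id` key, and the sample records are hypothetical. It relies on the fact that Python's `sorted()` is a stable sort.

```python
from operator import itemgetter


def sort_partition(partition, key, unique=False):
    """Sort one partition by key; optionally keep one record per key value.

    Illustrative only: models the Stable and Unique behavior described
    above, not the DataStage implementation.
    """
    # sorted() is stable: records with equal keys keep their arrival order.
    ordered = sorted(partition, key=itemgetter(key))
    if not unique:
        return ordered
    kept = []
    for record in ordered:
        # After a stable sort, the first record seen for each key value is
        # the earliest arrival, so that is the one that is retained.
        if not kept or kept[-1][key] != record[key]:
            kept.append(record)
    return kept


# Hypothetical records arriving in one partition.
partition = [
    {"id": 2, "note": "first 2"},
    {"id": 1, "note": "first 1"},
    {"id": 2, "note": "second 2"},
    {"id": 1, "note": "second 1"},
]

sorted_stable = sort_partition(partition, "id")
sorted_unique = sort_partition(partition, "id", unique=True)

print([r["note"] for r in sorted_stable])
print([r["note"] for r in sorted_unique])
```

In this sketch, the stable sort keeps "first 1" ahead of "second 1" because that was their arrival order, and the unique pass then retains only the first record for each `id` value.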
Dynamically generated configuration files in DataStage-SaaS
DataStage-SaaS does not support user-generated configuration files. You can provide the number of partitions for dynamically generated configuration files by setting the partition count in the runtime environment or by setting the environment variable APT_WLM_PARTITION_COUNT for the number of partitions.