Investigate stage in DataStage: Stage tab

Investigate stage: Stage tab

You can specify aspects of the Investigate stage by double-clicking the stage and updating settings on the Stage tab.

The Properties section lets you specify what the stage does. The Advanced section allows you to specify how the stage executes.

Properties

Use the Properties section to define what the stage actually does.
Alternate locale
Optional. Lets you specify the international locale you want to use on the server to process the data.

This value needs to be set only if you are processing data for a language that is not the default language of the server. For example, set it when the default language for your server is French and the data to be processed is Italian.

When you change the locale, InfoSphere QualityStage uses the appropriate collating sequence and decimal separators for the alternate language. The value required depends on the type of server and how it is configured.

If you are using a UNIX server, enter the following command to obtain a list of locales supported by your server:

locale -a

If you are using a Windows workstation, open your InfoSphere QualityStage server directory and then the locale subdirectory. The locale subdirectory contains folders that are listed alphabetically by the languages they support.

Investigation

Investigation type

Character. A character investigation analyzes and classifies data, parsing it into a single-pattern report.

Column Investigation selection
Click Edit to apply a column mask. You use column masks to choose which characters are included in the frequency count or pattern analysis and which characters are displayed as part of the samples in the pattern report.
Frequency cutoff
Patterns with a frequency lower than this number do not appear in the pattern or token reports. If desired, enter a higher number. For example, if you enter 4, any pattern that occurs three times or fewer does not appear in the report.
Number of samples
If desired, increase the number of samples that appear for each pattern in the pattern report. The default is 1.
Comparison Mode: Concatenate
Performs cross-column correlations between multiple columns to determine relationships. You can choose two noncontiguous columns from anywhere in the record to be investigated as a single data column.
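The effect of Frequency cutoff and Number of samples can be illustrated with a short Python sketch. This is not how InfoSphere QualityStage is implemented; the function names, the pattern symbols (a for letters, n for digits, b for blanks), and the default values used for the parameters are assumptions chosen only for illustration.

```python
from collections import Counter, defaultdict


def classify(ch):
    """Map one character to a pattern symbol (illustrative symbols only)."""
    if ch.isalpha():
        return "a"
    if ch.isdigit():
        return "n"
    if ch.isspace():
        return "b"
    return ch  # punctuation and other characters are kept as-is


def pattern_report(values, frequency_cutoff=1, number_of_samples=1):
    """Build a toy pattern report: (pattern, count, samples) per pattern.

    Patterns that occur fewer than frequency_cutoff times are dropped,
    and at most number_of_samples example values are kept per pattern.
    """
    counts = Counter()
    samples = defaultdict(list)
    for value in values:
        pattern = "".join(classify(ch) for ch in value)
        counts[pattern] += 1
        if len(samples[pattern]) < number_of_samples:
            samples[pattern].append(value)
    return [
        (pattern, count, samples[pattern])
        for pattern, count in counts.most_common()
        if count >= frequency_cutoff
    ]


# Hypothetical postal-code column used only for this example.
codes = ["12345", "90210", "A1B2C", "60614", "AB-12", "02134"]
report = pattern_report(codes, frequency_cutoff=4, number_of_samples=2)
```

With a cutoff of 4, only the all-digit pattern survives in this sample data because it occurs four times; the two patterns that occur once are left out of the report.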

Advanced

This section lets you specify the following:
Execution mode
The stage can execute in parallel mode or sequential mode. In parallel mode, the input data is processed by the available nodes as specified in the configuration file and by any node constraints specified on the Advanced tab. In sequential mode, the entire data set is processed by the conductor node.
Combinability mode
This is Auto by default, which allows IBM® DataStage® to combine the operators that underlie parallel stages so that they run in the same process if it is sensible for this type of stage.
Preserve partitioning
This is Propagate by default, which adopts the Set or Clear setting from the previous stage. You can also explicitly select Set or Clear. Select Set to request that the next stage in the job attempt to maintain the partitioning.
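The way Propagate inherits Set or Clear from the previous stage can be sketched as a simple pass over a chain of stages. This is a hedged illustration rather than DataStage's actual logic: the function name is hypothetical, and the value assumed for a first stage that has nothing to inherit from (Clear) is an assumption not stated in this documentation.

```python
def effective_preserve_flags(stage_settings, initial="Clear"):
    """Resolve the partitioning flag for each stage in a linear job.

    stage_settings: list of "Propagate", "Set", or "Clear", in job order.
    initial: flag assumed before the first stage (an assumption here).
    """
    resolved = []
    previous = initial
    for setting in stage_settings:
        # Propagate adopts whatever the previous stage resolved to;
        # Set or Clear overrides it explicitly.
        current = previous if setting == "Propagate" else setting
        if current not in ("Set", "Clear"):
            raise ValueError(f"unknown setting: {setting!r}")
        resolved.append(current)
        previous = current
    return resolved


flags = effective_preserve_flags(["Set", "Propagate", "Clear", "Propagate"])
```

In this sketch, the second stage inherits Set from the first, and the fourth inherits Clear from the third.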