Match Frequency stage

The Match Frequency stage generates the frequency distribution of values for columns in the input data. You use the frequency distribution and the input data in match jobs.

Match Specification. Optional. Click Browse to locate the match specification that you intend to use for matching with this input data. Selecting a match specification improves performance by restricting the generated frequency information to the fields that the match specification uses for matching.

Do Not Use Match Specification check box. Select to generate frequency information for all fields in the input data.
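
The difference between the two options can be illustrated with a minimal Python sketch. This is not how the stage is implemented; the function name column_frequencies, the match_columns parameter, and the sample rows are hypothetical and stand in for the fields that a match specification would name.

```python
from collections import Counter

def column_frequencies(records, match_columns=None):
    """Count how often each value occurs in each column.

    records: list of dicts, one per input row (illustrative format only).
    match_columns: hypothetical stand-in for the fields used by a match
    specification. None mimics "Do Not Use Match Specification".
    """
    if match_columns is None:
        # No specification: analyze every column seen, in first-seen order.
        columns = list(dict.fromkeys(col for row in records for col in row))
    else:
        # Specification selected: analyze only the fields used for matching.
        columns = list(match_columns)
    return {
        col: Counter(row[col] for row in records if col in row)
        for col in columns
    }

rows = [
    {"first_name": "ANN", "city": "BOSTON", "notes": "a"},
    {"first_name": "ANN", "city": "AUSTIN", "notes": "b"},
    {"first_name": "BOB", "city": "BOSTON", "notes": "c"},
]

all_freqs = column_frequencies(rows)
restricted = column_frequencies(rows, match_columns=["first_name", "city"])
```

In this sketch, restricted skips the notes column entirely, which is the kind of saving that selecting a match specification is described as providing.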

Link Type area. The following selections are available if you select a match specification for a two-source match:
  • Data. Select if the input columns comprise the data source for the two-source match.
  • Reference. Select if the input columns comprise the reference source for the two-source match.

Maximum Frequency Entry. Optional. To raise the maximum number of frequencies included in the output above the default of 100, enter a value from 101 to 100,000. The Match Designer prohibits fewer than 100 frequencies.

By default, DataStage® includes up to 100 entries in a frequency file. For each column that requires frequency analysis, the 100 most frequently occurring values are included in the output. Consider increasing this number if you process large numbers of records.
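
The effect of the maximum can be sketched in Python as follows. This is an illustration only, assuming that the cap keeps the most frequent values for a column; the function name top_entries and the sample data are hypothetical, and the stage's actual tie-breaking and output format are not shown here.

```python
from collections import Counter

DEFAULT_MAX_ENTRIES = 100  # the default described above

def top_entries(values, max_entries=DEFAULT_MAX_ENTRIES):
    """Return up to max_entries (value, count) pairs, most frequent first."""
    return Counter(values).most_common(max_entries)

# Hypothetical column values; a small cap is used so the effect is visible.
surnames = ["SMITH"] * 5 + ["JONES"] * 3 + ["LEE"] * 2 + ["ODD"]
capped = top_entries(surnames, max_entries=2)
```

With a cap of 2, only SMITH and JONES are kept and the less frequent values are dropped. A column with many distinct values loses more of its distribution under a low cap, which is why a higher maximum can help with large inputs.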

Save. Click to close the stage and save changes.