Aggregator stage: Calculation and recalculation dependent properties (DataStage®)
Some properties are dependents of both Column for Calculation and Summary Column for Recalculation.
These specify the various aggregate functions and the output columns to carry the results.
- Corrected Sum of Squares
Produces a corrected sum of squares for data in the aggregate column and outputs it to the specified output column.
- Maximum Value
Gives the maximum value in the aggregate column and outputs it to the specified output column.
- Mean Value
Gives the mean value in the aggregate column and outputs it to the specified output column.
- Minimum Value
Gives the minimum value in the aggregate column and outputs it to the specified output column.
- Missing Value
This specifies what constitutes a "missing" value, for example -1 or NULL. Enter the value as a floating point number. Not available for Summary Column for Recalculation.
- Missing Values Count
Counts the number of aggregate columns with missing values in them and outputs the count to the specified output column. Not available for Summary Column for Recalculation.
- Non-missing Values Count
Counts the number of aggregate columns with values in them and outputs the count to the specified output column.
- Percent Coefficient of Variation
Calculates the percent coefficient of variation for the aggregate column and outputs it to the specified output column.
- Range
Calculates the range of values in the aggregate column and outputs it to the specified output column.
- Standard Deviation
Calculates the standard deviation of values in the aggregate column and outputs it to the specified output column.
- Standard Error
Calculates the standard error of values in the aggregate column and outputs it to the specified output column.
- Sum of Weights
Calculates the sum of values in the weight column specified by the Weight column property and outputs it to the specified output column.
- Sum
Sums the values in the aggregate column and outputs the sum to the specified output column.
- Summary
Specifies a subrecord to write the results of the calculate or recalculate operation to.
- Uncorrected Sum of Squares
Produces an uncorrected sum of squares for data in the aggregate column and outputs it to the specified output column.
- Variance
Calculates the variance for the aggregate column and outputs it to the specified output column. This has a dependent property:
- Variance divisor
Specifies the variance divisor. By default, IBM DataStage divides by the number of records in the group minus the number of records with missing values minus 1 to calculate the variance. This corresponds to a vardiv setting of Default. If you specify NRecs, IBM DataStage instead divides by the number of records in the group minus the number of records with missing values.
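The effect of the two variance divisor settings can be sketched in a few lines of Python. This is an illustration only, not DataStage code: the function names `sums_of_squares` and `variance` and the `missing` parameter are hypothetical, and the corrected and uncorrected sums of squares use the conventional textbook definitions (squared deviations from the mean, and squared raw values), which are assumed here rather than confirmed by the product documentation.

```python
def sums_of_squares(values, missing=None):
    """Return (uncorrected, corrected, n) for the non-missing values.

    Hypothetical helper: `missing` plays the role of the Missing Value
    property, so any value equal to it (or None) is skipped.
    """
    present = [v for v in values if v is not None and v != missing]
    n = len(present)
    mean = sum(present) / n
    uncorrected = sum(v * v for v in present)          # sum of x^2
    corrected = sum((v - mean) ** 2 for v in present)  # sum of (x - mean)^2
    return uncorrected, corrected, n


def variance(values, missing=None, divisor="Default"):
    """Variance under the two divisor settings described above.

    "Default": records - missing records - 1
    "NRecs":   records - missing records
    """
    _, corrected, n = sums_of_squares(values, missing)
    d = n - 1 if divisor == "Default" else n
    return corrected / d


group = [2, 4, 4, 4, 5, 5, 7, 9, -1]  # -1 is treated as the missing value
print(variance(group, missing=-1, divisor="Default"))
print(variance(group, missing=-1, divisor="NRecs"))
```

With eight non-missing values whose corrected sum of squares is 32, the Default setting divides by 7 and NRecs divides by 8, so NRecs always yields the smaller figure for the same group.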
Each of these properties has a dependent property as follows:
- Decimal Output
By default all calculation or recalculation columns have an output type of double. This property allows you to specify that columns have an output type of decimal.
When you specify the decimal output, you can also specify precision and scale. Precision is the number of digits in a number. Scale is the number of digits to the right of the decimal point in a number. The default is 8,2.
In cases where the required output scale is low, set the precision and scale to p+4, s+4 to get accurate results. If a column has a precision and scale of 4,1, then in the decimal data type, set the precision and scale to 9,5.
For example, a column has the values:
" 004.0"," 010.0"," 004.0"," 006.0"," 010.0"," 008.0"," 009.0"," 007.0"," 010.0"," 007.0"," 010.0"," 007.0"," 010.0"
The precision value for the column is 4 and the scale value is 1. The output is calculated as 7.8 if the precision and scale are set to 9,5. But if the precision and scale are set to 4,1, the output is 7.9. The more accurate calculation is 7.8.
You can use decimal type for intermediate calculations of the different reduce options. The decimal precision and scale should be set large enough to avoid rounding of intermediate calculations. For example, if you are calculating the mean value of a decimal of size precision 8 and scale 2, then the intermediate decimal size should be set to at least precision 10 and scale 4.
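The 7.8 versus 7.9 difference above can be reproduced with a small Python sketch built on the standard `decimal` module. The rounding model is an assumption chosen for illustration: each value is divided by the record count, the quotient is rounded to the intermediate scale, and the rounded quotients are summed. This is not a description of how DataStage computes the mean internally, and the function name `mean_with_intermediate_scale` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# The thirteen column values from the example above.
VALUES = ["4.0", "10.0", "4.0", "6.0", "10.0", "8.0", "9.0",
          "7.0", "10.0", "7.0", "10.0", "7.0", "10.0"]


def mean_with_intermediate_scale(values, scale, output_scale=1):
    """Mean where every intermediate quotient is rounded to `scale` digits.

    Illustrative model only: it shows how a low intermediate scale lets
    rounding error accumulate before the final result is produced.
    """
    step = Decimal(1).scaleb(-scale)          # scale=1 gives a 0.1 step
    n = Decimal(len(values))
    total = Decimal(0)
    for v in values:
        # Round each partial result to the intermediate scale.
        total += (Decimal(v) / n).quantize(step, rounding=ROUND_HALF_UP)
    # Round the accumulated total to the output scale.
    return total.quantize(Decimal(1).scaleb(-output_scale),
                          rounding=ROUND_HALF_UP)


print(mean_with_intermediate_scale(VALUES, scale=1))  # 7.9
print(mean_with_intermediate_scale(VALUES, scale=5))  # 7.8
```

The exact mean is 102/13, approximately 7.846, so the result obtained with the larger intermediate scale is the one that agrees with the true value at one decimal place.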