Last updated: Feb 11, 2025
You can use Select nodes to select or discard a subset of records from the data stream based on a specific condition, such as BP (blood pressure) = "HIGH".
Mode. Specifies whether records that meet the condition will be included or excluded from the data stream.
- Include. Select to include records that meet the selection condition.
- Discard. Select to exclude records that meet the selection condition.
Condition. Displays the selection condition that will be used to test each record, which you specify using a CLEM expression. Either enter an expression in the window or use the Expression Builder by clicking the calculator (Expression Builder) button.
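The two Mode settings can be pictured as a filter over the records in the stream. The following is a minimal Python sketch of that behavior, not Modeler code: the select helper, the record layout, and the is_high function (which stands in for the CLEM expression BP = "HIGH") are all hypothetical names introduced for illustration, and null handling is deliberately left out.

```python
from typing import Callable, Iterable, Iterator

Record = dict  # one row of the data stream, keyed by field name


def select(records: Iterable[Record],
           condition: Callable[[Record], bool],
           mode: str = "include") -> Iterator[Record]:
    """Hypothetical helper that mimics the Include and Discard modes."""
    if mode not in ("include", "discard"):
        raise ValueError(f"unknown mode: {mode!r}")
    for record in records:
        met = bool(condition(record))
        # Include passes records that meet the condition;
        # Discard passes the records that do not.
        if met == (mode == "include"):
            yield record


stream = [
    {"id": 1, "BP": "HIGH"},
    {"id": 2, "BP": "NORMAL"},
    {"id": 3, "BP": "HIGH"},
]


def is_high(record: Record) -> bool:
    # Stands in for the CLEM expression BP = "HIGH" (assumption).
    return record["BP"] == "HIGH"


kept_by_include = [r["id"] for r in select(stream, is_high, "include")]
kept_by_discard = [r["id"] for r in select(stream, is_high, "discard")]
```

With the same condition, the two modes split the stream into complementary subsets: Include keeps the HIGH records and Discard keeps the rest.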
If you choose to discard records based on a condition, such as the following:
(var1='value1' and var2='value2')
the Select node by default also discards records having null values for all selection fields. To avoid this, append the following condition to the original one:
and not(@NULL(var1) and @NULL(var2))
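To see which records the appended clause protects, it can help to tabulate the clause by itself. The sketch below is plain Python, not CLEM: is_null is a hypothetical stand-in for @NULL, None is assumed to represent a null value, and the sketch does not attempt to model how Modeler evaluates the original comparison against nulls.

```python
def is_null(value) -> bool:
    """Stand-in for CLEM's @NULL; None represents a null here (assumption)."""
    return value is None


def guard(var1, var2) -> bool:
    """Mirror of the appended clause: not(@NULL(var1) and @NULL(var2))."""
    return not (is_null(var1) and is_null(var2))


cases = {
    "both present": ("value1", "value2"),
    "var1 null": (None, "value2"),
    "var2 null": ("value1", None),
    "both null": (None, None),
}

# Evaluate the guard clause for each combination of null and non-null fields.
guard_results = {name: guard(*values) for name, values in cases.items()}
```

The clause is false only when every selection field is null. Because it is joined to the original condition with and, the combined discard condition cannot be met for those records, so they stay in the stream.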
Select nodes can also be used to choose a proportion of records. Typically, you would use a different node, the Sample node, for this operation. However, if the condition you want to specify is more complex than the parameters of the Sample node allow, you can create your own condition using the Select node. For example, you can create a condition such as:
BP = "HIGH" and random(10) <= 4
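Because random(10) returns a whole number from 1 to 10, the test random(10) <= 4 succeeds for about 4 records in 10. In Include mode, this condition therefore selects approximately 40% of the records showing high blood pressure and passes those records downstream for further analysis.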
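The proportion can be checked with a quick simulation. This is a rough Python sketch rather than a Modeler stream: rng.randint(1, 10) is assumed to behave like the CLEM call random(10) with an integer argument, and the record layout and the select_high_bp_sample function are hypothetical.

```python
import random


def select_high_bp_sample(records, rng):
    """Mirror of the condition BP = "HIGH" and random(10) <= 4, in Include mode."""
    # rng.randint(1, 10) stands in for CLEM random(10):
    # a uniformly distributed integer from 1 to 10 (assumption).
    return [
        r for r in records
        if r["BP"] == "HIGH" and rng.randint(1, 10) <= 4
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic stream: half of the records have high blood pressure.
records = [{"BP": "HIGH" if i % 2 == 0 else "NORMAL"} for i in range(20_000)]

selected = select_high_bp_sample(records, rng)
high_total = sum(1 for r in records if r["BP"] == "HIGH")

# Share of the high-BP records that passed the random test.
fraction = len(selected) / high_total
print(f"{len(selected)} of {high_total} high-BP records selected ({fraction:.1%})")
```

The selected share varies from run to run around 40%, which is why the proportion is described as approximate. When an exact count or percentage is required, the Sample node is the better fit.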