One of the most powerful features in watsonx.ai is the ability to modify data values and derive
new fields from existing data. During lengthy data mining projects, it is common to perform several
derivations, such as extracting a customer ID from a string of Web log data or creating a customer
lifetime value based on transaction and demographic data. All of these transformations can be
performed using a variety of field operations nodes.
Several nodes provide the ability to derive new fields:
The Derive node modifies data values or creates new
fields from one or more existing fields. It creates fields of type formula, flag, nominal, state,
count, and conditional.
The Reclassify node transforms one set of categorical
values to another. Reclassification is useful for collapsing categories or regrouping data for
analysis.
The Binning node automatically creates new nominal
(set) fields based on the values of one or more existing continuous (numeric range) fields. For
example, you can transform a continuous income field into a new categorical field containing groups
of income as deviations from the mean. After you create bins for the new field, you can generate a
Derive node based on the cut points.
The Set to Flag node derives multiple flag fields
based on the categorical values defined for one or more nominal fields.
The Restructure node converts a nominal or flag field
into a group of fields that can be populated with the values of yet another field. For example,
given a field named payment type, with values of credit, cash, and debit, three new fields
would be created (credit, cash, debit), each of which might contain the value of the actual
payment made.
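To make the node descriptions above more concrete, here is a minimal Python sketch of the ideas behind the Binning, Set to Flag, and Restructure nodes. It is not how the nodes are implemented. The records, the field names, the one-standard-deviation bands, the output field names, and the use of None for non-matching categories are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

# Hypothetical records; the field names and values are illustrative only.
records = [
    {"payment_type": "credit", "amount": 120.0, "income": 20000},
    {"payment_type": "cash", "amount": 15.5, "income": 40000},
    {"payment_type": "debit", "amount": 42.0, "income": 60000},
]
categories = ["credit", "cash", "debit"]


def bin_by_deviation(values):
    """Binning idea: label each value by its distance from the mean.

    Assumes three bands split at one population standard deviation.
    """
    mu, sd = mean(values), pstdev(values)
    labels = []
    for v in values:
        if v < mu - sd:
            labels.append("below -1 SD")
        elif v > mu + sd:
            labels.append("above +1 SD")
        else:
            labels.append("within 1 SD")
    return labels


def set_to_flag(row, field, values):
    """Set to Flag idea: one true/false field per category value."""
    return {f"{field}_{v}": row[field] == v for v in values}


def restructure(row, field, values, source):
    """Restructure idea: one field per category, filled from another field.

    Assumes non-matching categories are left empty (None).
    """
    return {v: (row[source] if row[field] == v else None) for v in values}


income_bins = bin_by_deviation([r["income"] for r in records])
flags = [set_to_flag(r, "payment_type", categories) for r in records]
restructured = [
    restructure(r, "payment_type", categories, "amount") for r in records
]
```

In this sketch, the first record yields a true flag only for credit, and its restructured credit field holds the payment amount while the cash and debit fields stay empty.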
Using the Derive node
Using the Derive node, you can create six types of new fields from one or
more existing fields:
Formula. The new field is the result of an arbitrary
CLEM expression.
Flag. The new field is a flag, representing a
specified condition.
Nominal. The new field is nominal, meaning that its
members are a group of specified values.
State. The new field is one of two states. Switching
between these states is triggered by a specified condition.
Count. The new field is based on the number of times
that a condition has been true.
Conditional. The new field is the value of one of two
expressions, depending on the value of a condition.
Each of these derive types has its own set of options. These options are
discussed in subsequent topics.
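The six derive types can be compared side by side with a small Python sketch. This is an illustration of the concepts only, not CLEM and not the Derive node's implementation. The amount field, the thresholds, the on/off state names, and the initial values are hypothetical. Note that the State and Count results carry over from one row to the next, so they depend on the order in which rows are processed.

```python
# Hypothetical input: one numeric field per row, processed in order.
rows = [{"amount": a} for a in (30, 120, 80, 40, 150)]


def derive_all(rows):
    """Compute one example field of each derive type for every row."""
    state = "off"  # assumed initial state for the State example
    count = 0      # assumed initial value for the Count example
    derived = []
    for row in rows:
        amount = row["amount"]

        # State: switch "on" above 100 and "off" below 50; otherwise keep
        # the state from the previous row.
        if amount > 100:
            state = "on"
        elif amount < 50:
            state = "off"

        # Count: number of rows so far where the condition has been true.
        if amount > 100:
            count += 1

        # Nominal: one of a group of specified values.
        if amount < 50:
            band = "low"
        elif amount <= 100:
            band = "medium"
        else:
            band = "high"

        derived.append({
            "formula": amount + 5,            # result of an expression
            "flag": amount > 100,             # true/false for a condition
            "nominal": band,
            "state": state,
            "count": count,
            # Conditional: one of two expressions, chosen by a condition.
            "conditional": amount - 10 if amount > 100 else amount,
        })
    return derived


result = derive_all(rows)
states = [r["state"] for r in result]
counts = [r["count"] for r in result]
```

For the third row (amount 80), neither switching condition holds, so the state stays on and the count stays at 1. Reordering the rows would change both values, while the formula, flag, nominal, and conditional fields would be unaffected.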
Note that use of the following may change row order:
Executing in a database via SQL pushback
Executing via remote Analytic Server
Using functions that run in embedded Analytic Server
Deriving a list
Calling spatial functions
Tip: The Control Language for Expression Manipulation (CLEM) is a powerful tool you can
use to analyze and manipulate the data used in your flows. For example, you might use CLEM in a node
to derive values. For more information, see the CLEM (legacy) language reference.