Transform node

Normalizing input fields is an important step before using traditional scoring techniques such as regression, logistic regression, and discriminant analysis. These techniques carry assumptions about normally distributed data that may not hold for many raw data files. One approach to dealing with real-world data is to apply transformations that move a raw data element toward a more normal distribution. In addition, normalized fields can easily be compared with each other. For example, income and age are on totally different scales in a raw data file but, when normalized, the relative impact of each can be easily interpreted.

The Transform node provides an output viewer that enables you to perform a rapid visual assessment of the best transformation to use. You can see at a glance whether variables are normally distributed and, if necessary, choose the transformation you want and apply it. You can pick multiple fields and perform one transformation per field.
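The visual assessment described here can be approximated numerically outside the product. The following Python sketch is only an illustration, not the node's actual logic. It applies a few common candidate transformations to a skewed field and ranks them by how close the resulting skewness is to zero. The candidate list, the labels, the `rank_transformations` helper, and the synthetic income data are all assumptions made for this example. An exponential transformation is left out because it would overflow for values of this size.

```python
import math
import random


def skewness(values):
    """Population skewness (third standardized moment) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    return sum((v - mean) ** 3 for v in values) / n / var ** 1.5


# Hypothetical candidate set; the labels are illustrative only.
# All of these except "none" require strictly positive input values.
CANDIDATES = {
    "none": lambda x: x,
    "inverse": lambda x: 1.0 / x,
    "log_n": math.log,
    "log_10": math.log10,
    "sqrt": math.sqrt,
}


def rank_transformations(values):
    """Return (name, skewness) pairs, least skewed result first."""
    scores = {
        name: skewness([func(v) for v in values])
        for name, func in CANDIDATES.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]))


# Synthetic, right-skewed "income" field (log-normal), for illustration only.
random.seed(0)
income = [random.lognormvariate(10, 0.8) for _ in range(2000)]

ranking = rank_transformations(income)
best_name = ranking[0][0]
for name, skew in ranking:
    print(f"{name:8s} skewness = {skew:+.3f}")
```

Skewness is only one rough indicator of normality. A histogram, as shown in the node's output viewer, remains the more reliable check because it also reveals multiple peaks and outliers.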

After selecting the preferred transformations for the fields, you can generate Derive or Filler nodes that perform the transformations and attach these nodes to the stream. The Derive node creates new fields, while the Filler node transforms the existing ones.
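The difference between the two generated node types can be sketched in plain Python. This is a conceptual illustration, not generated node code. The `derive` and `fill` helpers, the sample records, and the `income_log10` field name are hypothetical.

```python
import math


def derive(records, field, func, new_field):
    """Derive-style: add the transformed value as a new field, keep the original."""
    return [{**row, new_field: func(row[field])} for row in records]


def fill(records, field, func):
    """Filler-style: replace the existing field's value with the transformed value."""
    return [{**row, field: func(row[field])} for row in records]


rows = [
    {"age": 30, "income": 100.0},
    {"age": 45, "income": 10000.0},
]

derived = derive(rows, "income", math.log10, "income_log10")
filled = fill(rows, "income", math.log10)

print(derived[0])  # has both "income" and "income_log10"
print(filled[0])   # "income" now holds the transformed value
```

The Derive-style result is usually the safer choice when downstream steps still need the raw values, while the Filler-style result keeps the field list unchanged.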

Transform node fields settings

In the FIELDS section of the node properties, specify which fields you want to use for viewing possible transformations and applying them. Only numeric fields can be transformed, so select one or more numeric fields.