The Control Language for Expression Manipulation (CLEM) is a powerful language for analyzing and manipulating the data that streams through an SPSS Modeler flow. Data miners use CLEM extensively in flow operations to perform tasks as simple as deriving profit from cost and revenue data or as complex as transforming web log data into a set of fields and records with usable information.
CLEM is used within SPSS Modeler to:
- Compare and evaluate conditions on record fields
- Derive values for new fields
- Derive new values for existing fields
- Reason about the sequence of records
- Insert data from records into reports
CLEM expressions are indispensable for data preparation in SPSS Modeler and can be used in a wide range of nodes, from record and field operations (Select, Balance, Filler) to plots and output (Analysis, Report, Table). For example, you can use CLEM in a Derive node to create a new field based on a formula, such as a ratio of two existing fields.
CLEM expressions can also be used for global search and replace operations. For example, the expression @NULL(@FIELD) can be used in a Filler node to replace system-missing values with the integer value 0. (To replace user-missing values, also called blanks, use the @BLANK function.)
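As a rough illustration of what that Filler node does, the following Python sketch emulates the replacement outside of SPSS Modeler. It is not CLEM and not part of the product: the function name fill_system_missing, the sample field names, and the use of Python's None to stand in for a system-missing value are all assumptions made for the example.

```python
def fill_system_missing(records, fields, replacement=0):
    """Emulate a Filler node whose condition is @NULL(@FIELD).

    Assumption: None stands in for a system-missing value. Only the
    fields named in `fields` are checked, similar to choosing the
    fields to fill in the node. The input records are not modified.
    """
    filled = []
    for record in records:
        new_record = dict(record)  # copy so the original stays untouched
        for field in fields:
            if new_record.get(field) is None:
                new_record[field] = replacement
        filled.append(new_record)
    return filled


# Hypothetical sample data: two fields, each with one missing value.
rows = [
    {"Age": 34, "Income": None},
    {"Age": None, "Income": 52000},
]
filled_rows = fill_system_missing(rows, fields=["Age", "Income"])
```

With this sample data, every None in the selected fields becomes the integer 0, and values that are already present are left unchanged.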
More complex CLEM expressions can also be created. For example, you can derive new fields based on a conditional set of rules, such as a new value category created by using the following expressions: If: CardID = @OFFSET(CardID,1), Then: @OFFSET(ValueCategory,1), Else: 'exclude'.
This example uses the @OFFSET function to say: If the value of the field CardID for a given record is the same as for the previous record, then return the value of the field named ValueCategory for the previous record. Otherwise, assign the string 'exclude'. In other words, if the CardIDs for adjacent records are the same, they should be assigned the same value category. (Records with the exclude string can later be culled using a Select node.)
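The same rule can be sketched in Python to show how the look-back behaves on a small set of records. This is an emulation of the logic only, not CLEM: the function name derive_value_category and the sample data are hypothetical, and the sketch assumes that the first record, which has no predecessor, falls through to the Else branch.

```python
def derive_value_category(records):
    """Emulate: If CardID = @OFFSET(CardID,1)
                Then @OFFSET(ValueCategory,1)
                Else 'exclude'

    Returns one derived value per input record, in order.
    Assumption: the first record has no previous record, so the
    comparison is treated as false and it receives 'exclude'.
    """
    derived = []
    for index, record in enumerate(records):
        previous = records[index - 1] if index > 0 else None
        if previous is not None and record["CardID"] == previous["CardID"]:
            # Same card as the previous record: reuse its category.
            derived.append(previous["ValueCategory"])
        else:
            derived.append("exclude")
    return derived


# Hypothetical sample data, already sorted so that records
# for the same card are adjacent.
rows = [
    {"CardID": 101, "ValueCategory": "high"},
    {"CardID": 101, "ValueCategory": "low"},
    {"CardID": 202, "ValueCategory": "medium"},
    {"CardID": 202, "ValueCategory": "high"},
    {"CardID": 202, "ValueCategory": "low"},
]
result = derive_value_category(rows)
# result == ["exclude", "high", "exclude", "medium", "high"]
```

The first record for each card receives 'exclude' because the preceding record belongs to a different card (or does not exist). These are the records that a later filtering step would remove.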