Sequence functions (SPSS Modeler)

Sequence functions

For some operations, the sequence of events is important.

The application allows you to work with the following record sequences:

Sequences and time series
Sequence functions
Record indexing
Averaging, summing, and comparing values
Monitoring change—differentiation
@SINCE
Offset values
Additional sequence facilities

For many applications, each record passing through a stream can be considered as an individual case, independent of all others. In such situations, the order of records is usually unimportant.

For some classes of problems, however, the record sequence is very important. These are typically time series situations, in which the sequence of records represents an ordered sequence of events or occurrences. Each record represents a snapshot at a particular instant in time; much of the richest information, however, might be contained not in instantaneous values but in the way in which such values are changing and behaving over time.

Of course, the relevant parameter may be something other than time. For example, the records could represent analyses performed at distances along a line, but the same principles would apply.

Sequence and special functions are immediately recognizable by the following characteristics:

They are all prefixed by @
Their names are given in uppercase

Sequence functions can refer to the record currently being processed by a node, the records that have already passed through a node, and even, in one case, records that have yet to pass through a node. Sequence functions can be mixed freely with other components of CLEM expressions, although some have restrictions on what can be used as their arguments.

Examples

You may find it useful to know how long it has been since a certain event occurred or a condition was true. Use the function @SINCE to do this—for example:

@SINCE(Income > Outgoings)

This function returns the offset of the last record where this condition was true—that is, the number of records before this one in which the condition was true. If the condition has never been true, @SINCE returns @INDEX + 1.

Sometimes you may want to refer to a value of the current record in the expression used by @SINCE. You can do this using the function @THIS, which specifies that a field name always applies to the current record. To find the offset of the last record that had a Concentration field value more than twice that of the current record, you could use:

@SINCE(Concentration > 2 * @THIS(Concentration))

In some cases the condition given to @SINCE is true of the current record by definition—for example:

@SINCE(ID == @THIS(ID))

For this reason, @SINCE doesn't evaluate its condition for the current record. Use a similar function, @SINCE0, if you want to evaluate the condition for the current record as well as previous ones; if the condition is true in the current record, @SINCE0 returns 0.

Table 1. CLEM sequence functions
Function	Result	Description
`MEAN(FIELD)`	Real	Returns the mean average of values for the specified FIELD or FIELDS.
`@MEAN(FIELD, EXPR)`	Real	Returns the mean average of values for FIELD over the last EXPR records received by the current node, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted or if it exceeds the number of records received so far, the average over all of the records received so far is returned.
`@MEAN(FIELD, EXPR, INT)`	Real	Returns the mean average of values for FIELD over the last EXPR records received by the current node, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted or if it exceeds the number of records received so far, the average over all of the records received so far is returned. INT specifies the maximum number of values to look back. This is far more efficient than using just two arguments.
`@DIFF1(FIELD)`	Real	Returns the first differential of FIELD. The single-argument form thus simply returns the difference between the current value and the previous value of the field. Returns $null$ if the relevant previous records do not exist.
`@DIFF1(FIELD1, FIELD2)`	Real	The two-argument form gives the first differential of FIELD1 with respect to FIELD2. Returns $null$ if the relevant previous records do not exist. It is calculated as `@DIFF1(FIELD1)/@DIFF1(FIELD2)`.
`@DIFF2(FIELD)`	Real	Returns the second differential of FIELD. The single-argument form thus simply returns the difference between the current value and the previous value of the field. Returns $null$ if the relevant previous records do not exist. `@DIFF2` is calculated as `@DIFF(@DIFF(FIELD))`.
`@DIFF2(FIELD1, FIELD2)`	Real	The two-argument form gives the second differential of FIELD1 with respect to FIELD2. Returns $null$ if the relevant previous records do not exist. This is a complex calculation -- `@DIFF1(FIELD1)/@DIFF1(FIELD2) - @OFFSET(@DIFF1(FIELD1),1)/@OFFSET(@DIFF1(FIELD2))) / @DIFF1(FIELD2)`.
`@INDEX`	Integer	Returns the index of the current record. Indices are allocated to records as they arrive at the current node. The first record is given index 1, and the index is incremented by 1 for each subsequent record.
`@LAST_NON_BLANK(FIELD)`	Any	Returns the last value for FIELD that was not blank, as defined in an upstream source or Type node. If there are no nonblank values for FIELD in the records read so far, $null$ is returned. Note that blank values, also called user-missing values, can be defined separately for each field.
`@MAX(FIELD)`	Number	Returns the maximum value for the specified FIELD.
`@MAX(FIELD, EXPR)`	Number	Returns the maximum value for FIELD over the last EXPR records received so far, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0.
`@MAX(FIELD, EXPR, INT)`	Number	Returns the maximum value for FIELD over the last EXPR records received so far, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted, or if it exceeds the number of records received so far, the maximum value over all of the records received so far is returned. INT specifies the maximum number of values to look back. This is far more efficient than using just two arguments.
`@MIN(FIELD)`	Number	Returns the minimum value for the specified FIELD.
`@MIN(FIELD, EXPR)`	Number	Returns the minimum value for FIELD over the last EXPR records received so far, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0.
`@MIN(FIELD, EXPR, INT)`	Number	Returns the minimum value for FIELD over the last EXPR records received so far, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted, or if it exceeds the number of records received so far, the minimum value over all of the records received so far is returned. INT specifies the maximum number of values to look back. This is far more efficient than using just two arguments.
`@OFFSET(FIELD, EXPR)`	Any	Returns the value of FIELD in the record offset from the current record by the value of EXPR. A positive offset refers to a record that has already passed (a "lookback"), while a negative one specifies a "lookahead" to a record that has yet to arrive. For example, `@OFFSET(Status, 1)` returns the value of the `Status` field in the previous record, while `@OFFSET(Status, –4)` "looks ahead" four records in the sequence (that is, to records that have not yet passed through this node) to obtain the value. Note that a negative (look ahead) offset must be specified as a constant. For positive offsets only, EXPR may also be an arbitrary CLEM expression, which is evaluated for the current record to give the offset. In this case, the three-argument version of this function should improve performance (see next function). If the expression returns anything other than a non-negative integer, this causes an error—that is, it is not legal to have calculated lookahead offsets. Note that a self-referential `@OFFSET` function can't use literal lookahead. For example, in a Filler node, you can't replace the value of `field1` using an expression such as `@OFFSET(field1,-2)`. In the Filler node, when filling a field, there are effectively two different values of that field, namely the pre-filled value and the post-filled value. When `@OFFSET` refers to itself it refers to the post-filled value. This post-filled value only exists for past rows so self referential `@OFFSET` can only refer to past rows. Since self referential `@OFFSET` can't refer to the future, it carries out the following checks of the offset: If the offset is literal and into the future, an error is reported before execution begins. If the offset is an expression and evaluates to the future at runtime then `@OFFSET` returns $null$ . Note that using both "lookahead" and "lookback" within one node isn't supported.
`@OFFSET(FIELD, EXPR, INT)`	Any	Performs the same operation as the `@OFFSET` function with the addition of a third argument, INT, which specifies the maximum number of values to look back. In cases where the offset is computed from an expression, this third argument should improve performance. For example, in an expression such as`@OFFSET(Foo, Month, 12)`, the system knows to keep only the last twelve values of `Foo`; otherwise, it has to store every value just in case. In cases where the offset value is a constant—including negative "lookahead" offsets, which must be constant—the third argument is pointless and the two-argument version of this function should be used. See also the note about self-referential functions in the two-argument version described earlier. Note that using both "lookahead" and "lookback" within one node isn't supported.
`@SDEV(FIELD)`	Real	Returns the standard deviation of values for the specified FIELD or FIELDS.
`@SDEV(FIELD, EXPR)`	Real	Returns the standard deviation of values for FIELD over the last EXPR records received by the current node, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted, or if it exceeds the number of records received so far, the standard deviation over all of the records received so far is returned.
`@SDEV(FIELD, EXPR, INT)`	Real	Returns the standard deviation of values for FIELD over the last EXPR records received by the current node, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted, or if it exceeds the number of records received so far, the standard deviation over all of the records received so far is returned. INT specifies the maximum number of values to look back. This is far more efficient than using just two arguments.
`@SINCE(EXPR)`	Any	Returns the number of records that have passed since EXPR, an arbitrary CLEM expression, was true.
`@SINCE(EXPR, INT)`	Any	Adding the second argument, INT, specifies the maximum number of records to look back. If EXPR has never been true, INT is `@INDEX+1`.
`@SINCE0(EXPR)`	Any	Considers the current record, while `@SINCE` does not; `@SINCE0` returns 0 if EXPR is true for the current record.
`@SINCE0(EXPR, INT)`	Any	Adding the second argument, INT specifies the maximum number of records to look back.
`@SUM(FIELD)`	Number	Returns the sum of values for the specified FIELD or FIELDS.
`@SUM(FIELD, EXPR)`	Number	Returns the sum of values for FIELD over the last EXPR records received by the current node, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted, or if it exceeds the number of records received so far, the sum over all of the records received so far is returned.
`@SUM(FIELD, EXPR, INT)`	Number	Returns the sum of values for FIELD over the last EXPR records received by the current node, including the current record. FIELD must be the name of a numeric field. EXPR may be any expression evaluating to an integer greater than 0. If EXPR is omitted, or if it exceeds the number of records received so far, the sum over all of the records received so far is returned. INT specifies the maximum number of values to look back. This is far more efficient than using just two arguments.
`@THIS(FIELD)`	Any	Returns the value of the field named FIELD in the current record. Used only in `@SINCE` expressions.