0 / 0
Handling records with system missing values
Last updated: Jan 17, 2024
Handling records with system missing values (SPSS Modeler)

What are system missing values?

System missing values represent data values that aren't known or not applicable. In databases, these values are often referred to as NULL values. System missing values are different from blank values. Blank values are typically defined in the Type node as particular values, or ranges of values, that can be regarded as user-defined-missing. Blank values are handled differently in the context of modeling.

Constructing system missing values

System missing values might be present in data that's read from a data source (for example, database tables might contain NULL values). System missing values can be constructed by using the value undef in expressions. For example, the following CLEM expression returns the Age, if less than or equal to 30, or a missing value if greater than 30:

if Age > 30 then undef else Age endif

Missing values can also be created when an outer join is carried out, when a number is divided by zero, when the square root of a negative number is computed, and in other situations.

Displaying system missing values

System missing values are displayed in tables and other output as $null$.

Testing for system missing values

Use the special function @NULL to return true if the argument value is a system missing value. For example:

if @NULL(MyFieldName) then 'It is null' else 'It is not null' endif

System missing values passed to functions

System missing values that are passed to functions usually propagate missing values to the output. For example, if the value of field f1 is a system missing value in a particular row, then the expression log(f1) also evaluates to a system missing value for that row. An exception is the @NULL function.

System missing values in expressions that involve arithmetic operators

Applying arithmetic operators to values that include a system missing value results in a system missing value. For example, if the value of field f1 is a system missing value in a particular row, then the expression f1 + 10 also evaluates to a system missing value for that row.

System missing values in expressions that involve logical operators

When you work with system missing values in expressions that involve logical operators, the rules of three-valued logic (true, false, and missing) apply and can be described in truth tables. The truth tables for the common logical operators of not, and, and or are shown in the following tables.

Table 1. Truth table for NOT
Operand NOT Operand
true false
false true
missing missing
Table 2. Truth table for AND
Operand1 Operand2 Operand1 AND Operand2
true true true
true false false
true missing missing
false true false
false false false
false missing false
missing true missing
missing false false
missing missing missing
Table 3. Truth table for OR
Operand1 Operand2 Operand1 OR Operand2
true true true
true false true
true missing true
false true true
false false false
false missing missing
missing true true
missing false missing
missing missing missing

System missing values in expressions that involve comparison operators

When you compare a system missing value and a non-system-missing value, the outcome evaluates to a system missing value rather than a true or false result. System missing values can be compared with each other; two system missing values are considered to be equal.

System missing values in if/then/else/endif expressions

When you use conditional expressions, and the conditional expression returns a system missing value, the value from the else clause is returned from the conditional expression.

System missing values in the Select node

When, for a particular record, the selection expression evaluates to a missing value, the record is not output from the Select node (this action applies to both Include and Discard modes).

System missing values in the Merge node

When you merge by using a key, any records that have system missing values in a key field are not merged.

System missing values in aggregation

When aggregating data on columns, missing values are not included in the calculation. For example, in a column with three values { 1, 2, and undef }, the sum of the values in the column is computed as 3; the mean value is computed as 1.5.