Handling fields with missing values (SPSS Modeler)
If the majority of missing values are concentrated in a small number of fields, you can
address them at the field level rather than at the record level. This approach also allows you to
experiment with the relative importance of particular fields before deciding on an approach for
handling missing values. If a field is unimportant in modeling, it probably isn't worth keeping,
regardless of how many missing values it has.
For example, a market research company may collect data from a general questionnaire containing
50 questions. Two of the questions address age and political persuasion, information that many
people are reluctant to give. In this case, Age and
Political_persuasion have many missing values.
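To check whether missing values really are concentrated in a few fields, you can compute the share of missing values per field. The following is a minimal sketch in plain Python, outside SPSS Modeler. The sample records, the field Q1, the function name missing_share, and the use of None to mark a missing answer are all assumptions made for illustration.

```python
# Hypothetical questionnaire records; None marks a missing answer.
records = [
    {"Age": 34, "Political_persuasion": None, "Q1": "yes"},
    {"Age": None, "Political_persuasion": None, "Q1": "no"},
    {"Age": None, "Political_persuasion": "left", "Q1": "yes"},
    {"Age": 51, "Political_persuasion": None, "Q1": "yes"},
]


def missing_share(rows):
    """Return {field: fraction of rows where the field is None or absent}."""
    fields = {name for row in rows for name in row}
    return {
        name: sum(row.get(name) is None for row in rows) / len(rows)
        for name in sorted(fields)
    }


shares = missing_share(records)
for name, share in shares.items():
    print(f"{name}: {share:.0%} missing")
```

Fields whose share stands well above the rest are the candidates for field-level handling.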
Field measurement level
In determining which method to use, you should also consider the measurement level of fields with
missing values.
Numeric fields. For numeric field types, such as
Continuous, you should always eliminate any non-numeric values before building a
model, because many models won't function if blanks are included in numeric fields.
Categorical fields. For categorical fields, such as
Nominal and Flag, altering missing values isn't required but usually
improves the accuracy of the model. For example, a model that uses the field Sex
will still function with meaningless values, such as Y and Z, but
treating all values other than M and F as missing usually gives a more
accurate model.
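The two cases above can be illustrated with a small sketch in plain Python. It is not SPSS Modeler functionality: the helper names clean_numeric and clean_category, the sample rows, and the choice of None as the missing marker are assumptions for illustration only.

```python
import math


def clean_numeric(value):
    """Return value as a finite float, or None if it is not a usable number."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None  # blanks, text such as "n/a", or None itself
    return number if math.isfinite(number) else None


def clean_category(value, valid):
    """Return value if it is one of the valid categories, otherwise None."""
    return value if value in valid else None


# Hypothetical raw rows with a numeric field (Age) and a categorical field (Sex).
rows = [
    {"Age": "42", "Sex": "M"},
    {"Age": "n/a", "Sex": "Z"},
    {"Age": "", "Sex": "F"},
]

cleaned = [
    {
        "Age": clean_numeric(row["Age"]),
        "Sex": clean_category(row["Sex"], {"M", "F"}),
    }
    for row in rows
]
print(cleaned)
```

Mapping every invalid entry to a single missing marker keeps later steps, such as counting or imputing missing values, consistent across numeric and categorical fields.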
Screening or removing fields
To screen out fields with too many missing values, you have several options:
You can use a Data Audit node to filter fields based on quality.
You can use a Feature Selection node to screen out fields with more than a
specified percentage of missing values and to rank fields based on importance relative to a
specified target.
Instead of removing the fields, you can use a Type node to set the field role to
None. This will keep the fields in the data set but exclude them from the
modeling processes.
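The screening idea behind these options can be sketched in plain Python. This is not the Data Audit, Feature Selection, or Type node API: the function screen_fields, the 0.6 threshold, the sample rows, and the roles dictionary are hypothetical stand-ins that only imitate the behavior described above.

```python
def screen_fields(rows, max_missing):
    """Split field names into (keep, drop) by their share of missing values.

    A field is dropped when its share of None values exceeds max_missing.
    """
    fields = sorted({name for row in rows for name in row})
    keep, drop = [], []
    for name in fields:
        share = sum(row.get(name) is None for row in rows) / len(rows)
        (drop if share > max_missing else keep).append(name)
    return keep, drop


# Hypothetical records; None marks a missing value.
rows = [
    {"Age": 34, "Political_persuasion": None, "Q1": "yes"},
    {"Age": None, "Political_persuasion": None, "Q1": "no"},
    {"Age": None, "Political_persuasion": "left", "Q1": "yes"},
    {"Age": 51, "Political_persuasion": None, "Q1": "yes"},
]

keep, drop = screen_fields(rows, max_missing=0.6)

# Alternative to removal: keep every field but record a role for each one,
# loosely imitating a role of None that excludes a field from modeling.
roles = {name: "Input" for name in keep}
roles.update({name: "None" for name in drop})

print("keep:", keep)
print("drop:", drop)
print("roles:", roles)
```

Recording a role instead of deleting the field leaves the data available for later inspection, which matches the intent of the third option.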