The Feature Selection node screens
input fields for removal based on a set of criteria (such as the percentage of missing values); it
then ranks the importance of remaining inputs relative to a specified target. For example, given a
data set with hundreds of potential inputs, which are most likely to be useful in modeling patient
outcomes?
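The screening step can be pictured with a small stand-alone sketch. This is not the node's implementation: the function name `screen_fields`, its parameters, the threshold values, and the sample data are all hypothetical, and only two of the screens (missing values and single-category share) are shown.

```python
def screen_fields(columns, max_missing_pct=70.0, max_single_category_pct=90.0):
    """Return the names of fields that pass two simple screens.

    columns maps a field name to a list of values; None marks a missing value.
    Both thresholds are percentages of the total number of records
    (illustrative values, not documented defaults).
    """
    kept = []
    for name, values in columns.items():
        total = len(values)
        if total == 0:
            continue  # no records, nothing to evaluate
        present = [v for v in values if v is not None]

        # Screen 1: too many missing values.
        missing_pct = 100.0 * (total - len(present)) / total
        if missing_pct > max_missing_pct:
            continue

        # Screen 2: too many records in one category.
        if present:
            largest = max(present.count(v) for v in set(present))
            if 100.0 * largest / total > max_single_category_pct:
                continue

        kept.append(name)
    return kept


# Hypothetical patient data: one usable field, one constant, one mostly missing.
patients = {
    "age": [34, 51, 47, 62, 29],
    "region": ["N", "N", "N", "N", "N"],
    "lab_result": [None, None, None, None, 1.2],
}
print(screen_fields(patients))
```

Here `region` is dropped because every record falls into one category, and `lab_result` is dropped because 80% of its values are missing.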
Feature Selection models rank predictors relative to the specified target. Weight and
frequency fields are not used. See Common modeling node properties for
more information.
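As a rough sketch, the properties listed below could be applied from a Python script as follows. The helper `apply_settings`, the stand-in class `RecordingNode`, and all property values are hypothetical; the stand-in exists only so the example runs outside SPSS Modeler. The commented lines show how the node would probably be obtained inside a Modeler script, assuming the node type name is `featureselection`.

```python
# Settings keyed by property name; the values are illustrative only.
settings = {
    "screen_single_category": True,
    "max_single_category": 95,
    "screen_missing_values": True,
    "max_missing_values": 80,
    "criteria": "Likelihood",
    "unimportant_below": 0.8,
    "important_above": 0.9,
    "selection_mode": "TopN",
    "top_n": 15,
}


def apply_settings(node, settings):
    """Set each property on any object that exposes setPropertyValue."""
    for name in sorted(settings):
        node.setPropertyValue(name, settings[name])
    return node


class RecordingNode(object):
    """Hypothetical stand-in that records properties instead of a real node."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


# Inside a Modeler script (assumption), the node would instead come from:
#   stream = modeler.script.stream()
#   node = stream.create("featureselection", "My node")
node = apply_settings(RecordingNode(), settings)
print(node.properties["selection_mode"])
```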
screen_single_category
flag
If True, screens fields that have too many records falling into the same
category relative to the total number of records.
max_single_category
number
Specifies the threshold used when screen_single_category is
True.
screen_missing_values
flag
If True, screens fields with too many missing values, expressed as a
percentage of the total number of records.
max_missing_values
number
screen_num_categories
flag
If True, screens fields with too many categories relative to the total
number of records.
max_num_categories
number
screen_std_dev
flag
If True, screens fields with a standard deviation of less than or equal to
the specified minimum.
min_std_dev
number
screen_coeff_of_var
flag
If True, screens fields with a coefficient of variation less than or equal to
the specified minimum.
min_coeff_of_var
number
criteria
Pearson, Likelihood, CramersV, Lambda
When ranking categorical predictors against a categorical target, specifies the measure on
which the importance value is based.
unimportant_below
number
Specifies the threshold p values used to rank variables as important, marginal, or
unimportant. Accepts values from 0.0 to 1.0.
important_above
number
Accepts values from 0.0 to 1.0.
unimportant_label
string
Specifies the label for the unimportant ranking.
marginal_label
string
important_label
string
selection_mode
ImportanceLevel, ImportanceValue, TopN
select_important
flag
When selection_mode is set to ImportanceLevel, specifies
whether to select important fields.
select_marginal
flag
When selection_mode is set to ImportanceLevel, specifies
whether to select marginal fields.
select_unimportant
flag
When selection_mode is set to ImportanceLevel, specifies
whether to select unimportant fields.
importance_value
number
When selection_mode is set to ImportanceValue, specifies
the cutoff value to use. Accepts values from 0 to 100.
top_n
integer
When selection_mode is set to TopN, specifies the cutoff
value to use. Accepts values from 0 to 1000.
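
The three selection modes can be illustrated with a small stand-alone sketch. It is not the node's own logic. It assumes that each field has an importance score between 0 and 1 where higher means more important, that `importance_value` is compared against that score scaled to 0 to 100, and that the label names are the plain strings shown. The functions `label_field` and `select_fields`, the thresholds, and the sample scores are all hypothetical.

```python
def label_field(importance, unimportant_below=0.9, important_above=0.95):
    """Assign a ranking label from an importance score on a 0..1 scale."""
    if importance > important_above:
        return "Important"
    if importance < unimportant_below:
        return "Unimportant"
    return "Marginal"


def select_fields(ranked, selection_mode, select_important=True,
                  select_marginal=True, select_unimportant=False,
                  importance_value=95.0, top_n=10):
    """Pick fields from ranked (name -> importance) under one selection mode."""
    # Most important fields first.
    ordered = sorted(ranked, key=lambda name: ranked[name], reverse=True)

    if selection_mode == "ImportanceLevel":
        wanted = set()
        if select_important:
            wanted.add("Important")
        if select_marginal:
            wanted.add("Marginal")
        if select_unimportant:
            wanted.add("Unimportant")
        return [n for n in ordered if label_field(ranked[n]) in wanted]

    if selection_mode == "ImportanceValue":
        # Cutoff is on a 0..100 scale in this sketch.
        return [n for n in ordered if ranked[n] * 100.0 > importance_value]

    if selection_mode == "TopN":
        return ordered[:top_n]

    raise ValueError("unknown selection_mode: %r" % (selection_mode,))


# Hypothetical importance scores.
ranked = {"age": 0.99, "bmi": 0.93, "region": 0.40}
print(select_fields(ranked, "ImportanceLevel"))
print(select_fields(ranked, "ImportanceValue", importance_value=95.0))
print(select_fields(ranked, "TopN", top_n=2))
```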