The CHAID node generates decision trees by using chi-square statistics to identify
optimal splits. Unlike the C&R Tree and Quest nodes, CHAID can generate nonbinary trees, meaning
that some splits have more than two branches. Target and input fields can be numeric range
(continuous) or categorical. Exhaustive CHAID is a modification of CHAID that does a more thorough
job of examining all possible splits but takes longer to compute.
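To make the role of the chi-square statistic more concrete, the following sketch computes the Pearson and likelihood-ratio chi-square values for one small contingency table (categories of a candidate input field against target classes). It is plain Python, not SPSS Modeler code. The function names and the counts are invented for illustration, and the real CHAID procedure also merges categories and adjusts significance values, which this sketch does not attempt.

```python
import math


def expected_counts(table):
    """Expected cell counts if the row and column variables were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = float(sum(row_totals))
    return [[r * c / total for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def likelihood_ratio_chi_square(table):
    """2 * sum of observed * ln(observed / expected); empty cells contribute 0."""
    expected = expected_counts(table)
    return 2.0 * sum(o * math.log(o / e)
                     for obs_row, exp_row in zip(table, expected)
                     for o, e in zip(obs_row, exp_row)
                     if o > 0)


# Hypothetical counts: rows are categories of one input field,
# columns are the two classes of a categorical target.
observed = [[30, 10],
            [20, 20],
            [10, 30]]

pearson = pearson_chi_square(observed)          # 20.0 for these counts
lr = likelihood_ratio_chi_square(observed)
```

A larger statistic (for the same degrees of freedom) indicates a stronger association between the input field and the target, which is what makes a field a better split candidate.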
CHAID models require a single target and one or more input fields. You can also specify a
frequency field. For more information, see Common modeling node properties.
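As a minimal sketch of how the properties in this section might be applied from a script, the following example sets a handful of them through setPropertyValue. In SPSS Modeler's Python scripting the node would typically come from something like modeler.script.stream().create("chaid", "My node"). Here a hypothetical StubNode class stands in for it so that the example runs outside SPSS Modeler. The helper name configure_chaid and all of the values are illustrative assumptions, not recommended settings.

```python
class StubNode(object):
    """Hypothetical stand-in for a CHAID node, used so the sketch runs outside SPSS Modeler."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        # Record the value; a real node would validate and apply it.
        self.properties[name] = value


def configure_chaid(node):
    """Apply an illustrative set of CHAID properties to the given node."""
    settings = [
        ("method", "ExhaustiveChaid"),
        ("use_max_depth", "Custom"),
        ("max_depth", 5),                  # used only because use_max_depth is Custom
        ("use_percentage", False),         # assumed here to select the *_abs thresholds
        ("min_parent_records_abs", 40),
        ("min_child_records_abs", 20),
        ("split_alpha", 0.05),
        ("merge_alpha", 0.05),
        ("bonferroni_adjustment", True),
        ("chi_square", "Pearson"),
    ]
    for name, value in settings:
        node.setPropertyValue(name, value)
    return node


node = configure_chaid(StubNode())
```

Keeping the settings in one list makes it easy to review or reuse the same configuration across several nodes.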
| Property | Values | Description |
| --- | --- | --- |
| continue_training_existing_model | flag | |
| objective | Standard, Boosting, Bagging, psm | psm is used for large data sets and requires a server connection. |
| model_output_type | Single, InteractiveBuilder | |
| use_tree_directives | flag | |
| tree_directives | string | |
| method | Chaid, ExhaustiveChaid | |
| use_max_depth | Default, Custom | |
| max_depth | integer | Maximum tree depth, from 0 to 1000. Used only if use_max_depth is set to Custom. |
| use_percentage | flag | |
| min_parent_records_pc | number | |
| min_child_records_pc | number | |
| min_parent_records_abs | number | |
| min_child_records_abs | number | |
| use_costs | flag | |
| costs | structured | Structured property. |
| trails | number | Number of component models for boosting or bagging. |
| set_ensemble_method | Voting, HighestProbability, HighestMeanProbability | Default combining rule for categorical targets. |
| range_ensemble_method | Mean, Median | Default combining rule for continuous targets. |
| large_boost | flag | Applies boosting for large data sets. |
| split_alpha | number | Significance level for splitting. |
| merge_alpha | number | Significance level for merging. |
| bonferroni_adjustment | flag | Adjust significance values by using the Bonferroni method. |
| split_merged_categories | flag | Allow resplitting of merged categories. |
| chi_square | Pearson, LR | The method used to calculate the chi-square statistic: Pearson or Likelihood Ratio. |
| epsilon | number | Minimum change in expected cell frequencies. |
| max_iterations | number | Maximum iterations for convergence. |
| set_random_seed | integer | |
| seed | number | |
| calculate_variable_importance | flag | |
| calculate_raw_propensities | flag | |
| calculate_adjusted_propensities | flag | |
| adjusted_propensity_partition | Test, Validation | |
| maximum_number_of_models | integer | |
| train_pct | double | The algorithm internally separates records into a model building set and an overfit prevention set. The overfit prevention set is an independent set of data records used to track errors during training, which prevents the method from modeling chance variation in the data. Specify a percentage of records. The default is 30. |
| use_customize_layer | Boolean | The default value is false. Set this property to true if you want to designate specific fields for the decision tree to split on. |
| customize_layer | list | Used only when use_customize_layer is set to true. This property is a list of objects. Each object has two attributes. Layer is an integer that indicates the specific n-th layer in the decision tree that you want to customize; in SPSS Modeler, layers start from 0 (root). Fields is a list of names; each name is one of the fields that you want the decision tree to potentially split on for that Layer. SPSS Modeler evaluates these fields in the order that they are listed. When the SPSS Modeler flow runs, the CHAID algorithm evaluates and returns a candidate list of fields to split at, based on the p value for each layer. For a custom layer, each field that you specified for the layer is compared to the full candidate list of fields. The first field to match a field from the candidate list is used for the split, and the rest of the specified fields are ignored. If none of the fields match, a warning message appears and the tree splits as normal. |
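The matching rule for customize_layer can be sketched in a few lines of plain Python. This is an illustration of the behavior described above, not SPSS Modeler's implementation. The dictionary representation of each object, the function name pick_split_field, the field names, and the assumption that the candidate list is ordered best-first (so that "splits as normal" means taking its first entry) are all hypothetical.

```python
import warnings


def pick_split_field(layer, customize_layer, candidates):
    """Pick the split field for one layer of the tree.

    customize_layer: list of objects with "Layer" and "Fields" attributes.
    candidates: the algorithm's candidate fields for this layer, assumed
    (for this sketch only) to be ordered best-first.
    """
    for entry in customize_layer:
        if entry["Layer"] != layer:
            continue
        for name in entry["Fields"]:       # evaluated in the order listed
            if name in candidates:
                return name                # first match wins; the rest are ignored
        warnings.warn("No custom field for layer %d is a candidate; "
                      "splitting as normal." % layer)
        break
    # No custom entry, or no match: fall back to the normal choice.
    return candidates[0] if candidates else None


# Hypothetical customization: layers start from 0 (root).
customize_layer = [
    {"Layer": 0, "Fields": ["Cholesterol", "BP"]},
    {"Layer": 1, "Fields": ["Na"]},
]

# "Cholesterol" is listed first and is a candidate, so it is chosen
# even though "BP" ranks higher in the candidate list.
root_choice = pick_split_field(0, customize_layer, ["K", "BP", "Cholesterol"])
```

With these inputs, layer 0 splits on "Cholesterol". A call for layer 1 with candidates that do not include "Na" would issue the warning and fall back to the first candidate.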