chaidnode properties

The CHAID node generates decision trees using chi-square statistics to identify optimal splits. Unlike the C&R Tree and QUEST nodes, CHAID can generate nonbinary trees, meaning that some splits have more than two branches. Target and input fields can be numeric range (continuous) or categorical. Exhaustive CHAID is a modification of CHAID that examines all possible splits more thoroughly but takes longer to compute.
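To make the splitting criterion more concrete, the following minimal sketch computes the two chi-square statistics (Pearson and likelihood ratio) for a cross-tabulation of one candidate input against the target. It is illustrative only: the function name and the sample counts are invented, it assumes every row and column total is nonzero, and it does not reproduce the node's category merging or significance testing.

```python
import math

def chi_square_stats(table):
    """Return (pearson, likelihood_ratio) for a contingency table.

    table is a list of rows of observed counts, for example one row per
    category of a candidate input and one column per target category.
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = float(sum(row_totals))
    pearson = 0.0
    likelihood_ratio = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of input and target
            expected = row_totals[i] * col_totals[j] / total
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # a zero cell contributes nothing to the LR sum
                likelihood_ratio += 2.0 * observed * math.log(observed / expected)
    return pearson, likelihood_ratio

# Made-up counts: two input categories (rows) by two target categories (columns)
pearson, lr = chi_square_stats([[10, 20], [20, 10]])
```

A larger statistic indicates a stronger association between the input and the target, which is what makes a field a better candidate for a split.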

Example

# Get the current stream and locate the upstream source node by its ID
stream = modeler.script.stream()
sourcenode = stream.findByID("id46WRP1285C")

# Create the CHAID node and connect it to the source node
node = stream.createAt("chaid", "My node", 200, 100)
stream.link(sourcenode, node)

# Fields: use custom field settings with one target and several inputs
node.setPropertyValue("custom_fields", True)
node.setPropertyValue("target", "Drug")
node.setPropertyValue("inputs", ["Age", "Na", "K", "Cholesterol", "BP"])
# Model name
node.setPropertyValue("use_model_name", True)
node.setPropertyValue("model_name", "CHAID")
# Tree-growing method, output type, and tree directives
node.setPropertyValue("method", "Chaid")
node.setPropertyValue("model_output_type", "InteractiveBuilder")
node.setPropertyValue("use_tree_directives", True)
node.setPropertyValue("tree_directives", "Test")
# Significance levels and chi-square method
node.setPropertyValue("split_alpha", 0.03)
node.setPropertyValue("merge_alpha", 0.04)
node.setPropertyValue("chi_square", "Pearson")
# Stopping rules expressed as absolute record counts
node.setPropertyValue("use_percentage", False)
node.setPropertyValue("min_parent_records_abs", 40)
node.setPropertyValue("min_child_records_abs", 30)
# Convergence settings and category handling
node.setPropertyValue("epsilon", 0.003)
node.setPropertyValue("max_iterations", 75)
node.setPropertyValue("split_merged_categories", True)
node.setPropertyValue("bonferroni_adjustment", True)
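
When many properties are set at once, it can help to keep them in a dictionary and check the enumerated values before applying them. The sketch below shows one possible way to do that for an Exhaustive CHAID configuration with percentage-based stopping rules. The names apply_chaid_settings, ENUM_VALUES, and RecordingNode are hypothetical helpers, not part of the Modeler scripting API; only setPropertyValue is taken from the example above, and the property names and allowed values are copied from Table 1. RecordingNode is a stand-in so that the sketch runs outside Modeler; in a Modeler script you would pass the node returned by stream.createAt instead.

```python
# Allowed values for enumerated chaidnode properties (copied from Table 1)
ENUM_VALUES = {
    "objective": ["Standard", "Boosting", "Bagging", "psm"],
    "model_output_type": ["Single", "InteractiveBuilder"],
    "method": ["Chaid", "ExhaustiveChaid"],
    "use_max_depth": ["Default", "Custom"],
    "chi_square": ["Pearson", "LR"],
}

def apply_chaid_settings(node, settings):
    """Validate enumerated values, then set every property on the node.

    Hypothetical helper: nothing is set if any enumerated value is invalid.
    """
    for name, value in settings.items():
        allowed = ENUM_VALUES.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError("%s must be one of %s, got %r" % (name, allowed, value))
    for name in sorted(settings):
        node.setPropertyValue(name, settings[name])

class RecordingNode(object):
    """Hypothetical stand-in for a Modeler node; it only records values."""
    def __init__(self):
        self.values = {}

    def setPropertyValue(self, name, value):
        self.values[name] = value

# Exhaustive CHAID with a custom depth and percentage-based stopping rules
exhaustive_settings = {
    "method": "ExhaustiveChaid",
    "use_max_depth": "Custom",
    "max_depth": 4,
    "use_percentage": True,
    "min_parent_records_pc": 2,
    "min_child_records_pc": 1,
    "chi_square": "LR",
}
node = RecordingNode()
apply_chaid_settings(node, exhaustive_settings)
```

Validating before setting anything means a misspelled value such as "Exhaustive" is reported by the helper before the node is left half configured.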
Table 1. chaidnode properties

| chaidnode Properties | Values | Property description |
| --- | --- | --- |
| target | field | CHAID models require a single target and one or more input fields. You can also specify a frequency. See Common modeling node properties for more information. |
| continue_training_existing_model | flag | |
| objective | Standard, Boosting, Bagging, psm | psm is used for very large datasets, and requires a server connection. |
| model_output_type | Single, InteractiveBuilder | |
| use_tree_directives | flag | |
| tree_directives | string | |
| method | Chaid, ExhaustiveChaid | |
| use_max_depth | Default, Custom | |
| max_depth | integer | Maximum tree depth, from 0 to 1000. Used only if use_max_depth = Custom. |
| use_percentage | flag | |
| min_parent_records_pc | number | |
| min_child_records_pc | number | |
| min_parent_records_abs | number | |
| min_child_records_abs | number | |
| use_costs | flag | |
| costs | structured | Structured property. |
| trails | number | Number of component models for boosting or bagging. |
| set_ensemble_method | Voting, HighestProbability, HighestMeanProbability | Default combining rule for categorical targets. |
| range_ensemble_method | Mean, Median | Default combining rule for continuous targets. |
| large_boost | flag | Apply boosting to very large data sets. |
| split_alpha | number | Significance level for splitting. |
| merge_alpha | number | Significance level for merging. |
| bonferroni_adjustment | flag | Adjust significance values using Bonferroni method. |
| split_merged_categories | flag | Allow resplitting of merged categories. |
| chi_square | Pearson, LR | Method used to calculate the chi-square statistic: Pearson or Likelihood Ratio. |
| epsilon | number | Minimum change in expected cell frequencies. |
| max_iterations | number | Maximum iterations for convergence. |
| set_random_seed | integer | |
| seed | number | |
| calculate_variable_importance | flag | |
| calculate_raw_propensities | flag | |
| calculate_adjusted_propensities | flag | |
| adjusted_propensity_partition | Test, Validation | |
| maximum_number_of_models | integer | |
| train_pct | double | The algorithm internally separates records into a model building set and an overfit prevention set, which is an independent set of data records used to track errors during training in order to prevent the method from modeling chance variation in the data. Specify a percentage of records. The default is 30. |
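
The combining rules named by set_ensemble_method and range_ensemble_method can be pictured with a small sketch. The code below is not the product's implementation. It assumes the usual reading of each rule name: Voting picks the category that is the top prediction of the most component models, HighestProbability picks the category with the single highest probability from any component model, and HighestMeanProbability picks the category with the highest probability averaged across component models. The function names and sample values are invented, and tie handling is left unspecified.

```python
from collections import Counter

def combine_categorical(model_probs, rule):
    """Combine per-model category probabilities into one predicted category.

    model_probs is a list of dicts (category -> probability), one dict per
    component model. The rule semantics are assumptions, as noted above.
    """
    if rule == "Voting":
        votes = Counter(max(p, key=p.get) for p in model_probs)
        return votes.most_common(1)[0][0]
    if rule == "HighestProbability":
        most_confident = max(model_probs, key=lambda p: max(p.values()))
        return max(most_confident, key=most_confident.get)
    if rule == "HighestMeanProbability":
        count = float(len(model_probs))
        means = dict((c, sum(p[c] for p in model_probs) / count)
                     for c in model_probs[0])
        return max(means, key=means.get)
    raise ValueError("unknown rule: %r" % (rule,))

def combine_continuous(predictions, rule):
    """Combine per-model numeric predictions with the Mean or Median rule."""
    if rule == "Mean":
        return sum(predictions) / float(len(predictions))
    if rule == "Median":
        ordered = sorted(predictions)
        mid = len(ordered) // 2
        if len(ordered) % 2 == 1:
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2.0
    raise ValueError("unknown rule: %r" % (rule,))

# Three hypothetical component models scoring one record
component_probs = [
    {"A": 0.7, "B": 0.3},
    {"A": 0.7, "B": 0.3},
    {"A": 0.2, "B": 0.8},
]
by_voting = combine_categorical(component_probs, "Voting")
by_highest = combine_categorical(component_probs, "HighestProbability")
by_mean = combine_categorical(component_probs, "HighestMeanProbability")
```

With these sample values, Voting and HighestMeanProbability both select A, while HighestProbability selects B because the third model's 0.8 is the largest single probability.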