Last updated: Jan 18, 2024
The Auto Numeric node estimates and compares models for continuous numeric range outcomes using a number of different methods. The node works in the same manner as the Auto Classifier node, allowing you to choose the algorithms to use and to experiment with multiple combinations of options in a single modeling pass. Supported algorithms include neural networks, C&R Tree, CHAID, linear regression, generalized linear regression, and support vector machines (SVM). Models can be compared based on correlation, relative error, or number of variables used.
Example
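# Get the current stream; "modeler" is predefined inside a Modeler script
stream = modeler.script.stream()
node = stream.create("autonumeric", "My node")
# Rank candidate models by correlation, measured on the training data
node.setPropertyValue("ranking_measure", "Correlation")
node.setPropertyValue("ranking_dataset", "Training")
# Enable the correlation limit and set its value
node.setPropertyValue("enable_correlation_limit", True)
node.setPropertyValue("correlation_limit", 0.8)
node.setPropertyValue("calculate_variable_importance", True)
# Turn individual algorithms on or off by name
node.setPropertyValue("neuralnetwork", True)
node.setPropertyValue("chaid", False)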
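
The same `setPropertyValue` pattern can be driven from a list of settings, which keeps a longer configuration readable. The following is a minimal sketch, not part of the product documentation: `StubNode` and `apply_properties` are hypothetical names introduced here so the snippet runs outside Modeler, and the chosen values (5 folds, 3 models) are illustrative only. The property names come from the table below.

```python
class StubNode(object):
    """Hypothetical stand-in that records property values.

    It is NOT part of the Modeler API; it only mimics setPropertyValue
    so that this sketch can run outside a Modeler script.
    """

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


def apply_properties(node, settings):
    """Apply (name, value) pairs in order through setPropertyValue."""
    for name, value in settings:
        node.setPropertyValue(name, value)
    return node


# Illustrative values only; property names are taken from the table below.
cross_validation_settings = [
    ("use_cross_validation", True),
    ("number_of_folds", 5),          # documented range: 3 to 10
    ("set_random_seed", True),
    ("random_seed", 229176228),      # the default seed quoted in the table
    ("number_of_models", 3),         # documented range: 1 to 100
]

node = apply_properties(StubNode(), cross_validation_settings)
```

Inside a Modeler script, the node returned by `stream.create("autonumeric", "My node")` would presumably take the place of `StubNode()`, since it exposes the same `setPropertyValue` method used in the example above.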
autonumericnode Properties | Values | Property description |
---|---|---|
custom_fields | flag | If True, custom field settings will be used instead of type node settings. |
target | field | The Auto Numeric node requires a single target and one or more input fields. Weight and frequency fields can also be specified. See Common modeling node properties for more information. |
inputs | [field1 … field2] | |
partition | field | |
use_frequency | flag | |
frequency_field | field | |
use_weight | flag | |
weight_field | field | |
use_partitioned_data | flag | If a partition field is defined, only the training data is used for model building. |
ranking_measure | Correlation, NumberOfFields | |
ranking_dataset | Test, Training | |
number_of_models | integer | Number of models to include in the model nugget. Specify an integer between 1 and 100. |
calculate_variable_importance | flag | |
enable_correlation_limit | flag | |
correlation_limit | integer | |
enable_number_of_fields_limit | flag | |
number_of_fields_limit | integer | |
enable_relative_error_limit | flag | |
relative_error_limit | integer | |
enable_model_build_time_limit | flag | |
model_build_time_limit | integer | |
enable_stop_after_time_limit | flag | |
stop_after_time_limit | integer | |
stop_if_valid_model | flag | |
<algorithm> | flag | Enables or disables the use of a specific algorithm. |
<algorithm>.<property> | string | Sets a property value for a specific algorithm. See Setting algorithm properties for more information. |
use_cross_validation | boolean | Instead of using a single partition, a cross validation partition is used. |
number_of_folds | integer | N fold parameter for cross validation, with range from 3 to 10. |
set_random_seed | boolean | Setting a random seed allows you to replicate analyses. Specify an integer or click Generate, which will create a pseudo-random integer between 1 and 2147483647, inclusive. By default, analyses are replicated with seed 229176228. |
random_seed | integer | Random seed |
filter_individual_model_output | boolean | Removes from the output all of the additional fields generated by the individual models that feed into the Ensemble node. Select this option if you're interested only in the combined score from all of the input models. Ensure that this option is deselected if, for example, you want to use an Analysis node or Evaluation node to compare the accuracy of the combined score with that of each of the individual input models. |
calculate_standard_error | boolean | For a continuous (numeric range) target, a standard error calculation runs by default to calculate the difference between the measured or estimated values and the true values; and to show how close those estimates matched. |
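
Two of the properties above have documented numeric ranges: `number_of_models` (1 to 100) and `number_of_folds` (3 to 10). A script can check these before calling `setPropertyValue`. The following is a minimal sketch under that assumption; `DOCUMENTED_RANGES` and `check_ranges` are hypothetical helpers introduced here, not part of the Modeler API, and how Modeler itself reacts to out-of-range values is not covered by this page.

```python
# Ranges copied from the property table above; nothing else is assumed.
DOCUMENTED_RANGES = {
    "number_of_models": (1, 100),
    "number_of_folds": (3, 10),
}


def check_ranges(settings):
    """Return messages for values outside the documented ranges.

    `settings` is a plain dict of property name -> value. Properties
    without a documented range are ignored.
    """
    problems = []
    for name, (low, high) in sorted(DOCUMENTED_RANGES.items()):
        if name not in settings:
            continue
        value = settings[name]
        # bool is a subclass of int in Python, so exclude it explicitly
        is_integer = isinstance(value, int) and not isinstance(value, bool)
        if not is_integer or not (low <= value <= high):
            problems.append(
                "%s must be an integer from %d to %d, got %r"
                % (name, low, high, value)
            )
    return problems


valid = check_ranges({"number_of_models": 3, "number_of_folds": 5})
invalid = check_ranges({"number_of_models": 0, "number_of_folds": 11})
```

Here `valid` is an empty list, while `invalid` holds one message per offending property, so a script can stop and report the problem before any property is set on the node.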