Last updated: Jan 17, 2024
The Random Forest node uses an advanced implementation of a bagging algorithm with a tree model as the base model. This Random Forest modeling node in SPSS Modeler is implemented in Python and requires the scikit-learn© Python library.
rfnode properties | Data type | Property description |
---|---|---|
custom_fields | boolean | This option tells the node to use field information specified here instead of that given in any upstream Type node(s). After selecting this option, specify the following fields as required. |
inputs | field | List of the field names for input. |
target | field | One field name for target. |
fast_build | boolean | Utilize multiple CPU cores to improve model building. |
role_use | string | Specify predefined to use predefined roles or custom to use custom field assignments. Default is predefined. |
splits | field | List of the field names for split. |
n_estimators | integer | Number of trees to build. Default is 10. |
specify_max_depth | boolean | Specify a custom maximum depth. If false, nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. Default is false. |
max_depth | integer | The maximum depth of the tree. Default is 10. |
min_samples_leaf | integer | Minimum leaf node size. Default is 1. |
max_features | string | The number of features to consider when looking for the best split, for example auto. |
bootstrap | boolean | Use bootstrap samples when building trees. Default is true. |
oob_score | boolean | Use out-of-bag samples to estimate the generalization accuracy. Default is false. |
extreme | boolean | Use extremely randomized trees. Default is false. |
use_random_seed | boolean | Set to true to get reproducible results. Default is false. |
random_seed | integer | The random number seed to use when building trees. Specify any integer. |
cache_size | float | The size of the kernel cache in MB. Default is 200. |
enable_random_seed | boolean | Enables the random_seed parameter. Specify true or false. Default is false. |
enable_hpo | boolean | Specify true or false to enable or disable the HPO options. If set to true, Rbfopt is applied to automatically determine the "best" Random Forest model, that is, one that reaches the target objective value that you define with the target_objval parameter. |
target_objval | float | The objective function value (the error rate of the model on the samples) that you want to reach, such as the value of the optimum. If the optimum is unknown, set this parameter to an appropriate value (for example, 0.01). |
max_iterations | integer | Maximum number of iterations for trying the model. Default is 1000. |
max_evaluations | integer | Maximum number of function evaluations for trying the model, where the focus is accuracy over speed. Default is 300. |
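
Because the node relies on scikit-learn, it can help to see how the tree-building properties above might line up with scikit-learn estimator arguments. The following is a minimal sketch only: the helper `to_sklearn_kwargs`, the `RFNODE_DEFAULTS` dictionary, and the mapping itself (for example, `fast_build` to `n_jobs=-1`, and `extreme` to `ExtraTreesClassifier`) are assumptions made for illustration, not the node's documented internals. It assumes a categorical target, treats `fast_build` as false when unset (the table does not state a default), and leaves out `max_features` because the accepted values vary between scikit-learn versions.

```python
# Hypothetical helper: translate rfnode-style properties into keyword
# arguments for a scikit-learn ensemble estimator. The mapping is an
# assumption for illustration, not the node's actual implementation.

# Defaults taken from the property table; fast_build has no documented
# default, so False is assumed here.
RFNODE_DEFAULTS = {
    "n_estimators": 10,
    "specify_max_depth": False,
    "max_depth": 10,
    "min_samples_leaf": 1,
    "bootstrap": True,
    "oob_score": False,
    "extreme": False,
    "use_random_seed": False,
    "fast_build": False,
}


def to_sklearn_kwargs(props):
    """Return (estimator_name, kwargs) for the given rfnode-style properties."""
    p = {**RFNODE_DEFAULTS, **props}

    # Out-of-bag scoring needs bootstrap samples to have "left out" rows.
    if p["oob_score"] and not p["bootstrap"]:
        raise ValueError("oob_score requires bootstrap to be true")

    kwargs = {
        "n_estimators": p["n_estimators"],
        # max_depth only applies when specify_max_depth is true.
        "max_depth": p["max_depth"] if p["specify_max_depth"] else None,
        "min_samples_leaf": p["min_samples_leaf"],
        "bootstrap": p["bootstrap"],
        "oob_score": p["oob_score"],
        # Assumed: fast_build means "use all available CPU cores".
        "n_jobs": -1 if p["fast_build"] else None,
        # random_seed is only honored when use_random_seed is true.
        "random_state": p.get("random_seed") if p["use_random_seed"] else None,
    }
    name = "ExtraTreesClassifier" if p["extreme"] else "RandomForestClassifier"
    return name, kwargs


if __name__ == "__main__":
    name, kwargs = to_sklearn_kwargs(
        {
            "n_estimators": 50,
            "specify_max_depth": True,
            "max_depth": 5,
            "use_random_seed": True,
            "random_seed": 42,
        }
    )
    print(name, kwargs)
```

With scikit-learn installed, the result could be turned into an estimator with `getattr(sklearn.ensemble, name)(**kwargs)`. The HPO properties (`enable_hpo`, `target_objval`, `max_iterations`, `max_evaluations`) are not covered by this sketch.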