rfnode properties

The Random Forest node uses an advanced implementation of a bagging algorithm with a tree model as the base model. This Random Forest modeling node in SPSS Modeler is implemented in Python and requires the scikit-learn© Python library.
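To make the bagging idea concrete, the following is a minimal pure-Python sketch of how bootstrap samples and out-of-bag rows could be drawn for each tree. It is not the node's implementation, which relies on scikit-learn. The function name bootstrap_indices and its parameters are hypothetical, and the tree fitting itself is omitted.

```python
import random


def bootstrap_indices(n_rows, n_trees, seed=0):
    """Return, for each tree, the sampled row indices and the out-of-bag rows."""
    rng = random.Random(seed)  # fixed seed so repeated runs draw the same samples
    samples = []
    for _ in range(n_trees):
        # Draw n_rows indices with replacement: one bootstrap sample.
        in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Rows that were never drawn are "out of bag" for this tree.
        out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
        samples.append((in_bag, out_of_bag))
    return samples


for in_bag, out_of_bag in bootstrap_indices(n_rows=10, n_trees=3, seed=42):
    print(len(in_bag), "rows sampled,", len(out_of_bag), "rows out of bag")
```

In a full bagging model, each tree would be fitted on its own in-bag rows, and the out-of-bag rows could be used to estimate accuracy, which is the idea behind the bootstrap and oob_score properties listed in Table 1.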

Table 1. rfnode properties
rfnode properties Data type Property description
custom_fields boolean This option tells the node to use field information specified here instead of that given in any upstream Type node(s). After selecting this option, specify the fields below as required.
inputs field List of the field names for input.
target field One field name for target.
fast_build boolean Utilize multiple CPU cores to improve model building.
role_use string Specify predefined to use predefined roles or custom to use custom field assignments. Default is predefined.
splits field List of the field names for split.
n_estimators integer Number of trees to build. Default is 10.
specify_max_depth boolean Specify custom max depth. If false, nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. Default is false.
max_depth integer The maximum depth of the tree. Default is 10.
min_samples_leaf integer Minimum leaf node size. Default is 1.
max_features string The number of features to consider when looking for the best split:
  • If auto, then max_features=sqrt(n_features) for both classification and regression.
  • If sqrt, then max_features=sqrt(n_features).
  • If log2, then max_features=log2(n_features).
Default is auto.
bootstrap boolean Use bootstrap samples when building trees. Default is true.
oob_score boolean Use out-of-bag samples to estimate the generalization accuracy. Default is false.
extreme boolean Use extremely randomized trees. Default is false.
use_random_seed boolean Specify this to get replicated results. Default is false.
random_seed integer The random number seed to use when building trees. Specify any integer.
cache_size float The size of the kernel cache in MB. Default is 200.
enable_random_seed boolean Enables the random_seed parameter. Specify true or false. Default is false.
enable_hpo boolean Specify true or false to enable or disable the HPO options. If set to true, RBFOpt is applied to automatically determine the "best" Random Forest model, that is, one that reaches the target objective value the user defines with the target_objval parameter.
target_objval float The objective function value (error rate of the model on the samples) you want to reach (for example, the value of the unknown optimum). Set this parameter to the appropriate value if the optimum is unknown (for example, 0.01).
max_iterations integer Maximum number of iterations for trying the model. Default is 1000.
max_evaluations integer Maximum number of function evaluations for trying the model, where the focus is accuracy over speed. Default is 300.
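As a rough illustration of how the properties in Table 1 might be set from a script, the following sketch applies a dictionary of values through setPropertyValue, the generic property setter used in SPSS Modeler scripting. FakeNode, RF_SETTINGS, ALLOWED_MAX_FEATURES, and configure_node are hypothetical names introduced here so that the example runs outside Modeler, and the field names Age, Na, K, and Drug are placeholders. The check that oob_score needs bootstrap is an assumption based on scikit-learn's behavior, not something this table states.

```python
# Hypothetical stand-in for a Modeler node so the sketch runs outside SPSS Modeler.
class FakeNode:
    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        # Mirrors the name/value shape of the scripting property setter.
        self.properties[name] = value


# Illustrative values only; the field names are placeholders.
RF_SETTINGS = {
    "custom_fields": True,
    "inputs": ["Age", "Na", "K"],
    "target": "Drug",
    "n_estimators": 50,
    "specify_max_depth": True,
    "max_depth": 8,
    "min_samples_leaf": 2,
    "max_features": "sqrt",
    "bootstrap": True,
    "oob_score": True,
    "use_random_seed": True,
    "random_seed": 42,
}

ALLOWED_MAX_FEATURES = ("auto", "sqrt", "log2")


def configure_node(node, settings):
    """Apply each setting to the node after a few basic sanity checks."""
    if settings.get("max_features", "auto") not in ALLOWED_MAX_FEATURES:
        raise ValueError("max_features must be auto, sqrt, or log2")
    if settings.get("oob_score") and not settings.get("bootstrap", True):
        # Assumption: out-of-bag rows only exist when bootstrap sampling is on.
        raise ValueError("oob_score requires bootstrap to be true")
    for name, value in settings.items():
        node.setPropertyValue(name, value)
    return node


node = configure_node(FakeNode(), RF_SETTINGS)
print(node.properties["n_estimators"])
```

Inside a Modeler stream script, the FakeNode instance would be replaced by a real node, for example one obtained with something like modeler.script.stream().create("rfnode", "My node"). Keeping the settings in one dictionary makes it easy to validate them together before any property is changed.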