Last updated: Jan 18, 2024
XGBoost is an advanced implementation of a gradient boosting algorithm. Boosting algorithms iteratively learn weak classifiers and then add them to a final strong classifier. XGBoost is very flexible and provides many parameters that can be overwhelming to most users, so the XGBoost-AS node in SPSS Modeler exposes the core features and commonly used parameters. The XGBoost-AS node is implemented in Spark.
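
The boosting idea described above can be illustrated with a minimal sketch of gradient boosting for squared-error regression, using one-split decision stumps as the weak learners. This is not how XGBoost or the XGBoost-AS node is implemented. The function names (fit_stump, boost, mse) and the toy data are hypothetical, and the num_boost_round and eta arguments only play roles analogous to the numBoostRound and eta properties.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump that minimizes squared error on the residuals."""
    best = None
    # Try every distinct x value except the largest, so both sides stay non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, num_boost_round=10, eta=0.3):
    """Add one weak learner per round, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean of the targets
    learners = []

    def predict(x):
        # The strong model is the base value plus the shrunken sum of all stumps.
        return base + eta * sum(stump(x) for stump in learners)

    for _ in range(num_boost_round):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        learners.append(fit_stump(xs, residuals))
    return predict


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8]
weak = boost(xs, ys, num_boost_round=1)
strong = boost(xs, ys, num_boost_round=10)
```

On this toy data, the model built with ten rounds has a lower training error than the model built with one round, because each added stump is fitted to what the previous rounds left unexplained.
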
xgboostasnode properties | Data type | Property description |
---|---|---|
target_field | field | List of the field names for target. |
input_fields | field | List of the field names for inputs. |
nWorkers | integer | The number of workers used to train the XGBoost model. Default is 1. |
numThreadPerTask | integer | The number of threads used per worker. Default is 1. |
useExternalMemory | Boolean | Whether to use external memory as cache. Default is false. |
boosterType | string | The booster type to use. Available options are gbtree, gblinear, or dart. Default is gbtree. |
numBoostRound | integer | The number of rounds for boosting. Specify a value of 0 or higher. Default is 10. |
scalePosWeight | Double | Controls the balance of positive and negative weights. Default is 1. |
randomseed | integer | The seed used by the random number generator. Default is 0. |
objectiveType | string | The learning objective. Possible values are reg:linear, reg:logistic, reg:gamma, reg:tweedie, rank:pairwise, binary:logistic, or multi. Note that for flag targets, only binary:logistic or multi can be used. If multi is used, the score result will show the multi:softmax and multi:softprob XGBoost objective types. Default is reg:linear. |
evalMetric | string | Evaluation metrics for validation data. A default metric will be assigned according to the objective. Possible values are rmse, mae, logloss, error, merror, mlogloss, auc, ndcg, map, or gamma-deviance. Default is rmse. |
lambda | Double | L2 regularization term on weights. Increasing this value will make the model more conservative. Specify any number 0 or greater. Default is 1. |
alpha | Double | L1 regularization term on weights. Increasing this value will make the model more conservative. Specify any number 0 or greater. Default is 0. |
lambdaBias | Double | L2 regularization term on bias. This linear booster parameter is available if the gblinear booster type is used. Specify any number 0 or greater. Default is 0. |
treeMethod | string | The XGBoost tree construction algorithm to use. This parameter and the other tree parameters that follow are available if the gbtree or dart booster type is used. Available options are auto, exact, or approx. Default is auto. |
maxDepth | integer | The maximum depth for trees. Specify a value of 2 or higher. Default is 6. |
minChildWeight | Double | The minimum sum of instance weight (hessian) needed in a child. Specify a value of 0 or higher. Default is 1. |
maxDeltaStep | Double | The maximum delta step to allow for each tree's weight estimation. Specify a value of 0 or higher. Default is 0. |
sampleSize | Double | The subsample ratio of the training instances. Specify a value between 0.1 and 1.0. Default is 1.0. |
eta | Double | The step size shrinkage used during the update step to prevent overfitting. Specify a value between 0 and 1. Default is 0.3. |
gamma | Double | The minimum loss reduction required to make a further partition on a leaf node of the tree. Specify any number 0 or greater. Default is 6. |
colsSampleRatio | Double | The subsample ratio of columns when constructing each tree. Specify a value between 0.01 and 1. Default is 1. |
colsSampleLevel | Double | The subsample ratio of columns for each split, in each level. Specify a value between 0.01 and 1. Default is 1. |
normalizeType | string | If the dart booster type is used, this dart parameter and the following three dart parameters are available. This parameter sets the normalization algorithm. Specify tree or forest. Default is tree. |
sampleType | string | The sampling algorithm type. Specify uniform or weighted. Default is uniform. |
rateDrop | Double | The dropout rate dart booster parameter. Specify a value between 0.0 and 1.0. Default is 0.0. |
skipDrop | Double | The dart booster parameter for the probability of skip dropout. Specify a value between 0.0 and 1.0. Default is 0.0. |