xgboostasnode properties

Last updated: Feb 11, 2025

XGBoost is an advanced implementation of a gradient boosting algorithm. Boosting algorithms iteratively learn weak classifiers and then add them to a final strong classifier. XGBoost is very flexible and provides many parameters that can be overwhelming to most users, so the XGBoost-AS node in SPSS Modeler exposes the core features and commonly used parameters. The XGBoost-AS node is implemented in Spark.
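The boosting idea described above can be illustrated with a toy sketch. It is plain gradient boosting for squared error with one-split decision stumps, not the XGBoost algorithm itself (no second-order terms, no regularization), and every function name in it is hypothetical. The `num_rounds` and `eta` arguments only loosely mirror the `numBoostRound` and `eta` properties.

```python
# Toy gradient boosting for squared-error regression on one numeric input.
# Illustrative only: this is NOT how the XGBoost-AS node is implemented.

def fit_stump(xs, residuals):
    """Fit a one-split stump to the residuals.

    Returns (threshold, left_value, right_value). Assumes xs contains at
    least two distinct values.
    """
    best = None
    # Skip the largest x so the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def predict(model, x):
    """Sum the base value and the shrunken output of every weak learner."""
    base, eta, stumps = model
    total = base
    for threshold, left_value, right_value in stumps:
        total += eta * (left_value if x <= threshold else right_value)
    return total

def boost(xs, ys, num_rounds=10, eta=0.3):
    """Add one stump per round, each fitted to the current residuals."""
    model = (sum(ys) / len(ys), eta, [])
    for _ in range(num_rounds):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        model[2].append(fit_stump(xs, residuals))
    return model

def mse(model, xs, ys):
    """Mean squared error of the model on the given data."""
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up data with a step between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

baseline = boost(xs, ys, num_rounds=0)   # predicts the mean only
boosted = boost(xs, ys, num_rounds=10)   # mean plus ten shrunken stumps
```

Comparing `mse(baseline, xs, ys)` with `mse(boosted, xs, ys)` shows the training error dropping as weak learners are added, which is the behavior that the number of boosting rounds and the step size shrinkage control.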

Table 1. xgboostasnode properties
`xgboostasnode` properties	Data type	Property description
`target_field`	field	List of the field names for target.
`input_fields`	field	List of the field names for inputs.
`nWorkers`	integer	The number of workers used to train the XGBoost model. Default is `1`.
`numThreadPerTask`	integer	The number of threads used per worker. Default is `1`.
`useExternalMemory`	Boolean	Whether to use external memory as cache. Default is `false`.
`boosterType`	string	The booster type to use. Available options are `gbtree`, `gblinear`, or `dart`. Default is `gbtree`.
`numBoostRound`	integer	The number of rounds for boosting. Specify a value of `0` or higher. Default is `10`.
`scalePosWeight`	Double	Control the balance of positive and negative weights. Default is `1`.
`randomseed`	integer	The seed used by the random number generator. Default is `0`.
`objectiveType`	string	The learning objective. Possible values are `reg:linear`, `reg:logistic`, `reg:gamma`, `reg:tweedie`, `rank:pairwise`, `binary:logistic`, or `multi`. Note that for flag targets, only `binary:logistic` or `multi` can be used. If `multi` is used, the score result will show the `multi:softmax` and `multi:softprob` XGBoost objective types. Default is `reg:linear`.
`evalMetric`	string	Evaluation metrics for validation data. A default metric will be assigned according to the objective. Possible values are `rmse`, `mae`, `logloss`, `error`, `merror`, `mlogloss`, `auc`, `ndcg`, `map`, or `gamma-deviance`. Default is `rmse`.
`lambda`	Double	L2 regularization term on weights. Increasing this value will make the model more conservative. Specify any number `0` or greater. Default is `1`.
`alpha`	Double	L1 regularization term on weights. Increasing this value will make the model more conservative. Specify any number `0` or greater. Default is `0`.
`lambdaBias`	Double	L2 regularization term on bias. If the `gblinear` booster type is used, this lambda bias linear booster parameter is available. Specify any number `0` or greater. Default is `0`.
`treeMethod`	string	If the `gbtree` or `dart` booster type is used, this tree method parameter for tree growth (and the other tree parameters that follow) is available. It specifies the XGBoost tree construction algorithm to use. Available options are `auto`, `exact`, or `approx`. Default is `auto`.
`maxDepth`	integer	The maximum depth for trees. Specify a value of `2` or higher. Default is `6`.
`minChildWeight`	Double	The minimum sum of instance weight (hessian) needed in a child. Specify a value of `0` or higher. Default is `1`.
`maxDeltaStep`	Double	The maximum delta step to allow for each tree's weight estimation. Specify a value of `0` or higher. Default is `0`.
`sampleSize`	Double	The subsample ratio of the training instances. Specify a value between `0.1` and `1.0`. Default is `1.0`.
`eta`	Double	The step size shrinkage used during the update step to prevent overfitting. Specify a value between `0` and `1`. Default is `0.3`.
`gamma`	Double	The minimum loss reduction required to make a further partition on a leaf node of the tree. Specify any number `0` or greater. Default is `6`.
`colsSampleRatio`	Double	The subsample ratio of columns when constructing each tree. Specify a value between `0.01` and `1`. Default is `1`.
`colsSampleLevel`	Double	The subsample ratio of columns for each split, in each level. Specify a value between `0.01` and `1`. Default is `1`.
`normalizeType`	string	If the `dart` booster type is used, this dart parameter and the following three dart parameters are available. This parameter sets the normalization algorithm. Specify `tree` or `forest`. Default is `tree`.
`sampleType`	string	The sampling algorithm type. Specify `uniform` or `weighted`. Default is `uniform`.
`rateDrop`	Double	The dropout rate dart booster parameter. Specify a value between `0.0` and `1.0`. Default is `0.0`.
`skipDrop`	Double	The dart booster parameter for the probability of skip dropout. Specify a value between `0.0` and `1.0`. Default is `0.0`.
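
The ranges and options in Table 1 can be checked before settings are applied to a node. The following is a minimal sketch of such a check. The `validate_settings` helper and the two lookup tables are hypothetical and not part of SPSS Modeler. Only a subset of the properties is covered, and the bounds are assumed to be inclusive, which the table does not state.

```python
# Hypothetical pre-flight check of xgboostasnode settings against Table 1.
# Property names, ranges, and options are copied from the table; treating
# the bounds as inclusive is an assumption.

# (lower bound, upper bound); None means no upper bound is documented.
NUMERIC_RANGES = {
    "numBoostRound": (0, None),
    "lambda": (0, None),
    "alpha": (0, None),
    "lambdaBias": (0, None),
    "maxDepth": (2, None),
    "minChildWeight": (0, None),
    "maxDeltaStep": (0, None),
    "sampleSize": (0.1, 1.0),
    "eta": (0, 1),
    "gamma": (0, None),
    "colsSampleRatio": (0.01, 1),
    "colsSampleLevel": (0.01, 1),
    "rateDrop": (0.0, 1.0),
    "skipDrop": (0.0, 1.0),
}

ALLOWED_VALUES = {
    "boosterType": {"gbtree", "gblinear", "dart"},
    "treeMethod": {"auto", "exact", "approx"},
    "normalizeType": {"tree", "forest"},
    "sampleType": {"uniform", "weighted"},
}

def validate_settings(settings):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for name, value in settings.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            if value < low or (high is not None and value > high):
                problems.append("%s=%r is outside the documented range" % (name, value))
        elif name in ALLOWED_VALUES:
            if value not in ALLOWED_VALUES[name]:
                problems.append("%s=%r is not a documented option" % (name, value))
        # Properties not listed in either table are passed through unchecked.
    return problems

settings = {
    "boosterType": "dart",
    "numBoostRound": 50,
    "eta": 0.1,
    "maxDepth": 1,      # below the documented minimum of 2
    "rateDrop": 0.2,
}
problems = validate_settings(settings)
```

Here `problems` holds a single entry for `maxDepth`. In a Modeler script, settings that pass such a check would then typically be applied one at a time through the node's property setter.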
