kmeansasnode properties
Last updated: Feb 12, 2025
K-means is one of the most commonly used clustering
algorithms. It clusters data points into a predefined number of clusters. The K-Means-AS node in
SPSS Modeler is implemented in Spark. For more information about k-means algorithms, see Clustering.¹
Note: The K-Means-AS node performs one-hot encoding automatically for categorical variables.
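
To illustrate what one-hot encoding means for a categorical field, the following is a minimal plain-Python sketch. It is not the node's implementation: the helper name `one_hot_encode`, the `field=category` column naming, and the alphabetical category order are all assumptions made for this example.

```python
def one_hot_encode(rows, field):
    """Replace one categorical field with one 0/1 column per category.

    Hypothetical helper for illustration only; the column naming and
    category ordering used by the K-Means-AS node are not documented here.
    """
    # Collect the distinct categories seen in the data.
    categories = sorted({row[field] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one.
        new_row = {key: value for key, value in row.items() if key != field}
        # Add one indicator column per category.
        for category in categories:
            new_row[f"{field}={category}"] = 1.0 if row[field] == category else 0.0
        encoded.append(new_row)
    return encoded


rows = [
    {"color": "red", "size": 1.0},
    {"color": "blue", "size": 2.0},
    {"color": "red", "size": 3.0},
]
encoded = one_hot_encode(rows, "color")
```

After encoding, every row contains only numeric fields, which is what a distance-based algorithm such as k-means needs.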
| Properties | Values | Property description |
|---|---|---|
| | string | Specify to use predefined roles, or to use custom field assignments. Default is . |
| | Boolean | Specify to use the default name ( ) for the new generated scoring field, or to use a custom name. Default is . |
| | field | List of the field names for input when the property is set to . |
| | string | The name of the new generated scoring field when the property is set to . |
| | integer | The number of clusters to create. Default is . |
| | string | The initialization algorithm. Possible values are or . Default is . |
| | integer | The number of initialization steps when is set to . Default is . |
| | Boolean | Specify to make the following four properties available. Default is . |
| | integer | Maximum number of iterations for clustering. Default is . |
| | string | The tolerance to stop the iterations. Possible settings are , , ..., . Default is . |
| | Boolean | Specify to use a custom random seed. Default is . |
| | integer | The custom random seed when the property is . |
| | Boolean | Select this option if you want a graph to be included in the output. |
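
The settings in the table (number of clusters, initialization, maximum iterations, tolerance, and random seed) correspond to the usual controls of a k-means run. The following plain-Python sketch of Lloyd's algorithm shows how such settings typically interact. It is an illustration only, not the Spark implementation that the node uses: the function name `kmeans`, its parameter names, the default values, and the choice of random initialization are assumptions made for this example.

```python
import random


def nearest(point, centers):
    """Return the index of the center closest to the point (squared distance)."""
    return min(
        range(len(centers)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centers[c])),
    )


def kmeans(points, k, max_iter=20, tol=1e-4, seed=None):
    """Illustrative Lloyd's algorithm; all names and defaults are assumptions."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    # Random initialization: pick k distinct points as starting centers.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [nearest(p, centers) for p in points]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        # Largest distance that any center moved in this iteration.
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, new)) ** 0.5
            for old, new in zip(centers, new_centers)
        )
        centers = new_centers
        if shift <= tol:  # stop early once the centers have settled
            break
    # Final assignment against the final centers.
    labels = [nearest(p, centers) for p in points]
    return centers, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2, max_iter=20, tol=1e-4, seed=42)
```

A smaller tolerance or a larger iteration limit lets the centers settle more precisely at the cost of more passes over the data, and a fixed seed makes the random starting centers, and therefore the result, repeatable.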
¹ "Clustering - RDD-based API." Apache Spark. MLlib: Main Guide. Aug 2024.