K-means is one of the most commonly used clustering
algorithms. It clusters data points into a predefined number of clusters. The K-Means-AS node in
SPSS Modeler is implemented in Spark. For more information about k-means algorithms, see Clustering.¹
Note: The K-Means-AS node performs one-hot encoding automatically for categorical variables.
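To make the note concrete, the following is a minimal plain-Python sketch of what one-hot encoding does to a categorical field. It is an illustration only, not the node's Spark implementation; the `one_hot` helper and the sample values are hypothetical.

```python
def one_hot(values):
    """Map each categorical value to a 0/1 vector containing a single 1.

    Hypothetical helper for illustration; the K-Means-AS node encodes
    categorical fields internally, and its category ordering may differ.
    """
    categories = sorted(set(values))              # fixed, reproducible column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1                     # mark the column for this category
        encoded.append(row)
    return categories, encoded


categories, encoded = one_hot(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each category becomes its own numeric column, so a distance-based algorithm such as k-means can include categorical fields in its calculations.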
Table 1. kmeansasnode properties

| kmeansasnode Properties | Values | Property description |
|---|---|---|
| roleUse | string | Specify predefined to use predefined roles, or custom to use custom field assignments. Default is predefined. |
| autoModel | Boolean | Specify true to use the default name ($S-prediction) for the new generated scoring field, or false to use a custom name. Default is true. |
| features | field | List of the field names for input when the roleUse property is set to custom. |
| name | string | The name of the new generated scoring field when the autoModel property is set to false. |
| clustersNum | integer | The number of clusters to create. Default is 5. |
| initMode | string | The initialization algorithm. Possible values are k-means\|\| or random. Default is k-means\|\|. |
| initSteps | integer | The number of initialization steps when initMode is set to k-means\|\|. Default is 2. |
| advancedSettings | Boolean | Specify true to make the following four properties available. Default is false. |
| maxIteration | integer | Maximum number of iterations for clustering. Default is 20. |
| tolerance | string | The tolerance to stop the iterations. Possible settings are 1.0E-1, 1.0E-2, ..., 1.0E-6. Default is 1.0E-4. |
| setSeed | Boolean | Specify true to use a custom random seed. Default is false. |
| randomSeed | integer | The custom random seed when the setSeed property is true. |
| displayGraph | Boolean | Select this option if you want a graph to be included in the output. |
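As a rough illustration of how the clustersNum, maxIteration, tolerance, and randomSeed properties relate to the algorithm, here is a minimal plain-Python sketch of k-means with random initialization. It is not the node's Spark implementation: the `kmeans` function is hypothetical, only its parameter names and defaults are taken from the table, it does not cover the k-means|| initialization mode, and the stopping rule (stop once no center moves farther than the tolerance) is an assumption about how the tolerance is applied.

```python
import math
import random


def kmeans(points, clustersNum=5, maxIteration=20, tolerance=1.0e-4, randomSeed=None):
    """Plain Lloyd-style k-means with random initialization (illustration only)."""
    rng = random.Random(randomSeed)               # a fixed seed makes runs repeatable
    centers = [list(p) for p in rng.sample(points, clustersNum)]
    for _ in range(maxIteration):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for center, group in zip(centers, groups):
            if group:
                new_centers.append([sum(axis) / len(group) for axis in zip(*group)])
            else:
                new_centers.append(center)        # keep an empty cluster's center in place
        # Assumed stopping rule: stop once no center moves more than tolerance.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tolerance:
            break
    return centers


data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
first = kmeans(data, clustersNum=2, randomSeed=42)
second = kmeans(data, clustersNum=2, randomSeed=42)
print(first == second)  # True: the same seed gives the same centers
```

In this sketch, fixing the seed plays the role of setting setSeed to true and supplying randomSeed: repeated runs on the same data start from the same centers and therefore return the same result.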
¹ "Clustering - RDD-based API." Apache Spark. MLlib: Main Guide. Aug 2024.