K-Means is one of the most commonly used clustering
algorithms. It clusters data points into a predefined number of clusters. The K-Means-AS node in
SPSS Modeler is implemented in Spark.
For more information about k-means algorithms, see Clustering.1
Note: The K-Means-AS node performs one-hot encoding automatically for categorical
variables.
1"Clustering - RDD-based API." Apache Spark. MLlib: Main Guide. Aug
2024.
About cookies on this siteOur websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising.For more information, please review your cookie preferences options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.