Last updated: Feb 11, 2025
The Auto Cluster node estimates and compares clustering
models, which identify groups of records that have similar characteristics. The node works in the
same manner as other automated modeling nodes, allowing you to experiment with multiple combinations
of options in a single modeling pass. Models can be compared using basic measures that filter and
rank the cluster models by usefulness, including a measure based on the importance of particular
fields.
Example
node = stream.create("autocluster", "My node")
node.setPropertyValue("ranking_measure", "Silhouette")
node.setPropertyValue("ranking_dataset", "Training")
node.setPropertyValue("enable_silhouette_limit", True)
node.setPropertyValue("silhouette_limit", 5)
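
The same settings can also be applied from a list of name/value pairs, which keeps the configuration in one place. The following is a minimal sketch, not part of the product API: `apply_properties` and `FakeNode` are hypothetical helpers introduced here for illustration, and `FakeNode` only exists so that the sketch runs outside SPSS Modeler. Inside a Modeler script you would pass the node returned by `stream.create` instead.

```python
class FakeNode(object):
    """Hypothetical stand-in for a Modeler node; records property values."""

    def __init__(self):
        self.props = {}

    def setPropertyValue(self, name, value):
        self.props[name] = value


def apply_properties(node, settings):
    """Call setPropertyValue once per (name, value) pair, in list order."""
    for name, value in settings:
        node.setPropertyValue(name, value)
    return node


# Property names and values taken from the example above.
AUTOCLUSTER_SETTINGS = [
    ("ranking_measure", "Silhouette"),
    ("ranking_dataset", "Training"),
    ("enable_silhouette_limit", True),
    ("silhouette_limit", 5),
]

# Inside Modeler (assumption): node = stream.create("autocluster", "My node")
node = apply_properties(FakeNode(), AUTOCLUSTER_SETTINGS)
```

A list of pairs is used rather than a dictionary so that the enabling flag is always set before the limit it controls.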
| Properties | Values | Property description |
|---|---|---|
| | field | Note: Auto Cluster node only. Identifies the field for which an importance value will be calculated. Alternatively, can be used to identify how well the cluster differentiates the value of this field and, therefore, how well the model will predict this field. |
| | | |
| | | |
| | integer | Number of models to list in the report. Specify an integer between 1 and 100. |
| | flag | |
| | integer | Integer between 0 and 100. |
| | flag | |
| | number | Real number between 0.0 and 1.0. |
| | flag | |
| | number | Integer greater than 0. |
| | flag | |
| | | |
| | number | |
| | integer | Integer greater than 0. |
| | flag | |
| | | |
| | number | |
| | integer | |
| | flag | |
| | number | |
| | flag | |
| | | |
| | number | Integer between 0 and 100. |
| | number | Integer between 0 and 100. |
| | flag | Enables or disables the use of a specific algorithm. |
| | string | Sets a property value for a specific algorithm. See Setting algorithm properties for more information. |
| | integer | |
| | boolean | (K-Means, Kohonen, TwoStep, SVM, KNN, Bayes Net and Decision List models only.) Sets a maximum time limit for any one model. For example, if a particular model requires an unexpectedly long time to train because of some complex interaction, you probably don't want it to hold up your entire modeling run. |
| | integer | Time spent on model build. |
| | boolean | (Neural Network, K-Means, Kohonen, TwoStep, SVM, KNN, Bayes Net and C&R Tree models only.) Stops a run after a specified number of hours. All models generated up to that point will be included in the model nugget, but no further models will be produced. |
| | double | Run time limit (hours). |
| | boolean | Stops a run when a model passes all criteria specified under the Discard settings. |
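
Several of the values above are constrained to a range, so a script may want to check a value before setting it. The following is a minimal sketch under stated assumptions: `check_int_range` and `set_silhouette_limit` are hypothetical helpers, not part of the scripting API, and the sketch assumes that the "Integer between 0 and 100" constraint is the one that applies to the `silhouette_limit` property used in the example.

```python
def check_int_range(name, value, low, high):
    """Return value if it is an integer in [low, high]; raise ValueError otherwise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("%s must be an integer, got %r" % (name, value))
    if value < low or value > high:
        raise ValueError(
            "%s must be between %d and %d, got %d" % (name, low, high, value)
        )
    return value


def set_silhouette_limit(node, value):
    """Validate the limit, then enable it and set it on the node.

    Assumption: the valid range for silhouette_limit is 0 to 100.
    """
    checked = check_int_range("silhouette_limit", value, 0, 100)
    node.setPropertyValue("enable_silhouette_limit", True)
    node.setPropertyValue("silhouette_limit", checked)
    return checked
```

Validating before the call means an out-of-range value fails in the script with a clear message instead of being passed on to the node.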