Last updated: Feb 11, 2025
Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN)© uses
unsupervised learning to find clusters, or dense regions, of a data set. The HDBSCAN node in SPSS
Modeler exposes the core features and commonly used parameters of the HDBSCAN library. The node is
implemented in Python, and you can use it to cluster your data set into distinct groups when you
don't know in advance what those groups are.
hdbscannode properties | Data type | Property description |
---|---|---|
custom_fields | boolean | This option tells the node to use field information specified here instead of that given in any upstream Type node(s). After selecting this option, specify the following fields as required. |
inputs | field | Input fields for clustering. |
useHPO | boolean | Specify `true` or `false` to enable or disable Hyper-Parameter Optimization (HPO) based on Rbfopt, which automatically searches for the combination of parameters that lets the model reach the expected error rate, or a lower one, on the samples. Default is `false`. |
min_cluster_size | integer | The minimum size of clusters. Specify an integer. Default is `5`. |
min_samples | integer | The number of samples in a neighborhood for a point to be considered a core point. Specify an integer. If set to `0`, the `min_cluster_size` is used. Default is `0`. |
algorithm | string | Specify which algorithm to use: `best`, `generic`, `prims_kdtree`, `prims_balltree`, `boruvka_kdtree`, or `boruvka_balltree`. Default is `best`. |
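metric | string | Specify which metric to use when calculating distance between instances in a feature array: `euclidean`, `cityblock`, `L1`, `L2`, `manhattan`, `braycurtis`, `canberra`, `chebyshev`, `correlation`, `minkowski`, or `sqeuclidean`. Default is `euclidean`. |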
useStringLabel | boolean | Specify `true` to use a string cluster label, or `false` to use a number cluster label. Default is `false`. |
stringLabelPrefix | string | If the `useStringLabel` parameter is set to `true`, specify a value for the string label prefix. Default prefix is `cluster`. |
approx_min_span_tree | boolean | Specify `true` to accept an approximate minimum spanning tree, or `false` if you are willing to sacrifice speed for correctness. Default is `true`. |
cluster_selection_method | string | Specify the method to use for selecting clusters from the condensed tree: `eom` or `leaf`. Default is `eom` (Excess of Mass algorithm). |
allow_single_cluster | boolean | Specify `true` if you want to allow single cluster results. Default is `false`. |
p_value | double | Specify the `p value` to use if you're using `minkowski` for the metric. Default is `1.5`. |
leaf_size | integer | If using a space tree algorithm (`boruvka_kdtree` or `boruvka_balltree`), specify the number of points in a leaf node of the tree. Default is `40`. |
outputValidity | boolean | Specify `true` or `false` to control whether the Validity Index chart is included in the model output. |
outputCondensed | boolean | Specify `true` or `false` to control whether the Condensed Tree chart is included in the model output. |
outputSingleLinkage | boolean | Specify `true` or `false` to control whether the Single Linkage Tree chart is included in the model output. |
outputMinSpan | boolean | Specify `true` or `false` to control whether the Min Span Tree chart is included in the model output. |
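
Several of the properties above interact: the minimum-samples setting can fall back to the minimum cluster size, the p value applies only to one metric, and the leaf size applies only to the space tree algorithms. The sketch below restates those rules in plain Python. It is illustrative only and does not call SPSS Modeler or the HDBSCAN library. The function `resolve_settings`, the dictionary keys (`min_cluster_size`, `min_samples`, `metric`, `p_value`, `algorithm`, `leaf_size`), the value names `minkowski`, `boruvka_kdtree`, and `boruvka_balltree`, the fallback trigger of `0`, and the numbers in the usage example are all assumptions chosen for illustration.

```python
# Hypothetical helper that mirrors the interaction rules in the property table.
# It does not talk to SPSS Modeler or to the hdbscan library.

# Assumed names of the space tree algorithms that make use of a leaf size.
SPACE_TREE_ALGORITHMS = {"boruvka_kdtree", "boruvka_balltree"}


def resolve_settings(props):
    """Return a copy of `props` with only the settings that take effect."""
    settings = dict(props)

    # Assumption: a min_samples of 0 means "use min_cluster_size instead".
    if settings.get("min_samples", 0) == 0:
        settings["min_samples"] = settings["min_cluster_size"]

    # The p value is only meaningful for the Minkowski metric.
    if settings.get("metric") != "minkowski":
        settings.pop("p_value", None)

    # The leaf size is only meaningful for the space tree algorithms.
    if settings.get("algorithm") not in SPACE_TREE_ALGORITHMS:
        settings.pop("leaf_size", None)

    return settings


if __name__ == "__main__":
    # Example values only; they are not taken from the documentation.
    requested = {
        "min_cluster_size": 8,
        "min_samples": 0,
        "metric": "euclidean",
        "p_value": 2.0,
        "algorithm": "boruvka_kdtree",
        "leaf_size": 30,
    }
    for name, value in sorted(resolve_settings(requested).items()):
        print(name, "=", value)
```

With the example values, `min_samples` resolves to the cluster size, `p_value` is dropped because the metric is not Minkowski, and `leaf_size` is kept because a space tree algorithm is selected.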