TwoStep Cluster is an exploratory procedure designed to reveal natural groupings (or clusters)
within a data set that would otherwise not be apparent. The algorithm behind the procedure has
several desirable features that differentiate it from traditional clustering techniques.
Handling of categorical and continuous variables. Because
the variables are assumed to be independent, a joint multinomial-normal distribution can be placed on
the categorical and continuous variables.
Automatic selection of number of clusters. By
comparing the values of a model-choice criterion across different clustering solutions, the
procedure can automatically determine the optimal number of clusters.
Scalability. By constructing a cluster feature (CF)
tree that summarizes the records, the TwoStep algorithm can analyze large data files.
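The idea of summarizing records, rather than storing them, can be illustrated with a minimal sketch. It assumes a cluster feature holds only a record count, per-variable sums and sums of squares for continuous variables, and per-category counts for categorical variables. The class name `ClusterFeature` and its methods are hypothetical, and the tree structure itself is not shown, so this is not the procedure's actual implementation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ClusterFeature:
    """Additive summary of a group of records (hypothetical, simplified)."""
    n: int = 0
    sums: list = field(default_factory=list)     # one entry per continuous variable
    sq_sums: list = field(default_factory=list)  # one entry per continuous variable
    counts: list = field(default_factory=list)   # one Counter per categorical variable

    def add(self, continuous, categorical):
        """Fold a single record into the summary."""
        if self.n == 0:
            self.sums = [0.0] * len(continuous)
            self.sq_sums = [0.0] * len(continuous)
            self.counts = [Counter() for _ in categorical]
        self.n += 1
        for i, x in enumerate(continuous):
            self.sums[i] += x
            self.sq_sums[i] += x * x
        for j, level in enumerate(categorical):
            self.counts[j][level] += 1

    def merge(self, other):
        """Combine two summaries without revisiting the raw records."""
        if self.n == 0:
            return other
        if other.n == 0:
            return self
        return ClusterFeature(
            n=self.n + other.n,
            sums=[a + b for a, b in zip(self.sums, other.sums)],
            sq_sums=[a + b for a, b in zip(self.sq_sums, other.sq_sums)],
            counts=[a + b for a, b in zip(self.counts, other.counts)],
        )

    def means(self):
        return [s / self.n for s in self.sums]

    def variances(self):
        # Population variance recovered from the stored sums alone.
        return [q / self.n - (s / self.n) ** 2
                for s, q in zip(self.sums, self.sq_sums)]


# Hypothetical records: ((age, income in thousands), (gender,))
left, right = ClusterFeature(), ClusterFeature()
left.add((34.0, 52.0), ("F",))
left.add((36.0, 48.0), ("F",))
right.add((58.0, 91.0), ("M",))

merged = left.merge(right)
print(merged.n, merged.means(), merged.counts[0])
```

Because every field is a plain sum or count, the memory needed for a summary depends on the number of variables and categories, not on the number of records it represents. This is what makes a single pass over a large file feasible in this sketch.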
As an example of its use, retail and consumer product companies regularly apply clustering
techniques to information that describes their customers' buying habits, gender, age, income level,
and other attributes. These companies tailor their marketing and product development strategies to
each consumer group to increase sales and build brand loyalty.
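The automatic selection of the number of clusters described above can be sketched on a few customer-style records. The sketch assumes the model-choice criterion is a BIC-style score built from a log-likelihood in which continuous variables are treated as independent normals and categorical variables as independent multinomials. The data, the hand-written candidate partitions, and all function names are hypothetical. Constant terms that are identical across solutions are dropped, and the overall variance is added to each within-cluster variance as a simple guard against zero-variance clusters. The procedure's own computation may differ.

```python
import math
from collections import Counter

# Hypothetical toy customers: (age, income in thousands, gender).
customers = [
    (22, 30, "F"), (25, 32, "F"), (24, 28, "M"), (23, 31, "F"),
    (52, 80, "M"), (55, 85, "M"), (50, 78, "F"), (54, 82, "M"),
]

# Hypothetical candidate solutions: number of clusters -> cluster label per record.
solutions = {
    1: [0, 0, 0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 0, 1, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2, 2, 2],
}

N_CONTINUOUS = 2  # age and income; the remaining field is categorical


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def cluster_log_likelihood(rows, overall_vars):
    """Log-likelihood of one cluster, up to constants shared by all solutions."""
    n = len(rows)
    ll = 0.0
    # Continuous variables: independent normal terms.
    for k in range(N_CONTINUOUS):
        within = variance([r[k] for r in rows])
        ll -= 0.5 * n * math.log(overall_vars[k] + within)
    # Categorical variables: independent multinomial terms.
    for k in range(N_CONTINUOUS, len(rows[0])):
        for count in Counter(r[k] for r in rows).values():
            ll += count * math.log(count / n)
    return ll


def bic(data, labels):
    """BIC-style score for one clustering solution; lower is better."""
    overall_vars = [variance([r[k] for r in data]) for k in range(N_CONTINUOUS)]
    clusters = {}
    for row, label in zip(data, labels):
        clusters.setdefault(label, []).append(row)
    log_lik = sum(cluster_log_likelihood(rows, overall_vars)
                  for rows in clusters.values())
    # Per cluster: a mean and a variance per continuous variable,
    # plus (levels - 1) free probabilities per categorical variable.
    n_levels = [len({r[k] for r in data})
                for k in range(N_CONTINUOUS, len(data[0]))]
    params_per_cluster = 2 * N_CONTINUOUS + sum(l - 1 for l in n_levels)
    n_params = len(clusters) * params_per_cluster
    return -2.0 * log_lik + n_params * math.log(len(data))


scores = {k: bic(customers, labels) for k, labels in solutions.items()}
best_k = min(scores, key=scores.get)
print(scores)
print("selected number of clusters:", best_k)
```

The penalty term grows with the number of clusters, so a solution with more clusters is preferred only when it improves the fit by more than the extra parameters cost. In this toy data, splitting the younger, lower-income group further does not improve the fit enough to offset that penalty.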