twostepAS properties
Last updated: Jan 17, 2024

TwoStep Cluster is an exploratory tool designed to reveal natural groupings (or clusters) within a data set that would otherwise not be apparent. The algorithm used by this procedure has several desirable features that differentiate it from traditional clustering techniques: it handles both categorical and continuous variables, selects the number of clusters automatically, and scales to large data sets.

Table 1. twostepAS properties
twostepAS Properties Values Property description
inputs [f1 ... fN] TwoStepAS models use a list of input fields, but no target. Weight and frequency fields are not recognized.
use_predefined_roles Boolean Default=True
use_custom_field_assignments Boolean Default=False
cluster_num_auto Boolean Default=True
min_num_clusters integer Default=2
max_num_clusters integer Default=15
num_clusters integer Default=5
clustering_criterion AIC, BIC
automatic_clustering_method use_clustering_criterion_setting, Distance_jump, Minimum, Maximum
feature_importance_method use_clustering_criterion_setting, effect_size
use_random_seed Boolean  
random_seed integer  
distance_measure Euclidean, Loglikelihood
include_outlier_clusters Boolean Default=True
num_cases_in_feature_tree_leaf_is_less_than integer Default=10
top_perc_outliers integer Default=5
initial_dist_change_threshold integer Default=0
leaf_node_maximum_branches integer Default=8
non_leaf_node_maximum_branches integer Default=8
max_tree_depth integer Default=3
adjustment_weight_on_measurement_level integer Default=6
memory_allocation_mb number Default=512
delayed_split Boolean Default=True
fields_not_to_standardize [f1 ... fN]  
adaptive_feature_selection Boolean Default=True
featureMisPercent integer Default=70
coefRange number Default=0.05
percCasesSingleCategory integer Default=95
numCases integer Default=24
include_model_specifications Boolean Default=True
include_record_summary Boolean Default=True
include_field_transformations Boolean Default=True
excluded_inputs Boolean Default=True
evaluate_model_quality Boolean Default=True
show_feature_importance_bar_chart Boolean Default=True
show_feature_importance_word_cloud Boolean Default=True
show_outlier_clusters_interactive_table_and_chart Boolean Default=True
show_outlier_clusters_pivot_table Boolean Default=True
across_cluster_feature_importance Boolean Default=True
across_cluster_profiles_pivot_table Boolean Default=True
withinprofiles Boolean Default=True
cluster_distances Boolean Default=True
cluster_label String, Number
label_prefix String  
evaluation_maxNum integer The maximum number of outliers to display in the output. If there are more than twenty outlier clusters, a pivot table will be displayed instead.
across_cluster_profiles_table_and_chart Boolean Table and charts of feature importance and cluster centers for each input (field) used in the cluster solution. Selecting different rows in the table displays a different chart. For categorical fields, a bar chart is displayed. For continuous fields, a chart of means and standard deviations is displayed.
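
The properties in Table 1 are set by name from a script. The following is a minimal sketch, not the product's own example: it merges a few overrides onto defaults copied from the table, rejects enumerated values that the table does not list, and then pushes each setting through setPropertyValue. The helper names build_settings and apply_settings and the StandInNode class are hypothetical and exist only so that the sketch runs outside Modeler. Inside a Modeler script, the node would presumably come from the stream instead, for example stream.create("twostepAS", "My node"), assuming the node type name matches this page's title.

```python
# Defaults copied from Table 1 (a subset only, chosen for illustration).
DEFAULTS = {
    "cluster_num_auto": True,
    "min_num_clusters": 2,
    "max_num_clusters": 15,
    "num_clusters": 5,
    "include_outlier_clusters": True,
    "top_perc_outliers": 5,
    "max_tree_depth": 3,
}

# Enumerated values copied from Table 1 (again a subset).
ALLOWED = {
    "clustering_criterion": ("AIC", "BIC"),
    "distance_measure": ("Euclidean", "Loglikelihood"),
    "cluster_label": ("String", "Number"),
}


def build_settings(overrides):
    """Merge overrides onto the documented defaults and check them."""
    settings = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in DEFAULTS and name not in ALLOWED:
            raise KeyError("not in this sketch's subset of Table 1: " + name)
        if name in ALLOWED and value not in ALLOWED[name]:
            raise ValueError("%s must be one of %s" % (name, ALLOWED[name]))
        settings[name] = value
    return settings


def apply_settings(node, settings):
    """Push each setting to the node through setPropertyValue."""
    for name, value in sorted(settings.items()):
        node.setPropertyValue(name, value)


class StandInNode(object):
    """Hypothetical stand-in for a Modeler node; records what was set."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


# Request a fixed number of clusters instead of automatic selection.
node = StandInNode()
apply_settings(node, build_settings({
    "cluster_num_auto": False,
    "num_clusters": 4,
    "distance_measure": "Euclidean",
}))
print(node.properties["num_clusters"])
```

Checking names and enumerated values before calling setPropertyValue is a design choice of this sketch; it surfaces a misspelled property or an unsupported value at the point where the settings are written rather than when the stream runs.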