binningnode properties

The Binning node automatically creates new nominal (set) fields based on the values of one or more existing continuous (numeric range) fields. For example, you can transform a continuous income field into a new categorical field containing groups of income as deviations from the mean. Once you have created bins for the new field, you can generate a Derive node based on the cut points.

Example

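# Assumes stream refers to the current stream, for example: stream = modeler.script.stream()
node = stream.create("binning", "My node")
# Continuous fields to bin
node.setPropertyValue("fields", ["Na", "K"])
node.setPropertyValue("method", "Rank")
# Settings for the FixedWidth method
node.setPropertyValue("fixed_width_name_extension", "_binned")
node.setPropertyValue("fixed_width_add_as", "Suffix")
node.setPropertyValue("fixed_bin_method", "Count")
node.setPropertyValue("fixed_bin_count", 10)
node.setPropertyValue("fixed_bin_width", 3.5)
# Setting for the EqualCount method: generate decile bins
node.setPropertyValue("tile10", True)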
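The same setPropertyValue pattern applies to the other binning methods. As a minimal sketch, the snippet below collects equal-count settings in a list and applies them in a loop. StubNode and apply_properties are hypothetical names introduced here for illustration, not part of the scripting API; StubNode only exists so the snippet runs outside the product. In a stream script you would pass the node returned by stream.create("binning", ...) instead. The extension value "_quartile" is an arbitrary example.

```python
class StubNode(object):
    """Hypothetical stand-in for a Binning node; it only records property values."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value

    def getPropertyValue(self, name):
        return self.properties[name]


def apply_properties(node, settings):
    """Apply (property name, value) pairs to a node, in the order given."""
    for name, value in settings:
        node.setPropertyValue(name, value)
    return node


# Property names and allowed values are taken from Table 1.
EQUAL_COUNT_SETTINGS = [
    ("fields", ["Na", "K"]),
    ("method", "EqualCount"),
    ("equal_count_name_extension", "_quartile"),  # arbitrary example value
    ("equal_count_add_as", "Suffix"),
    ("tile4", True),
    ("equal_count_method", "RecordCount"),
    ("tied_values_method", "Next"),
]

node = apply_properties(StubNode(), EQUAL_COUNT_SETTINGS)
```

Keeping the settings in one list makes it easier to reuse the same configuration for several nodes or to review it at a glance.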
Table 1. binningnode properties
binningnode properties Data type Property description
fields [field1 field2 ... fieldn] Continuous (numeric range) fields pending transformation. You can bin multiple fields simultaneously.
method FixedWidth EqualCount Rank SDev Optimal Method used for determining cut points for new field bins (categories).
recalculate_bins Always IfNecessary Specifies whether the bins are recalculated and the data reassigned to the relevant bin every time the node is executed (Always), or whether data is added only to existing bins and any new bins that have been added (IfNecessary).
fixed_width_name_extension string The default extension is _BIN.
fixed_width_add_as Suffix Prefix Specifies whether the extension is added to the end (suffix) of the field name or to the start (prefix). For example, with the default extension added as a suffix, a field named income produces income_BIN.
fixed_bin_method Width Count  
fixed_bin_count integer Specifies an integer used to determine the number of fixed-width bins (categories) for the new field(s).
fixed_bin_width real Value (integer or real) for calculating width of the bin.
equal_count_name_extension string The default extension is _TILE.
equal_count_add_as Suffix Prefix Specifies an extension, either suffix or prefix, used for the field name generated by using standard p-tiles. The default extension is _TILE plus N, where N is the tile number.
tile4 flag Generates four quantile bins, each containing 25% of cases.
tile5 flag Generates five quintile bins.
tile10 flag Generates 10 decile bins.
tile20 flag Generates 20 vingtile bins.
tile100 flag Generates 100 percentile bins.
use_custom_tile flag  
custom_tile_name_extension string The default extension is _TILEN.
custom_tile_add_as Suffix Prefix  
custom_tile integer  
equal_count_method RecordCount ValueSum The RecordCount method seeks to assign an equal number of records to each bin, while ValueSum assigns records so that the sum of the values in each bin is equal.
tied_values_method Next Current Random Specifies which bin receives tied values.
rank_order Ascending Descending Specifies the rank order: Ascending (the lowest value is marked 1) or Descending (the highest value is marked 1).
rank_add_as Suffix Prefix This option applies to rank, fractional rank, and percentage rank.
rank flag  
rank_name_extension string The default extension is _RANK.
rank_fractional flag Ranks cases where the value of the new field equals rank divided by the sum of the weights of the nonmissing cases. Fractional ranks fall in the range of 0–1.
rank_fractional_name_extension string The default extension is _F_RANK.
rank_pct flag Each rank is divided by the number of records with valid values and multiplied by 100. Percentage fractional ranks fall in the range of 1–100.
rank_pct_name_extension string The default extension is _P_RANK.
sdev_name_extension string  
sdev_add_as Suffix Prefix  
sdev_count One Two Three  
optimal_name_extension string The default extension is _OPTIMAL.
optimal_add_as Suffix Prefix  
optimal_supervisor_field field Field chosen as the supervisory field to which the fields selected for binning are related.
optimal_merge_bins flag Specifies that any bins with small case counts will be added to a larger, neighboring bin.
optimal_small_bin_threshold integer  
optimal_pre_bin flag Indicates that the dataset is to be prebinned.
optimal_max_bins integer Specifies an upper limit to avoid creating an inordinately large number of bins.
optimal_lower_end_point Inclusive Exclusive  
optimal_first_bin Unbounded Bounded  
optimal_last_bin Unbounded Bounded
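
The descriptions of rank_order, rank_fractional, and rank_pct in Table 1 amount to simple arithmetic. The following sketch works through that arithmetic in plain Python under stated assumptions: every case has a weight of 1 (so the sum of weights equals the number of nonmissing cases), missing values are represented by None, and the input values are distinct. The function name rank_fields is hypothetical, and the sketch does not reproduce how the node itself handles ties.

```python
def rank_fields(values, order="Ascending"):
    """Return (rank, fractional rank, percentage rank) per value, or None if missing."""
    # Keep only nonmissing values, remembering their original positions.
    valid = [(value, index) for index, value in enumerate(values) if value is not None]
    # Ascending: lowest value is marked 1. Descending: highest value is marked 1.
    valid.sort(key=lambda pair: pair[0], reverse=(order == "Descending"))
    n = len(valid)
    result = [None] * len(values)
    for rank, (_, index) in enumerate(valid, 1):
        fractional = rank / float(n)   # rank divided by the sum of unit weights
        percentage = 100.0 * rank / n  # rank divided by valid count, times 100
        result[index] = (rank, fractional, percentage)
    return result


ascending = rank_fields([30, None, 10, 20, 40])
descending = rank_fields([30, None, 10, 20, 40], order="Descending")
```

Here the value 10 gets (1, 0.25, 25.0) in ascending order, while in descending order the value 40 is marked 1. The missing entry stays None and is excluded from the count of valid records.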