Last updated: Jan 18, 2024
The Partition node generates a partition field, which splits the data into separate subsets for the training, testing, and validation stages of model building.
Example
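# Get the current stream (the modeler module is predefined in Modeler scripts)
stream = modeler.script.stream()
# Create a Partition node
node = stream.create("partition", "My node")
# Create training, testing, and validation partitions of 33% each
node.setPropertyValue("create_validation", True)
node.setPropertyValue("training_size", 33)
node.setPropertyValue("testing_size", 33)
node.setPropertyValue("validation_size", 33)
# Use a fixed, user-specified random seed
node.setPropertyValue("set_random_seed", True)
node.setPropertyValue("random_seed", 123)
# Represent each partition by its system integer
node.setPropertyValue("value_mode", "System")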
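Because the size properties are percentages, a script may want to check them before applying them to the node. The following is a minimal sketch rather than part of the product API: `apply_partition_sizes` is a hypothetical helper, and the rule that the sizes should total no more than 100 is an assumption based on the partitions being disjoint subsets of the same records. Only `setPropertyValue` and the property names come from this page. `FakeNode` is a stand-in so that the sketch also runs outside a Modeler stream.

```python
class FakeNode(object):
    """Hypothetical stand-in for a Partition node; it only records values."""

    def __init__(self):
        self.props = {}

    def setPropertyValue(self, name, value):
        self.props[name] = value


def apply_partition_sizes(node, training, testing, validation=None):
    """Hypothetical helper: check the percentages, then set them on the node."""
    sizes = [training, testing]
    if validation is not None:
        sizes.append(validation)
    for size in sizes:
        if not 0 <= size <= 100:
            raise ValueError("each size must be between 0 and 100")
    # Assumption: partitions are disjoint, so the total cannot exceed 100.
    if sum(sizes) > 100:
        raise ValueError("sizes must not total more than 100")
    # A validation partition is created only when a size is given for it.
    node.setPropertyValue("create_validation", validation is not None)
    node.setPropertyValue("training_size", training)
    node.setPropertyValue("testing_size", testing)
    if validation is not None:
        node.setPropertyValue("validation_size", validation)


node = FakeNode()
apply_partition_sizes(node, 33, 33, 33)
```

In a Modeler script, the node returned by `stream.create("partition", ...)` would be passed in place of `FakeNode()`.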
| `partitionnode` properties | Data type | Property description |
|---|---|---|
| `new_name` | string | Name of the partition field generated by the node. |
| `create_validation` | flag | Specifies whether a validation partition should be created. |
| `training_size` | integer | Percentage of records (0–100) to be allocated to the training partition. |
| `testing_size` | integer | Percentage of records (0–100) to be allocated to the testing partition. |
| `validation_size` | integer | Percentage of records (0–100) to be allocated to the validation partition. Ignored if a validation partition is not created. |
| `training_label` | string | Label for the training partition. |
| `testing_label` | string | Label for the testing partition. |
| `validation_label` | string | Label for the validation partition. Ignored if a validation partition is not created. |
| `value_mode` | `System`, `SystemAndLabel`, `Label` | Specifies the values used to represent each partition in the data. For example, the training sample can be represented by the system integer `1`, the label `Training`, or a combination of the two, `1_Training`. |
| `set_random_seed` | Boolean | Specifies whether a user-specified random seed should be used. |
| `random_seed` | integer | A user-specified random seed value. For this value to be used, `set_random_seed` must be set to `True`. |
| `enable_sql_generation` | Boolean | Specifies whether to use SQL pushback to assign records to partitions. |
| `unique_field` | field | Specifies the input field used to ensure that records are assigned to partitions in a random but repeatable way. For this value to be used, `enable_sql_generation` must be set to `True`. |
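To illustrate how a unique field can make assignment "random but repeatable", and how `value_mode` changes the value written to the partition field, here is a small standalone sketch. It is not the algorithm used by the node: hashing the key with MD5 is a technique substituted purely for illustration, `partition_value` is a hypothetical function, and the integers 2 and 3 for the testing and validation partitions are assumptions (this page documents only `1` for training). For simplicity, any record not assigned to training or testing falls into the validation partition.

```python
import hashlib


def partition_value(key, training_size, testing_size, mode="System",
                    labels=("Training", "Testing", "Validation")):
    """Illustrative only: map a unique key to a partition value."""
    # Hash the key to a stable bucket in 0..99. The same key always lands
    # in the same bucket, which is what makes the assignment repeatable.
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < training_size:
        index = 0
    elif bucket < training_size + testing_size:
        index = 1
    else:
        index = 2  # remainder goes to validation in this sketch
    number = index + 1  # assumed numbering: 1, 2, 3
    if mode == "System":
        return number
    if mode == "Label":
        return labels[index]
    if mode == "SystemAndLabel":
        return "%d_%s" % (number, labels[index])
    raise ValueError("unknown value_mode: %r" % mode)


# With training_size=100 every key falls in the training partition.
value = partition_value("customer-42", 100, 0, mode="SystemAndLabel")
# value == "1_Training"
```

Calling the function twice with the same key and sizes returns the same partition, whereas a seeded random draw depends on the order in which records arrive.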