Use this information to resolve questions about using Synthetic Data Generator.
Typeless columns ignored for an Import node
When you use an Import node that contains Typeless columns, these columns are ignored by the Mimic node. After you click Read Values, the Typeless columns are automatically set to Pass and are not present in the final dataset.
Suggested workaround:
Add a new column in the Generate node for the missing column(s).
Size limit notice
The Synthetic Data Generator environment can import up to ~2.5GB of data.
Suggested workaround:
If you receive a related error message or your data fails to import, please reduce the amount of data and try again.
Internal error occurred: SCAPI error: The value on row 1,029 is not a valid string
For example, previewing a data asset with an Import node gives the following error:
Node:
Import
WDP Connector Error: CDICO9999E: Internal error occurred: SCAPI error: The value on row 1,029 is not a valid string of the Bit data type for the SecurityDelay column.
This is expected behavior. For most flat files, Synthetic Data Generator reads the first 1,000 records to infer the data type of each column. In this particular case, the first 1,000 rows contained only binary values (0 or 1), so Synthetic Data Generator inferred the Bit data type. The value at row 1,029 was 3. When Synthetic Data Generator read that value, it raised an error because 3 is not a binary value.
Suggested workarounds:
- Users can adjust their Infer_record_count parameter to include more data, choosing 2000 rows instead (or more).
- Users can update the value in the first 1000 rows that is causing the error, if this is an error in the data.
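Before you change Infer_record_count, it can help to find where the first unexpected value sits in the file. The following is a minimal Python sketch, not part of Synthetic Data Generator: the function name first_non_bit_row and its infer_record_count argument are hypothetical and only mirror the behavior described above. It assumes a CSV file with a header row, and its row numbering (data rows counted from 1) might not match the numbering in the error message exactly.

```python
import csv
import io


def first_non_bit_row(source, column, infer_record_count=1000):
    """Return the 1-based data row number of the first value that is not
    0 or 1 and lies beyond the inference window, or None if there is none.

    `source` is any open text file or file-like object containing CSV data
    with a header row. The window size is an assumption that mirrors the
    documented default of 1,000 records.
    """
    reader = csv.DictReader(source)
    for row_number, row in enumerate(reader, start=1):
        if row_number <= infer_record_count:
            continue  # these rows are the ones used for type inference
        if (row[column] or "").strip() not in {"0", "1"}:
            return row_number
    return None


# Synthetic example: 1,028 binary rows followed by a 3 on data row 1,029.
lines = ["SecurityDelay"] + [str(i % 2) for i in range(1028)] + ["3"]
sample = io.StringIO("\n".join(lines))
print(first_non_bit_row(sample, "SecurityDelay"))  # prints 1029
```

If the function returns a row number, setting Infer_record_count to at least that value lets the inference step see the non-binary value.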
Error Mimic Data set no available input record.
The Mimic node requires the input dataset to have at least one valid record (a record without any missing values). If your dataset is empty, or if the dataset does not contain at least one valid record, clicking Run selection gives the following error message:
Node:
Mimic
Mimic Data set no available input record.
Suggested workarounds:
- Fix your dataset so that there is at least one record (row) that contains a value for every column. Click Read values from the Import node and run your flow again.
- In the Mimic node, set Replace missing values to On and run your flow again.
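To check in advance whether a dataset has at least one complete record, you can scan it outside the tool. This is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the input is a CSV file with a header row, and a value counts as missing only when it is empty or whitespace. The Mimic node might treat other markers as missing as well.

```python
import csv
import io


def is_present(value):
    # Assumption: only empty or whitespace-only cells count as missing.
    # Short rows yield None and over-long rows yield a list; both are
    # treated as "not present" so the row is reported as incomplete.
    return isinstance(value, str) and value.strip() != ""


def has_complete_record(source):
    """Return True if at least one row has a value for every column."""
    reader = csv.DictReader(source)
    for row in reader:
        if all(is_present(value) for value in row.values()):
            return True
    return False


# Every row is missing a value, so there is no valid record.
print(has_complete_record(io.StringIO("age,income\n34,\n,52000\n")))
# The second row is complete, so one valid record exists.
print(has_complete_record(io.StringIO("age,income\n34,\n41,52000\n")))
```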
Error: Valid variable does not exist in metadata
Migrating the Import node and then running the flow fails with the following error:
Error: Valid variable does not exist in metadata
Suggested workaround:
Make sure that in your Import node you have at least one field that is not Typeless. For example, in the screen capture, the only field in the Import node is Typeless. At least one field that is not Typeless should be added to the Import node to avoid this error.
Cannot export data to Synthetic Data Generator .sav file
You tried to use a Data Asset Export node to export data to a Synthetic Data Generator .sav file, but the file was not created. You also received this error message:
WDP Connector Error: CDICO9999E: Internal error occurred: IO error: Invalid variable name error: Invalid character found in field name 'AGE YOUN'. Field names can only include any letter, any digit or the symbols @, #, ., _, or $ for export.
Suggested workaround: Check whether any field names contain spaces. The .sav file format does not support spaces in field names.
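The character rule quoted in the error message can be checked before you export. The following Python sketch is an illustration only: the function names are hypothetical, and it encodes just the rule stated in the message (letters, digits, and the symbols @, #, ., _, and $). The .sav format might impose further naming rules that are not covered here.

```python
import re

# Matches any character that the error message does not list as allowed.
# In Python 3, \w covers Unicode letters, digits, and the underscore.
_INVALID_CHAR = re.compile(r"[^\w@#.$]")


def invalid_field_names(names):
    """Return the field names that contain a disallowed character."""
    return [name for name in names if _INVALID_CHAR.search(name)]


def sanitize_field_name(name, replacement="_"):
    """Replace each disallowed character (for example, a space)."""
    return _INVALID_CHAR.sub(replacement, name)


fields = ["AGE YOUN", "INCOME", "score.v2"]
print(invalid_field_names(fields))       # prints ['AGE YOUN']
print(sanitize_field_name("AGE YOUN"))   # prints AGE_YOUN
```

Replacing the space with an underscore is one possible choice. Renaming the field in the flow before the export node achieves the same result.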
Error Failed to compute quality metrics. Details: Real data contains more columns than synthetic data
This is expected behavior when the columns do not match. Some real data columns are recognized as Typeless and are therefore excluded from the synthetic data.
Suggested workaround:
Ensure that the real data and the synthetic data have the same number of columns by removing the Typeless column(s) from the real data in the Import node. Use the Node: Import information messages to identify which columns to remove.
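If you have both datasets available as files, comparing their column names is another way to see which columns were dropped. This is a minimal Python sketch with a hypothetical function name and made-up column names. It assumes that you have already read the header row of each dataset into a list.

```python
def columns_missing_from_synthetic(real_columns, synthetic_columns):
    """Return the real data columns that are absent from the synthetic
    data, in the order in which they appear in the real data."""
    synthetic = set(synthetic_columns)
    return [column for column in real_columns if column not in synthetic]


# Illustrative column names only.
real = ["id", "age", "notes", "income"]
synthetic = ["id", "age", "income"]
print(columns_missing_from_synthetic(real, synthetic))  # prints ['notes']
```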
Failed to compute quality metrics. Details: 'Error in evaluation: Not enough members with value: 0 for target column'
This is expected behavior because there is not enough data to compute evaluation metrics for this column.
Suggested workaround: Ensure that this column has enough data to compute evaluation metrics.
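To see how many records each target value has, you can count them before you run the evaluation. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, and the minimum count that the evaluation requires is not documented here, so min_count is a placeholder that you choose yourself.

```python
from collections import Counter


def value_counts_below(values, expected_values, min_count):
    """Return {value: count} for each expected target value that occurs
    fewer than min_count times. Values that never occur are reported
    with a count of 0."""
    counts = Counter(values)
    return {
        value: counts[value]
        for value in expected_values
        if counts[value] < min_count
    }


# Illustrative target column in which the value 0 occurs only once.
target = [1, 1, 1, 0, 1, 1]
print(value_counts_below(target, expected_values=[0, 1], min_count=2))
# prints {0: 1}
```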