0 / 0
Passing multiple rows from an XML or JSON file to the Hierarchical stage in DataStage

Passing multiple rows from an XML or JSON file to the Hierarchical stage

You can pass multiple rows from an XML or JSON file to the Hierarchical stage as a single record. This method is needed when the XML or JSON file is located in a file storage connector.

When you create a DataStage® flow to parse an XML or JSON file that is located in file storage connector (such as IBM Cloud® Object Storage, Amazon S3, or Google Cloud Storage) to the Hierarchical Data stage, you must pass an entire XML or JSON string as one record. Normally an XML or JSON file contains multiple rows.

To parse the XML or JSON multiple rows as one record, you can use one of two methods:

Specifying file format properties for the XML or JSON file

Set the file format of the connector to delimited and set the values for field delimiter and row delimiter so that the connector reads the file format as one row and one column.

Different connectors have different settings, so be sure to refer to the connector documentation for the right values.

The following example shows the connector settings for passing a DOS-format XML file that is located in Amazon S3:
Table 1.
Setting Value
File format Delimited
Record definition None
Field delimiter ~
Note: The field delimiter can be ~ or any other character that does not already exist in the XML or JSON file, which sets the whole row as on field delimiter.
Row delimiter <CRLF>
All other settings Leave as default.

Using the Transformer stage to merge all the rows into one row

  1. Open the connector Stage tab, then open the Advanced section. Next, set the execution mode to "Sequential" and click Save.
  2. Configure the Transformer stage to merge all the rows into one row.

    In the following example, you set up a DataStage flow to have an Amazon S3 connector for a source, which links to a Transformer stage, which links to a Hierarchical stage, which links to a sequential file as a target.

    On the Transformer stage, click the Output tab, open the Columns section, then click Edit. Set the column list to contain only one column.

    1. Specify the following settings:
      Table 2.
      Setting Value
      File format Delimited
      Record definition None
      Field delimiter ~
      Row delimiter <CR>
    2. Click the Output tab, open the Columns section, then click Edit. Then, set the column list to contain only one column.

      You might set the column name to something like COLUMN_1 and the type to VARCHAR. Set the length value to be larger than the maximum length of the rows in the XML file.

    3. Click Apply and return.
    4. Open the Transformer stage and then click Add stage variable +. Name the new variable "combined."
    5. Click the calculator icon to open the expression builder. Next, set the derivation to if combined="" then Link_1.COLUMN_1 else combined:Link_1.COLUMN_1, where Link_1 is the name of the link that connects the connector and the Transformer stage. Then, click Apply and return.
    6. Open the Transformer Stage tab, then open the Advanced section. Next, set the execution mode to "Sequential" and click Save.
    7. Open the Output tab, then add a constraint setting of LastRow().
    8. In the Output section set the derivation to combined, the stage variable name that you created earlier. Change the column name to outputString and set the length to a large number greater than the whole file string length.
    9. Click Save and return.

Generative AI search and answer
These answers are generated by a large language model in watsonx.ai based on content from the product documentation. Learn more