Configuring the Excel stage as a target (DataStage)
You can configure the Excel stage to generate a Microsoft Excel file. Each input link of the Excel stage is associated with a separate Microsoft Excel sheet. The Microsoft Excel sheets are named associated with the input links that are as Sheet1, Sheet2, Sheet3, and so on. Each column of the input link is mapped to a Microsoft Excel column.
The Excel stage supports only the OOXML (.xlsx) format of Microsoft Excel files as the target file.
The Excel stage supports runtime column propagation. When runtime column propagation is enabled on an output link of an upstream stage, additional propagated columns are appended after columns that are defined in the Excel stage.
The Excel stage does not support generating .xls files or password-encrypted files.
- To configure the Excel stage as a target:
-
- On the parallel canvas, double-click the Excel stage and open the Properties section of the details pane.
- From the Write mode list, select Create a file.
- In the Output file section, specify the following settings:
- Optional: Select Create data asset to create a file as a data asset when the job run is completed. The new data asset appears in the list of data assets on the Assets tab of your project.
- Optional: Specify the Write method. If you select Generate multiple files, the Excel stage creates multiple Microsoft Excel files based on additional properties settings. If you select Specific file, the Excel stage creates a Microsoft Excel file with the name that is specified in the File name property.
- In the File name field, specify the full path name of the Microsoft Excel file to which you want to write the data. Use the path name /ds-storage/PXRuntime/Projects/<projectid>/<filename>.
- Optional: Specify the File update mode. If you select Create (Error if exists), your job execution fails if the target Microsoft Excel file already exists. If you select Overwrite, the Excel stage overwrites the existing file.
- In the Properties section, specify the following:
- Optional: Select Adjust column width. If you choose Yes, the Excel stage adjusts each column width in the generated Microsoft Excel sheet to fit the column contents.
- Click Save.
Writing data to multiple Excel files
You can use the Excel stage to write data to multiple Microsoft Excel files when you have a large amount of data.
The maximum number of records that is supported by the OOXML format of Microsoft Excel (.xlsx) is 1,048,576. When the input links have more than 1,048,576 records, you must divide the records into multiple Microsoft Excel files. Even if the links do not have more than 1,048,576 records, you might want to write them to multiple Microsoft Excel files because opening a large Microsoft Excel file requires a large amount of memory.
- To write data to multiple Excel files:
-
- On the parallel canvas, double-click the Excel stage.
- From the Write mode list, select Create a file.
- In the Output file section, specify the following:
- From the Write method list, select Generate multiple files. The Excel stage creates multiple Microsoft Excel files in the location that is specified in the File name field.
- In the File name field, specify the location where you want to create
multiple Microsoft Excel files.
You can specify the prefix of the file names following the file location. For example, if you want to create Microsoft Excel files at C:\tmp and use Sample as the prefix of the file names, specify C:\tmp\Sample in the File name field. The Excel stage appends a three-digit sequential number and file extension (.xlsx) to the prefix. If the number of files exceeds 999, the file name contains the required number of digits.
- Optional: Specify the File update mode. If you select Create (Error if exists), your job execution fails if the target Microsoft Excel file already exists. If you select Overwrite, which is the default setting, the Excel stage overwrites the existing file.
- In the Properties section, specify the following:
- Specify the Maximum number of rows in a sheet. The default is 65536. This number includes a column name row when you select Column names from the Column header list.
- Click OK.
Writing data to an existing Excel file
You can use Excel stage to design jobs that write data to an existing Microsoft Excel file or copy an Excel file and write data to the copy. One Microsoft Excel file can be updated by one Excel stage at a time. You can have only one Excel stage that updates the same Microsoft Excel file in one job.
When the Excel stage writes data to an existing Microsoft Excel file, the stage writes only the cell data. If a Microsoft Excel cell that the stage writes to has any format, then the stage keeps the existing format. If the Microsoft Excel file has a formula or a graph that refers to cells that are written to by the Excel stage, then the formula or the graph is recalculated when it is opened by Microsoft Excel.
You can write to Microsoft Excel columns from any DataStage® columns. The names and order of DataStage columns and Microsoft Excel columns do not have to match. If the Microsoft Excel sheet has a header in the first row, you can configure the Excel stage so that values in the first row are used to determine the column that records are written to. You can write up to 1,048,576 rows of a Microsoft Excel sheet. The source DataStage columns must be defined in the design time.
- To write data to an existing file:
-
- On the canvas, double-click the Excel stage.
- On the Input tab, select the input link from the Input name field.
- On the Stage tab, from the Write mode list, select Modify existing file
- Specify the target file details to write the data to.
- In the Output file section, select Create data asset to create an asset and add it to the Assets tab automatically.
- In the File name field, specify the name of the file to write data to.
- If you update more than 32 columns in the Microsoft Excel sheet, specify the number of the column in the Number of columns to load field.
- If you select First row is header in the Properties
section, enter a value in the Start writing from this row
field.
A value should be greater than or equal to 2 or greater than or equal to 1.
- Click Edit in the Map section, then click Import columns.
- Specify the Microsoft Excel details to import in the
Import page.
- From the Sheet list, select the Microsoft Excel sheet to update.
- From the list of columns, select the Microsoft Excel columns to update.
- Click Import.
- Map the imported Microsoft Excel columns to the DataStage columns that are defined in the input link. Select a column to write the data.
- Repeat the steps for the Properties section and mapping for all the input links.
- Click Save.