Sharing DataStage artifacts with all IBM Cloud Object Storage containers
IBM Cloud Object Storage is used to store IBM® DataStage® artifacts such as sequential files, data sets, and file sets. Set up IBM Cloud Object Storage to store these artifacts. After the IBM Cloud Object Storage container is set up, it can be accessed across different runtime containers and used by different stages in your data flows.
On the Cloud, DataStage jobs can run in different runtime containers. If DataStage artifacts such as sequential files, data sets, and file sets are written to the local disk of one of those containers, they are not accessible to jobs that run in other containers. For this reason, these artifacts are written to IBM Cloud Object Storage, which is accessible from any of the containers.
- Sequential Files (text/binary)
- Data sets (binary)
- File sets (text)
- Lookup file sets (text)
- Schema files (text)
- Range map files (binary)
DataStage/datasets
DataStage/files
DataStage/schema
Data sets, file sets, and lookup file sets
Data sets, file sets, and lookup file sets are created by IBM DataStage when you are working with a data flow. Each one is stored as a descriptor file, which records the names and locations of the data files that hold the actual data.
All of the descriptor files are written to the DataStage/datasets/ directory. All of the data files that belong to these data sets, file sets, or lookup file sets are stored in the DataStage/data/ directory. The names and paths of the descriptor files cannot be prefixed with cos://. The prefix is not supported.
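The descriptor rule above can be sketched as a small check. This is only an illustration, not DataStage code: the function name descriptor_key and the file name customers.ds are hypothetical, and the sketch assumes that a descriptor name is simply appended to the DataStage/datasets/ directory.

```python
DESCRIPTOR_DIR = "DataStage/datasets/"  # directory named in the text above
COS_PREFIX = "cos://"


def descriptor_key(name: str) -> str:
    """Return the assumed storage location of a descriptor file.

    Hypothetical helper: it rejects names that carry the cos:// prefix,
    because that prefix is not supported for descriptor files.
    """
    if name.startswith(COS_PREFIX):
        raise ValueError("descriptor file names cannot be prefixed with cos://")
    # Assumption: the name is relative to the descriptor directory.
    return DESCRIPTOR_DIR + name.lstrip("/")


print(descriptor_key("customers.ds"))
```

A name such as cos://customers.ds raises ValueError in this sketch instead of being resolved.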
Sequential Files
All of the sequential files that are created by using the Sequential File stage are stored in and read from the DataStage/files/ directory. For example, DataStage/files/sequential_file.txt. File sets and Lookup file sets are some of the files that are created by the Sequential File stage. If the path to the sequential file starts with “cos://”, then the file is created in the top-level directory in the Cloud Object Storage bucket.
Schema files
Schema files are read and written by IBM DataStage flows from the DataStage/schemas/ directory, unless the file path to the files starts with “cos://”. If the path starts with “cos://”, then the files will be in the top-level directory in the Cloud Object Storage bucket. For example, you would specify schemafile.txt to access that particular file under the directory DataStage/schemas/.
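The path rules for sequential files and schema files follow the same shape, which the following sketch tries to capture. It is a minimal illustration under stated assumptions, not the product's implementation: resolve_object_key is a hypothetical name, and the sketch assumes that whatever follows cos:// is an object name relative to the top level of the bucket.

```python
COS_PREFIX = "cos://"


def resolve_object_key(path: str, default_dir: str) -> str:
    """Return the assumed location of a file inside the bucket.

    Hypothetical helper. default_dir is the directory that applies when
    the path has no cos:// prefix, for example DataStage/files/ for
    sequential files or DataStage/schemas/ for schema files.
    """
    if path.startswith(COS_PREFIX):
        # Assumption: the remainder is relative to the top level of the bucket.
        return path[len(COS_PREFIX):]
    # No prefix: the path is taken relative to the default directory.
    return default_dir.rstrip("/") + "/" + path.lstrip("/")


print(resolve_object_key("sequential_file.txt", "DataStage/files/"))
print(resolve_object_key("schemafile.txt", "DataStage/schemas/"))
print(resolve_object_key("cos://schemafile.txt", "DataStage/schemas/"))
```

In this sketch, the first two calls resolve under the default directories, while the third resolves to schemafile.txt at the top level of the bucket.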
Schema files are created manually and are uploaded and read from stages. From the options section in the stage editor, you can specify the location of a schema file that you want to use in a stage.
- Row Generator
- Sequential File
- Fileset
- Column Import
- Column Export
- Transformer
File pattern
File patterns that start with a common prefix name are supported. All other file patterns are not supported.
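One possible reading of this rule is that a supported pattern is a literal prefix followed by a single trailing wildcard, such as sales_*. The sketch below illustrates that reading only. The wildcard syntax, the function names, and the sample file names are all assumptions and are not taken from the DataStage documentation.

```python
def is_prefix_pattern(pattern: str) -> bool:
    """Return True if the pattern is a literal prefix plus one trailing '*'.

    Assumption: '*' is the wildcard, and a pattern is supported only when
    the wildcard appears once, at the very end.
    """
    if len(pattern) < 2 or not pattern.endswith("*"):
        return False
    prefix = pattern[:-1]
    # Reject wildcard characters anywhere inside the prefix itself.
    return not any(ch in prefix for ch in "*?[")


def match_prefix_pattern(pattern: str, names: list[str]) -> list[str]:
    """Return the names that start with the pattern's common prefix."""
    if not is_prefix_pattern(pattern):
        raise ValueError(f"unsupported file pattern: {pattern!r}")
    prefix = pattern[:-1]
    return [name for name in names if name.startswith(prefix)]


# Hypothetical file names, used only for illustration.
names = ["sales_2023.txt", "sales_2024.txt", "returns_2024.txt"]
print(match_prefix_pattern("sales_*", names))
```

Under this reading, a pattern such as *_2024.txt has no common prefix and is rejected.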