Encode stage in DataStage

Encode Stage

The Encode stage encodes a data set using a UNIX encoding command, such as gzip, that you supply.

The Encode stage is a processing stage. The stage converts a data set from a sequence of records into a stream of raw binary data. The companion Decode stage reconverts the data stream to a data set (see Decode stage).
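The round trip described above can be illustrated outside DataStage with a minimal Python sketch. It is not how the Encode stage is implemented: the helper names (`run_filter`, `encode_records`, `decode_records`) and the comma-separated record layout are assumptions made for illustration, and the sketch assumes a `gzip` executable is available on the PATH.

```python
import subprocess

def run_filter(command, data):
    """Pipe bytes through a UNIX command and return its standard output."""
    result = subprocess.run(command, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout

def encode_records(records, command=("gzip", "-c")):
    """Flatten records to bytes, then pass them through the encoding command."""
    # Hypothetical record layout: one comma-separated line per record.
    raw = "\n".join(",".join(fields) for fields in records).encode("utf-8")
    return run_filter(list(command), raw)

def decode_records(blob, command=("gzip", "-d", "-c")):
    """Reverse the encoding command, then rebuild the records."""
    raw = run_filter(list(command), blob).decode("utf-8")
    return [line.split(",") for line in raw.split("\n")]

records = [["1", "alice"], ["2", "bob"]]
encoded = encode_records(records)   # opaque binary stream, no column structure
decoded = decode_records(encoded)   # records recovered by the matching decode step
```

Between the two calls, `encoded` is just a byte string with no visible rows or columns, which is why an encoded data set cannot feed column-based processing until it has been decoded.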

An encoded data set is similar to an ordinary one and can be written to a data set stage. You cannot use an encoded data set as input to stages that perform column-based processing or reorder rows, but you can input it to stages such as Copy. You can view information about the data set in the data set viewer, but not the data itself. You cannot repartition an encoded data set, and you are warned at run time if your job attempts to do so.

When you double-click the Encode stage, the properties panel opens. The properties panel has three tabs:

  • Stage. This is always present and is used to specify general information about the stage.
  • Input. This is where you specify details about the data set being encoded.
  • Output. This is where you specify details about the encoded data being output from the stage.