Copy stage example

In this example you are going to copy data from a table containing billing information for GlobalCo's customers.

You are going to copy it to three separate data sets, and in each case you are only copying a subset of the columns. The Copy stage will drop the unwanted columns as it copies the data set.

The column names for the input data set are as follows:
  • BILL_TO_NUM
  • CUST_NAME
  • ADDR_1
  • ADDR_2
  • CITY
  • REGION_CODE
  • ZIP
  • ATTENT
  • COUNTRY_CODE
  • TEL_NUM
  • FIRST_SALES_DATE
  • LAST_SALES_DATE
  • REVIEW_MONTH
  • SETUP_DATE
  • STATUS_CODE
  • REMIT_TO_CODE
  • CUST_TYPE_CODE
  • CUST_VEND
  • MOD_DATE
  • MOD_USRNM
  • CURRENCY_CODE
  • CURRENCY_MOD_DATE
  • MAIL_INVC_FLAG
  • PYMNT_CODE
  • YTD_SALES_AMT
  • CNTRY_NAME
  • CAR_RTE
  • TPF_INVC_FLAG
  • INVC_CPY_CNT
  • INVC_PRT_FLAG
  • FAX_PHONE
  • FAX_FLAG
  • ANALYST_CODE
  • ERS_FLAG

Here is the job that will perform the copying:

Figure 1. Example job
Shows the Copy stage being used to selectively copy data to three different data sets

The Copy stage properties are simple. The only property is Force, and you do not need to set it here: because you are copying to multiple data sets, InfoSphere® DataStage® will not attempt to optimize the stage out of the job. Instead, concentrate on telling InfoSphere DataStage which columns to drop on each output link. The easiest way to do this is on the Mapping tab of the Output page. When you open this tab for a link, the left pane shows the input columns. Drag the columns that you want to keep across to the right pane. Repeat this for each link as follows:

Figure 2. Mapping tab: first output link
Shows data being mapped from the input data set to the output data set on the first output link
Figure 3. Mapping tab: second output link
Shows data being mapped from the input data set to the output data set on the second output link
Figure 4. Mapping tab: third output link
Shows data being mapped from the input data set to the output data set on the third output link
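
The effect of the three mappings can be pictured with a short Python sketch. This is only a conceptual model of what the Copy stage does, not InfoSphere DataStage code. The function name copy_with_mappings, the link names hypothetical_link_2 and hypothetical_link_3, their column choices, and the placeholder values are all assumptions made for illustration. The DSLink6 column list is inferred from the name and address sample data for that link.

```python
from typing import Dict, Iterable, List, Sequence

Row = Dict[str, str]

# Columns kept on DSLink6, inferred from the name and address sample data.
DSLINK6_COLUMNS = [
    "BILL_TO_NUM", "CUST_NAME", "ADDR_1", "ADDR_2", "CITY", "REGION_CODE", "ZIP",
]

# Hypothetical mappings: the second and third link names and their column
# choices are illustrative assumptions, not taken from the example job.
LINK_COLUMNS: Dict[str, Sequence[str]] = {
    "DSLink6": DSLINK6_COLUMNS,
    "hypothetical_link_2": ["BILL_TO_NUM", "TEL_NUM"],
    "hypothetical_link_3": ["BILL_TO_NUM", "CURRENCY_CODE"],
}


def copy_with_mappings(
    rows: Iterable[Row], link_columns: Dict[str, Sequence[str]]
) -> Dict[str, List[Row]]:
    """Send every row to every link, keeping only the columns mapped on that link."""
    outputs: Dict[str, List[Row]] = {link: [] for link in link_columns}
    for row in rows:
        for link, columns in link_columns.items():
            # Columns that are not mapped on this link are dropped.
            outputs[link].append({name: row[name] for name in columns})
    return outputs


# Two input rows. TEL_NUM and CURRENCY_CODE hold placeholder values only.
rows = [
    {"BILL_TO_NUM": "GC13849", "CUST_NAME": "JON SMITH",
     "ADDR_1": "789 LEDBURY ROAD", "ADDR_2": " ", "CITY": "TAMPA",
     "REGION_CODE": "FL", "ZIP": "12345",
     "TEL_NUM": "<placeholder>", "CURRENCY_CODE": "<placeholder>"},
    {"BILL_TO_NUM": "GC13933", "CUST_NAME": "MARY GARDENER",
     "ADDR_1": "127 BORDER ST", "ADDR_2": " ", "CITY": "NORTHPORT",
     "REGION_CODE": "AL", "ZIP": "23456",
     "TEL_NUM": "<placeholder>", "CURRENCY_CODE": "<placeholder>"},
]

outputs = copy_with_mappings(rows, LINK_COLUMNS)
for link, copied in outputs.items():
    print(link, len(copied), "rows,", "columns:", list(copied[0]))
```

Every link receives all of the rows, and only the column set differs from link to link, which matches the behavior described for the job.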

When the job is run, three copies of the original data set are produced, each containing a subset of the original columns but all of the rows. Here is some sample data from the data set on DSLink6, which gives name and address information:

"GC13849","JON SMITH","789 LEDBURY ROAD"," ","TAMPA","FL","12345"
"GC13933","MARY GARDENER","127 BORDER ST"," ","NORTHPORT","AL","23456"
"GC14036","CHRIS TRAIN","1400 NEW ST"," ","BRENHAM","TX","34567"
"GC14127","HUW WILLIAMS","579 DIGBETH AVENUE"," ","AURORA","CO","45678"
"GC14263","SARA PEARS","45 ALCESTER WAY"," ","SHERWOOD","AR","56789"
"GC14346","LUC TEACHER","3 BIRMINGHAM ROAD"," ","CHICAGO","IL","67890"
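
As a quick check on the DSLink6 output, the sample rows can be read as quoted, comma-separated values and labeled with column names. The sketch below uses Python's standard csv module. It assumes that the seven fields correspond, in order, to the first seven input columns (BILL_TO_NUM through ZIP); the sample data does not state this explicitly, so treat the labels as an inference.

```python
import csv
import io

# Assumed field order: the first seven columns of the input data set.
ASSUMED_COLUMNS = [
    "BILL_TO_NUM", "CUST_NAME", "ADDR_1", "ADDR_2", "CITY", "REGION_CODE", "ZIP",
]

# The first two sample rows from DSLink6, copied as they appear.
SAMPLE = '''"GC13849","JON SMITH","789 LEDBURY ROAD"," ","TAMPA","FL","12345"
"GC13933","MARY GARDENER","127 BORDER ST"," ","NORTHPORT","AL","23456"
'''

# Pair each parsed field with its assumed column name.
records = [
    dict(zip(ASSUMED_COLUMNS, fields))
    for fields in csv.reader(io.StringIO(SAMPLE))
]

for record in records:
    print(record["BILL_TO_NUM"], record["CUST_NAME"], record["CITY"], record["REGION_CODE"])
```

Each row yields exactly seven fields, so none of the other input columns are present on this link.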