Migrating connections in DataStage
Complete the following tasks after migrating jobs that contain connections.
- Update connection information
- Specify hostname and port number information
- Update job settings for missing properties
- Update credential information
- Enable flow connections
Update connection information
Scenario | Actions |
---|---|
The connection information or credentials are directly embedded inside the job. |
|
The connection is parameterized. |
|
The properties do not have one-to-one mappings between traditional DataStage and DataStage in IBM Cloud Pak® for Data. |
|
Specify hostname and port number information
In jobs in the traditional version of DataStage, data connection information might be specified in the form of a parameter reference.
When you import one of these jobs into the modern version of DataStage, the connection information is not recognized, and the import process produces warnings. To address the warnings, manually specify the information in the DataStage job settings. Then, recompile the job.
The following table shows the connections that might have this issue, the default values of the parameters, and what fields in traditional DataStage that the parameter information is derived from.
The port
is parameterized for all connections.
Connection name in traditional DataStage | Connection name in modern DataStage | Field that the original value was derived from in traditional DataStage | Parameter that receives the parameter reference | Default value | Notes |
---|---|---|---|---|---|
Sybase Enterprise | SAP ASE | Server | host | port = 5000 | |
Sybase OC | SAP ASE | Server | host | port = 5000 | |
Db2® | IBM Db2 for DataStage | Instance | advanced.host | advanced.port = 50000 | |
Db2 Enterprise | IBM Db2 for DataStage | Server | advanced.host | advanced.port = 50000 | |
Oracle | Oracle Database for DataStage | Server | oracle_service_name | oracle_db_port = 1522 | |
Oracle Enterprise | Oracle Database for DataStage | Server | oracle_service_name | oracle_db_port = 1522 | |
Hive | Apache Hive | Server | host |
|
|
Netezza® Enterprise | IBM Netezza Performance Server for DataStage | Database | database |
|
|
FTP Enterprise | FTP | URI | host |
|
The file_name property will also be set to a default in the associated flow stage. The
default value will be <file_name>. In addition to manually updating the connection properties and parameter value, you must also change this usage property within the stage, then compile the job. The default connection mode is restricted to SSH. |
Azure Storage Connector | Microsoft Azure File Storage | accountName, accountKey | N/A | https;AccountName=accountName; AccountKey=accountKey |
If either accountName or accountKey are parameter references, the connection string is replaced with a parameter reference for that field. |
Update job settings for missing properties
When you import certain jobs from the traditional version of DataStage into the modern version, some properties are not
specified in the original job and you might need to specify them manually. DataStage automatically parameterizes some properties. For
example, host
is parameterized to #ConnParamSet_[connection
type].host#. By default, parameter sets are created for imported connections unless the
connection uses an existing parameter set, in which case parameterized properties are appended to
the set.
Oracle, ODBC, and JDBC: A special case for generating parameter sets applies to Oracle, Db2, ODBC, and JDBC. When these connection details are parameterized, parameter sets are generated with a unique identifier tag appended to the name, in the following format: ConnParamSet_[connection_type].[unique identifier tag]. A unique parameter set is generated for each unique identifier tag. For example, when two Oracle connections are migrated and each has a different Host value, two parameter sets will be generated, ConnParamSet_Oracle.host1 and ConnParamSet_Oracle.host2. If both connections have the same Host value, they will share the parameter set ConnParamSet_Oracle.host. The following properties provide the unique identifier tag for their corresponding connections:
- Db2: Database
- Oracle : Host
- ODBC: Datasource type
- JDBC: Vendor
ODBC: A special case for the parameter naming is ODBCConnectorPx and PxODBC, which might insert dsn_type into the name similar to ConnParamSet_[connectorName]_[dsn_type].hostname. A different pattern applies to JDBC, which takes the form ConnParamSet_JDBC.[vendor]_[property]. If the string portion protocol is readable, it will be carried over, otherwise it will be cut from the string and take the form ConnParamSet_JDBC.[property]. This case applies if dsn_type or protocol are available. This scenario also applies to ODBC.
You can also set the --create-connection-paramsets
flag to
false
on the cpdctl dsjob migrate
command to parameterize
connection details as job parameters instead of a parameter set asset. For example,
host
is parameterized to #ConnProp_[connection type]_host#.
This option allows you to set parameter values for each individual job. For more information on the
CLI command, see DataStage command-line tools.
After being migrated, most connections share a parameter set with all connections of the same type in the migrated flow. For example, all Teradata connections in a migrated flow will use the parameter set ConnParamSet_Teradata.
- Apache Cassandra
- local_datacenter parameterized to #ConnParamSet_Cassandra.local_datacenter# will have a default value of "datacenter1"
- Apache Hive
- database parameterized to #ConnParamSet_Hive.database# will have a default value of "default"
- DRS as Oracle
- service_name parameterized to #ConnParamSet_Oracle.service_name# will have a default value of "orcl"
- Informix® CLI
- database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
- Informix Enterprise
- database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
- Informix Load
- database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
- Informix XPS Load
- database parameterized to #ConnParamSet_Informix.database# will have a default value of "default"
- Oracle
- service_name parameterized to #ConnParamSet_Oracle.service_name# will have a default value of "orcl"
- Oracle Enterprise
- service_name parameterized to #ConnParamSet_Oracle.service_name# will have a default value of "orcl"
If the missing property is port, the import process produces warnings. To address the warnings, manually specify the information in the DataStage job settings. Then, recompile the job.
The following table shows the connections that might have missing properties, the properties that cannot be retrieved, the default value that will be given, and the field that the original value was derived from in traditional DataStage.
Connection name in traditional DataStage | Connection name in modern DataStage | Missing properties that are automatically given a default parameter | Missing properties that are automatically given a default value | Notes |
---|---|---|---|---|
ODBC Connector | ODBC |
|
|
|
ODBC Enterprise Connector | ODBC Enterprise |
|
|
If dsn_type is not in the list of allowed values, the default is DB2. |
DRS Connector | ODBC, Oracle, IBM Db2 for DataStage (not available in the modern version of DataStage) | advanced.hostname | advanced.port = 0 |
|
Informix Load | Informix |
|
port = 1526 | |
Informix Enterprise (if remote server = true) | Informix |
|
port = 1526 | |
Informix Enterprise (If remote server = false) | Informix |
|
port = 1526 | |
Informix CLI | Informix |
|
port = 1526 | |
Informix XPS Load | Informix | host | port = 1526 | |
Db2 Z Bulk Load (PxDB2Z) | Db2 for z/OS® | host | port = 446 | In migrated jobs that contain this connection, DataStage automatically converts other properties to comparable ones in modern DataStage. After you migrate jobs, make sure to check all connection properties for their proper values and provide them where it is necessary. |
Update credential information
- Google BigQuery
- Google Cloud Storage
- Google Cloud Pub/Sub
- client_id
- client_secret
- access_ token
- refresh_token
Enable flow connections
If you have a job in traditional DataStage that you want to import that has a connection, you can have the connection properties included in the flow definition when you migrate the job to modern DataStage. To migrate the job this way, you must use the dsjob CLI command to migrate the ISX file that contains the jobs. You must also enable flow connections by adding the parameter --enable-local-connection to the command.