0 / 0

Using stored procedures in DataStage

Last updated: Mar 12, 2025
Using stored procedures in DataStage

A stored procedure is a block of procedural constructs and embedded SQL statements. Use the Call procedure option under the Read method or Write method property of the Stage tab to call an existing stored procedure from a database. Use the Call procedure statement option to write your own procedure statement syntax in the text box.

Supported connectors

The following connectors support stored procedures. Stored procedures can be run on connectors with input links, output links, or both.

  • Amazon RDS for PostgreSQL
  • IBM Cloud® Databases for PostgreSQL
  • IBM Db2®
  • IBM® Db2 Big SQL
  • IBM Db2 on Cloud
  • IBM Db2 for i
  • IBM Db2 Warehouse
  • IBM Db2 for z/OS®
  • Google BigQuery
  • Microsoft Azure SQL Database
  • Microsoft SQL Server
  • Microsoft Azure Synapse Analytics
  • Oracle
  • PostgreSQL
  • SAP ASE
  • Snowflake
  • Teradata

Stored procedures in a source stage

Select Add procedure return value to schema to add the procedure return code and message to the schema as columns ProcCode and ProcMess. The first column on the output link holds the value of the code. The second column holds the value of the message. If you also select Forward row data, the input columns are added to the output link after the ProcCode and ProcMess columns.

Stored procedures in a target stage

Select Execute procedure for each row to execute the procedure for each row of the input data. If you do not select Execute procedure for each row, the procedure will only execute once, for the first row. By default, Execute procedure for each row is selected. This procedure is not supported by the Snowflake connector.

Syntax

Stored procedure calls use ? as a parameter marker. The following syntax calls a stored procedure with name <procedure_name> on an input parameter @<column_name> and output column ?, both in parentheses.
CALL <procedure_name>(@<column_name>, ?)
If the input parameter is a string, enclose the name in quotation marks.
? = CALL <procedure_name>('@<string_column_name>', ?)

In the Stage tab, under General source properties if you choose Read method > Call procedure statement and select Add procedure return value to schema, then two columns (ProcCode and ProcMess) are added to the output as the first columns in the output schema.

In Stage tab, under SQL > Call procedure statement:
  • For connectors (except Microsoft SQL Server and SAP ASE), which support stored procedures, always provide the following statement:
    CALL (@column_name, ?)
  • For Microsoft SQL Server and SAP ASE connectors, the syntax of ? = is valid and the procedure return code is captured in ? and sent to the output column. You always need to provide the following statement:
    ? = CALL (@column_name, ?)

If the stored procedure is called in a stage with only output links, then ? always indicates an output parameter. If the stored procedure is called in a stage with only input links, then ? always indicates an input parameter.


CALL <procedure_name>()
CALL <procedure_name>(?, ?)
CALL <procedure_name>(4,?)

A stored procedure for PostgreSQL connector can not return multiple rows into output parameters. To get multiple rows, use the table for user-defined functions instead.

Parameter values

In the Parameter values section of the Stage tab, use parameter specifications to map the input and output parameters of your stored procedure to input and output columns. This determines which columns data is returned in. Parameter names in the Stage tab should match parameter names in your stored procedure. Each parameter name can be mapped to a single column. If a mapping is not provided, the output data for each column is returned in the order the columns are specified. A connector with an input link will have input parameters, a connector with an output link will have output parameters, and a connector with both input and output links will have both.

When a connector has both input and output links, you can edit the Marker / Literal value for your input parameters on the Stage tab. You can specify an input parameter name as:
@
The default value for the parameter. The column value that is specified in the mapping is passed to the parameter.
Literal
A constant specified as an input parameter. The value of the constant is used, and the value from the input link is discarded.
Local parameter or a parameter from a parameter set
For a parameter from a parameter set, use (#name#) format. The value of the parameter is used, and the value from the input link is discarded.

Each parameter must be mapped to the input column. For example, P1 should be mapped to COLUMN_1, and P2 to COLUMN_2. Parameter mapping does not replace the column definition. However, you can discard the value from an input link and use the value in the Marker / Literal column.

Optional parameters

You can use optional parameters only for the stored procedures. User-defined functions do not support optional parameters. A stored procedure can have one or more optional parameters. If you want to omit optional parameters while calling a stored procedure, use Call procedure statement. If a stored procedure has one or more optional parameters, all optional parameters must be defined after output parameters. Optional parameters with the Null value must be grouped at the end.

The following connectors support the optional parameter in stored procedures:
  • IBM Db2
  • IBM Db2 Big SQL
  • IBM Db2 on Cloud
  • IBM Db2 for i
  • IBM Db2 Warehouse
  • Microsoft Azure SQL Database
  • Microsoft SQL Server
  • Oracle
  • Snowflake

Example: A stored procedure MYSCHEMA.MYSP has two required input parameters (inparam1, inparam2), two optional inputs (optinparam3, optinparam4), and two output parameters. See the following examples of valid and invalid use of Call procedure statement:

Valid:

In this case the Input value for optinparam3 and optinparam4 is used:
CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, optinparam3, optinparam4)
In this case the Input value for optinparam3 is used, and default value for optinparam4 is used:
CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, optinparam3)
In this case the default value for both optinparam3 and optinparam4 is used:
CALL MYSCHEMA.MYSP(in1, in2, ?, ?)
Invalid:
CALL MYSCHEMA.MYSP(inparam1, inparam2, optinparam3, optinparam4, ?, ?)
CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, , optinparam4)

Query timeout

For long-running queries and stored procedures, set the Query timeout property to limit the number of seconds that a procedure runs for. The default value is 300 seconds. If you set a value greater than 600 seconds, you must also edit the environment variable $OSH_JOB_START_TIMEOUT and set a value higher than the Query timeout. This property is available on Microsoft SQL Server, Microsoft Azure SQL Database, IBM Db2, IBM Db2 Big SQL, IBM Db2 on Cloud, IBM Db2 for i, IBM Db2 Warehouse, and IBM Db2 for z/OS.

User-defined functions

When Call procedure or Call procedure statement is selected, you can select User defined function to specify a user-defined function to execute. A user-defined function can take input parameters and returns either a single value (for scalar functions) or a set of values (for table functions). This property is available on Amazon RDS for PostgreSQL, Db2, IBM Cloud Databases for PostgreSQL, IBM Db2 Big SQL, IBM Db2 on Cloud, IBM Db2 for i, IBM Db2 Warehouse, IBM Db2 for z/OS, PostgreSQL, Snowflake, and Teradata. In IBM Db2 for z/OS, user-defined functions must be called as part of a SELECT statement. In Teradata, macro-type functions are also available.

User-defined functions support both source and transform modes. For the Source mode, the data is read from the database. For the Transform mode, the data is read, transformed, and written to the target node.

You can use two types of user-defined functions:
  • Scalar
    Returns a single value.
  • Table
    Returns a table.
When a function has no input parameters, it can be executed when the database acts as a source. When a function has input parameters, it can be executed when the database acts as a transformer.
Use the following procedure statements for your user-defined functions:
  • For scalar user-defined function:
    SELECT  “Sclr_udf_name”(@arg1,@arg2,…);
  • For table user-defined function:
    SELECT * FROM  “table_udf_name”(@arg1,@arg2,…);