Implementing abstract methods of the Processor class (DataStage)
Your Java™ code must implement a subclass of the Processor class. The Processor class consists of methods that are invoked by the Java Integration stage. When a job that includes the Java Integration stage starts, the stage instantiates your Processor class and calls the logic within your Processor implementations.
The Processor class provides the following methods that the Java Integration stage can call to interact with your Java code at job execution time or at design time.
getCapabilities()
validateConfiguration()
getConfigurationErrors()
getBeanForInput()
getBeanForOutput()
getAdditionalOutputColumns()
initialize()
process()
terminate()
getColumnMetadataForInput()
getColumnMetadataForOutput()
getUserPropertyDefinitions()
The Processor class declares the following abstract methods, which your subclass must implement:
public abstract boolean validateConfiguration(Configuration configuration, boolean isRuntime) throws Exception;
public abstract void process() throws Exception;
The following example shows a simple peek stage implementation that prints record column values to the job log, which can be viewed in the Director client. It assumes a single input link.
package samples;

import com.ibm.is.cc.javastage.api.*;

public class SimplePeek extends Processor
{
    private InputLink m_inputLink;

    public boolean validateConfiguration(
        Configuration configuration, boolean isRuntime) throws Exception
    {
        if (configuration.getInputLinkCount() != 1)
        {
            // this sample code assumes stage has 1 input link.
            return false;
        }
        m_inputLink = configuration.getInputLink(0);
        return true;
    }

    public void process() throws Exception
    {
        do
        {
            InputRecord inputRecord = m_inputLink.readRecord();
            if (inputRecord == null)
            {
                // No more input. Your code must return from process() method.
                break;
            }
            for (int i = 0; i < m_inputLink.getColumnCount(); i++)
            {
                Object value = inputRecord.getValue(i);
                Logger.information(value.toString());
            }
        }
        while (true);
    }
}
The Java Integration stage calls the validateConfiguration() method to specify the current configuration (number and types of links) and the values for the user properties. Your Java code must validate the given configuration and user properties and return false to the Java Integration stage if there are problems with them. Because the previous example assumes a stage that has a single input link, it checks the number of input links and returns false if the stage configuration does not meet this requirement.
if (configuration.getInputLinkCount() != 1)
{
    // this sample code assumes stage has 1 input link.
    return false;
}
The Configuration interface defines methods that are used to get the current stage configuration (number and types of links) and the values for the user properties. The getInputLinkCount() method is used to get the number of input links connected to this stage.
If the configuration is valid, the code stores a reference to the InputLink object for subsequent processing and returns true to the Java Integration stage.
    m_inputLink = configuration.getInputLink(0);
    return true;
}
After the stage configuration is verified by your Java code, you can interact with the stages connected in your job. The process() method is the entry point for processing records from the input link or to the output link. When a row is available on any of the stage input links (if the stage has any, and regardless of the number of output links), the Java Integration stage calls this method, provided that the job has not ended. Your Java code must consume all rows from the stage input links.
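The call order described above can be sketched with a small stand-in harness. Every type here (LifecycleSketch, SketchProcessor, RecordingProcessor) is hypothetical and is not part of com.ibm.is.cc.javastage.api. The sketch also assumes that process() is called only after validateConfiguration() returns true, and it omits other callbacks such as initialize() and terminate().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; these are NOT the com.ibm.is.cc.javastage.api types.
class LifecycleSketch {
    /** Simplified stand-in for the two abstract methods named in this topic. */
    static abstract class SketchProcessor {
        abstract boolean validateConfiguration(int inputLinkCount) throws Exception;
        abstract void process() throws Exception;
    }

    /** Records which callbacks ran, in the order they ran. */
    static class RecordingProcessor extends SketchProcessor {
        final List<String> calls = new ArrayList<>();

        @Override
        boolean validateConfiguration(int inputLinkCount) {
            calls.add("validateConfiguration");
            return inputLinkCount == 1; // accept exactly one input link
        }

        @Override
        void process() {
            calls.add("process");
        }
    }

    /** Assumed call order: validate first, then process only if validation passed. */
    static List<String> run(int inputLinkCount) {
        RecordingProcessor processor = new RecordingProcessor();
        if (processor.validateConfiguration(inputLinkCount)) {
            processor.process();
        }
        return processor.calls;
    }
}
```

With one input link, both callbacks are recorded. With any other link count, only the validation call is recorded.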
By calling the readRecord() method of the InputLink interface, your Java code can consume a row from the input link. The method returns an object that implements the InputRecord interface. The InputRecord interface defines methods that are used to get column data from a consumed row.
    InputRecord inputRecord = m_inputLink.readRecord();
    if (inputRecord == null)
    {
        // No more input. Your code must return from process() method.
        break;
    }
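To try the read-until-null pattern outside a job, the loop can be run against an in-memory stand-in. FakeInputLink and consumeAll are hypothetical names introduced for this sketch; they only mimic the behavior described above, where readRecord() returns null once the input is exhausted.

```java
import java.util.Iterator;
import java.util.List;

class ReadLoopSketch {
    /** Hypothetical stand-in for an input link: hands out rows, then null. */
    static class FakeInputLink {
        private final Iterator<Object[]> rows;

        FakeInputLink(List<Object[]> data) {
            this.rows = data.iterator();
        }

        /** Returns the next row, or null when no rows remain. */
        Object[] readRecord() {
            return rows.hasNext() ? rows.next() : null;
        }
    }

    /** Consumes every row from the link and returns how many rows were read. */
    static int consumeAll(FakeInputLink link) {
        int count = 0;
        while (true) {
            Object[] record = link.readRecord();
            if (record == null) {
                break; // no more input: stop reading and return
            }
            count++;
        }
        return count;
    }
}
```

The loop ends only when the stand-in reports the end of input, so every row is consumed before the method returns.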
After a row is consumed, your Java code can get each column value by calling the getValue(int columnIndex) method of the InputRecord interface. The getColumnCount() method of the InputLink interface returns the number of columns that exist in this input link.
    for (int i = 0; i < m_inputLink.getColumnCount(); i++)
    {
        Object value = inputRecord.getValue(i);
Each column value is written to the job log by calling the information() method of the Logger class. The Logger class allows your Java code to write data to the job log with specified log levels. The following code writes the string representation of each column value to the job log.
        Logger.information(value.toString());
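If a column value can be null, calling toString() on it throws a NullPointerException. This topic does not state whether getValue() returns null for nullable columns, so that is an assumption here. The following minimal sketch shows a null-safe conversion with the standard String.valueOf(Object) method. The class and method names are hypothetical, and the result would be passed to Logger.information() in a real stage.

```java
class NullSafeLogSketch {
    /**
     * Returns the text to log for a column value.
     * String.valueOf(Object) yields the string "null" for a null reference
     * instead of throwing a NullPointerException.
     */
    static String describe(Object value) {
        return String.valueOf(value);
    }
}
```

For non-null values the result is the same as value.toString(), so the logged output is unchanged for rows without nulls.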