You can separate tasks of a batch application into batch
steps. Batch steps are implemented as local container-managed Enterprise
JavaBeans (EJB) that specify the com.ibm.websphere.batch.BatchJobStepLocalInterface
as their business interface. Batch job steps are performed sequentially.
Callback methods in the BatchJobStepLocalInterface allow the Compute Grid endpoints to run batch steps
when they run a batch job.
A batch step EJB contains the batchable business logic to run for
a portion of the batch job. Typically, a batch step contains code
to read a record from a batch data stream, perform business logic
with that record and then continue to read the next record. The processJobStep
method of a batch step EJB is called by the Compute Grid endpoints in a batch loop.
This method contains all of the batchable logic that is performed
on the data.
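The read, process, and continue pattern described above can be sketched as follows. This is only an illustration under stated assumptions, not the EJB itself: an `Iterator<String>` stands in for the batch data stream, the class name `RecordStepSketch` is hypothetical, and the `STEP_CONTINUE` and `STEP_COMPLETE` values are placeholders for the constants that the batch API defines in BatchConstants.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative stand-in for a batch step. A real step is a local EJB whose
// business interface is com.ibm.websphere.batch.BatchJobStepLocalInterface.
class RecordStepSketch {
    // Placeholder values (assumption); real code uses the BatchConstants fields.
    static final int STEP_CONTINUE = 0;
    static final int STEP_COMPLETE = 1;

    // Stands in for a batch data stream (assumption).
    private final Iterator<String> records;
    private int processed = 0;

    RecordStepSketch(List<String> input) {
        this.records = input.iterator();
    }

    // Each call handles at most one record, so the caller can
    // checkpoint between calls.
    int processJobStep() {
        if (!records.hasNext()) {
            return STEP_COMPLETE; // no more input: the step is finished
        }
        String record = records.next();
        // Example business logic: count the records that are not blank.
        if (!record.trim().isEmpty()) {
            processed++;
        }
        return STEP_CONTINUE; // ask the caller to invoke this method again
    }

    int getProcessed() {
        return processed;
    }

    public static void main(String[] args) {
        RecordStepSketch step = new RecordStepSketch(Arrays.asList("a", " ", "c"));
        // The caller plays the role of the batch loop.
        while (step.processJobStep() == STEP_CONTINUE) {
            // keep calling until the step reports completion
        }
        System.out.println("non-blank records: " + step.getProcessed());
    }
}
```

Keeping the work of each call small is a design choice in this sketch: it leaves the decision about when to commit with the caller rather than with the step.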
The Compute Grid endpoints invoke
batch step EJB methods in a global transaction. This global transaction
is managed by the Compute Grid endpoints. The behavior of the transaction, such as transaction timeout or
transaction commit interval, is controlled by the checkpoint algorithm
associated with the batch job to which the step belongs.
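To make the idea of a commit interval concrete, the following simulation shows a loop that records a commit point after a fixed number of processJobStep calls and once more when the step completes. It is a sketch under assumptions, not the Compute Grid implementation: the record-count policy, the class name `CheckpointLoopSketch`, the `commitInterval` parameter, and the return code values are all hypothetical, and no real transaction is started or committed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Simulates a batch loop with a hypothetical record-count checkpoint policy.
class CheckpointLoopSketch {
    // Placeholder values (assumption); real code uses the BatchConstants fields.
    static final int STEP_CONTINUE = 0;
    static final int STEP_COMPLETE = 1;

    // Runs the step and returns the iteration numbers at which a simulated
    // commit took place.
    static List<Integer> run(IntSupplier processJobStep, int commitInterval) {
        List<Integer> commits = new ArrayList<>();
        int iterations = 0;
        int sinceCommit = 0;
        int rc;
        do {
            rc = processJobStep.getAsInt();
            iterations++;
            sinceCommit++;
            // Commit when the interval is reached or when the step finishes.
            // A real runtime would commit the global transaction here and
            // begin a new one if the step continues.
            if (sinceCommit == commitInterval || rc == STEP_COMPLETE) {
                commits.add(iterations);
                sinceCommit = 0;
            }
        } while (rc == STEP_CONTINUE);
        return commits;
    }

    public static void main(String[] args) {
        int[] remaining = {7}; // the simulated step finishes on its 7th call
        IntSupplier step = () -> --remaining[0] > 0 ? STEP_CONTINUE : STEP_COMPLETE;
        System.out.println(run(step, 3)); // prints [3, 6, 7]
    }
}
```

With an interval of 3 and a step that finishes on its seventh call, the simulated commits fall after calls 3, 6, and 7. If the step failed between commit points, only the work since the last commit would be rolled back.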
The Compute Grid endpoints invoke the following callback methods
on the BatchJobStepLocalInterface, in the order listed:
- setProperties(java.util.Properties properties): Makes properties
defined in XML Job Control Language (xJCL) available to the batch step
in a java.util.Properties object. This method is invoked in a global
transaction.
- void createJobStep(): Indicates to the step that it has been initialized.
Initialization logic, such as retrieving a handle to a batch data
stream, can be placed here. This method is invoked in a global transaction.
- int processJobStep(): Repeatedly invoked by Compute Grid endpoints in a batch loop until
the return code integer of this method indicates that the step has
finished processing. Review BatchConstants in the batch API to see
which return codes can be returned. A return code of BatchConstants.STEP_CONTINUE
signals to the Compute Grid endpoints to continue calling this method in the batch loop. A return code
of BatchConstants.STEP_COMPLETE indicates to the Compute Grid endpoints that the step has
finished. The Compute Grid endpoints then call destroyJobStep.
- int destroyJobStep(): Indicates to the step that completion has
occurred. The integer return code of this method is arbitrary and
can be chosen by the batch application developer. This return code
is saved in the Compute Grid endpoints database and represents the return code of the batch step. If a
results algorithm is associated with the batch job, then this return
code is passed to it. If the xJCL of the batch job contains return
code-based conditional logic, then the Compute Grid endpoints use this return code
to evaluate that logic.
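The callback order in the list above can be sketched with a small driver. This is an illustration only: `StepCallbacks` is a hypothetical local mirror of the four callbacks rather than the real BatchJobStepLocalInterface, the `record.count` property name is invented for the example, and the `STEP_CONTINUE` and `STEP_COMPLETE` values are placeholders for the BatchConstants fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class StepLifecycleSketch {
    // Placeholder values (assumption); real code uses the BatchConstants fields.
    static final int STEP_CONTINUE = 0;
    static final int STEP_COMPLETE = 1;

    // Hypothetical mirror of the four callbacks described in the list.
    interface StepCallbacks {
        void setProperties(Properties properties);
        void createJobStep();
        int processJobStep();
        int destroyJobStep();
    }

    // A step that records the order in which its callbacks are invoked.
    static class CountingStep implements StepCallbacks {
        final List<String> calls = new ArrayList<>();
        private int remaining;

        public void setProperties(Properties properties) {
            calls.add("setProperties");
            // "record.count" is a hypothetical property name for this sketch.
            remaining = Integer.parseInt(properties.getProperty("record.count", "0"));
        }

        public void createJobStep() {
            calls.add("createJobStep"); // initialization logic would go here
        }

        public int processJobStep() {
            calls.add("processJobStep");
            if (remaining > 0) {
                remaining--; // pretend to process one record
            }
            return remaining > 0 ? STEP_CONTINUE : STEP_COMPLETE;
        }

        public int destroyJobStep() {
            calls.add("destroyJobStep");
            // The step return code is chosen by the application developer.
            return remaining == 0 ? 0 : 8;
        }
    }

    // Plays the role of the endpoint: invokes the callbacks in order and
    // returns the step return code from destroyJobStep.
    static int runStep(StepCallbacks step, Properties xjclProperties) {
        step.setProperties(xjclProperties);
        step.createJobStep();
        while (step.processJobStep() == STEP_CONTINUE) {
            // batch loop: repeat until the step reports completion
        }
        return step.destroyJobStep();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("record.count", "2");
        CountingStep step = new CountingStep();
        int rc = runStep(step, props);
        System.out.println(step.calls + " rc=" + rc);
    }
}
```

The driver omits transactions and checkpoints entirely; it shows only the ordering and how the destroyJobStep value becomes the return code of the step.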
The getProperties() method on the BatchJobStepLocalInterface is
not currently called by the Compute Grid endpoints. The method is included
in the interface for symmetry and possible later use.
Troubleshooting in batch development
- You must declare the batch controller bean in the Enterprise
JavaBeans (EJB) deployment descriptor of a batch application, and
include local EJB references to the step enterprise beans that are
used in the batch application. Only one controller bean can be
defined per batch application.
- Set the transaction attributes of all batch step methods to Required.
- The batch application developer must ensure that transactional
work done in the batch step callback methods inherits the global transaction
started by the Compute Grid endpoints. This action ensures that work performed under a batch step is
committed only at a checkpoint and is rolled back if the step fails.
- If the batch step uses a batch data stream (BDS) whose data is
local to the file system of the application server to which the batch
application is deployed, then you must take certain steps to support
job restart scenarios. If such a batch application is deployed to
application servers that can run on multiple machines, then there
is no guarantee that the restart request is accepted by the machine
on which the batch job originally ran. This situation can occur when the batch
application is deployed to a dynamic cluster that exists in a node
group that has multiple member nodes, and a batch job that runs
against such an application is canceled and then restarted. In this
scenario, the placement might send the restart request to an application
server that runs on a different machine. Therefore, in cases where
file-based affinity is required, you can apply the following solutions
to support the job restart scenario:
- Ensure that the data is equally available to every machine on
which the batch application can be started. For example, use a network
file system. This action might reduce the performance of the application.
- Deploy the application on application servers that can only run
on the machine where the local data exists. Complete this action by
deploying the application to a dynamic cluster that exists in a node
group that has only one member node.