Batch job steps

You can separate the tasks of a batch application into batch steps. Batch steps are implemented as local container-managed Enterprise JavaBeans (EJB) that specify com.ibm.websphere.batch.BatchJobStepLocalInterface as their business interface. Batch job steps are performed sequentially.

Callback methods in the BatchJobStepLocalInterface allow the Compute Grid endpoints to run batch steps when they run a batch job.

A batch step EJB contains the batchable business logic to run for a portion of the batch job. Typically, a batch step contains code to read a record from a batch data stream, perform business logic with that record, and then continue to read the next record. The processJobStep method of a batch step EJB is called by the Compute Grid endpoints in a batch loop. This method contains all of the logic that is applied repeatedly to the data, one unit of work per call.
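The read, process, and continue pattern described above can be sketched in plain Java. This is a minimal illustration, not the product API: StepCodes is a hypothetical stand-in for BatchConstants (the numeric values are assumptions), the Iterator stands in for a batch data stream, and RecordUppercaseStep is an invented name. A real step would be an EJB that implements BatchJobStepLocalInterface.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the return codes defined in BatchConstants.
// The numeric values are assumptions chosen for this sketch only.
final class StepCodes {
    static final int STEP_CONTINUE = 0;
    static final int STEP_COMPLETE = 1;

    private StepCodes() {
    }
}

// Hypothetical step logic. The Iterator stands in for a batch data stream.
class RecordUppercaseStep {
    private final Iterator<String> input;
    private final StringBuilder output = new StringBuilder();

    RecordUppercaseStep(List<String> records) {
        this.input = records.iterator();
    }

    // One unit of work per call: read one record, apply the business logic,
    // and tell the caller whether to invoke this method again.
    int processJobStep() {
        if (!input.hasNext()) {
            return StepCodes.STEP_COMPLETE;
        }
        String record = input.next();
        output.append(record.toUpperCase()).append('\n');
        return StepCodes.STEP_CONTINUE;
    }

    String output() {
        return output.toString();
    }
}
```

Keeping each call to a single record lets the caller decide how many calls to group between checkpoints.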

The Compute Grid endpoints invoke batch step EJB methods in a global transaction. This global transaction is managed by the Compute Grid endpoints. The behavior of the transaction, such as transaction timeout or transaction commit interval, is controlled by the checkpoint algorithm associated with the batch job to which the step belongs.
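To make the idea of a commit interval concrete, the following is a simplified simulation of a record-count policy, where the surrounding transaction is committed after every N calls to processJobStep. It is only a sketch under stated assumptions: CheckpointLoopSketch and its run method are hypothetical names, and the code does not use or reproduce the product's checkpoint algorithm API or any real transaction manager.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of a record-count checkpoint policy.
// It records the iteration numbers at which a commit would take place.
class CheckpointLoopSketch {

    static List<Integer> run(int totalRecords, int commitInterval) {
        if (commitInterval <= 0) {
            throw new IllegalArgumentException("commitInterval must be positive");
        }
        List<Integer> commits = new ArrayList<>();
        int sinceLastCommit = 0;
        for (int i = 1; i <= totalRecords; i++) {
            // In a real job, processJobStep() would run here inside the
            // transaction that is currently open.
            sinceLastCommit++;
            if (sinceLastCommit == commitInterval) {
                commits.add(i); // checkpoint: commit, then begin a new transaction
                sinceLastCommit = 0;
            }
        }
        if (sinceLastCommit > 0) {
            commits.add(totalRecords); // final commit for the remaining work
        }
        return commits;
    }
}
```

In this simulation, a failure between two commits would lose at most the work done since the most recent commit.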

The Compute Grid endpoints invoke the following callback methods of the BatchJobStepLocalInterface, in the order listed:
  1. void setProperties(java.util.Properties properties): Makes properties defined in XML Job Control Language (xJCL) available to the batch step in a java.util.Properties object. This method is invoked in a global transaction.
  2. void createJobStep(): Indicates to the step that it has been initialized. Initialization logic, such as retrieving a handle to a batch data stream, can be placed here. This method is invoked in a global transaction.
  3. int processJobStep(): Repeatedly invoked by the Compute Grid endpoints in a batch loop until the integer return code of this method indicates that the step has finished processing. See BatchConstants in the batch API for the return codes that can be returned. A return code of BatchConstants.STEP_CONTINUE signals the Compute Grid endpoints to continue calling this method in the batch loop. A return code of BatchConstants.STEP_COMPLETE indicates to the Compute Grid endpoints that the step has finished. The Compute Grid endpoints then call destroyJobStep.
  4. int destroyJobStep(): Indicates to the step that completion has occurred. The integer return code of this method is arbitrary and can be chosen by the batch application developer. This return code is saved in the Compute Grid endpoints database and represents the return code of the batch step. If a results algorithm is associated with the batch job, then this return code is passed to it. If there is return code-based conditional logic in the xJCL of the batch job, then the Compute Grid endpoints use this return code to evaluate that logic.
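The callback order listed above can be sketched with a small stand-alone driver. Everything here is illustrative: StepCallbacks is a hypothetical stand-in that mirrors the four callbacks (the real interface is com.ibm.websphere.batch.BatchJobStepLocalInterface), the STEP_CONTINUE and STEP_COMPLETE values are assumptions rather than the values in BatchConstants, the record.count property is invented, and StepDriver only imitates what the Compute Grid endpoints do, without transactions or checkpoints.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in that mirrors the four callbacks described above.
interface StepCallbacks {
    void setProperties(Properties properties);
    void createJobStep();
    int processJobStep();
    int destroyJobStep();
}

// Hypothetical step that processes a fixed number of records and logs each call.
class CountingStep implements StepCallbacks {
    static final int STEP_CONTINUE = 0; // assumed value, see BatchConstants
    static final int STEP_COMPLETE = 1; // assumed value, see BatchConstants

    final List<String> calls = new ArrayList<>();
    private int remaining;

    @Override
    public void setProperties(Properties properties) {
        calls.add("setProperties");
        // "record.count" is an invented property name for this sketch.
        remaining = Integer.parseInt(properties.getProperty("record.count", "0"));
    }

    @Override
    public void createJobStep() {
        calls.add("createJobStep"); // initialization logic would go here
    }

    @Override
    public int processJobStep() {
        calls.add("processJobStep");
        if (remaining <= 0) {
            return STEP_COMPLETE;
        }
        remaining--;
        return remaining > 0 ? STEP_CONTINUE : STEP_COMPLETE;
    }

    @Override
    public int destroyJobStep() {
        calls.add("destroyJobStep");
        return 0; // step return code, chosen by the application developer
    }
}

// Imitates the order in which the callbacks are invoked.
class StepDriver {
    static int runStep(StepCallbacks step, Properties xjclProperties) {
        step.setProperties(xjclProperties);
        step.createJobStep();
        while (step.processJobStep() == CountingStep.STEP_CONTINUE) {
            // keep looping until the step reports completion
        }
        return step.destroyJobStep();
    }
}
```

The value returned by runStep corresponds to the step return code that a results algorithm or return code-based conditional logic in the xJCL would evaluate.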

The getProperties() method on the BatchJobStepLocalInterface is not currently called by the Compute Grid endpoints. The method is included in the interface for symmetry and possible later use.

Troubleshooting in batch development

  • You must declare the batch controller bean in the Enterprise JavaBeans (EJB) deployment descriptor of a batch application, and that declaration must include local EJB references to the step enterprise beans used in the batch application. Only one controller bean can be defined per batch application.
  • Set the transaction attribute of all batch step methods to Required.
  • The batch application developer must ensure that transactional work done in the batch step callback methods inherits the global transaction started by the Compute Grid endpoints. This action ensures that work performed under a batch step is committed only at each checkpoint and is rolled back if the step fails.
  • If the batch step uses a batch data stream (BDS) whose data is local to the file-system of the application server to which the batch application is deployed, then certain steps must be followed to support job restart scenarios. If such a batch application is deployed to application servers that can run on multiple machines, then there is no guarantee that the restart request is accepted by the machine on which the batch job originally ran. This might occur when the batch application is deployed to a dynamic cluster that exists in a node group that has multiple node members, and if a batch job that runs against such an application is canceled and then restarted. In this scenario, the placement might send the restart request to an application server that runs on a different machine. Therefore, in cases where file-based affinity is required, you can apply the following solutions to support the job restart scenario:
    • Ensure that the data is equally available to every machine on which the batch application can be started, for example by using a network file system. This action might reduce the performance of the application.
    • Deploy the application on application servers that can only run on the machine where the local data exists. Complete this action by deploying the application to a dynamic cluster that exists in a node group that has only one member node.