System Programming Interfaces and properties

A System Programming Interface (SPI) is a pluggable extension to the execution environment. The SPIs are invoked as part of the parallel job execution. Compute Grid SPIs are configured through a properties file that identifies the installed SPIs and their implementation class names.

SPI properties

Table 1. Attributes of a property file
Attribute   Value
Name        xd.spi.properties
Location    <WAS install directory>/properties
Format      <SPI name>=<SPI implementation class>

All SPIs are instantiated in a WebSphere Application Server as singleton objects.
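
The following Java sketch illustrates how a file in the <SPI name>=<SPI implementation class> format could be read and how each listed class could be instantiated exactly once. It is only an illustration of the format and of the singleton behavior, not the product's loader: SpiRegistrySketch, DemoParameterizer, load, and get are hypothetical names, and the use of reflection is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical loader used only for illustration; not a Compute Grid class. */
final class SpiRegistrySketch {

    /** Stand-in for a user-written SPI implementation class. */
    static class DemoParameterizer { }

    // One instance per SPI name, created once at load time and then reused.
    private final Map<String, Object> instances = new LinkedHashMap<>();

    /** Reads "<SPI name>=<SPI implementation class>" entries and instantiates each class once. */
    static SpiRegistrySketch load(String propertiesText) {
        Properties props = new Properties();
        SpiRegistrySketch registry = new SpiRegistrySketch();
        try {
            props.load(new StringReader(propertiesText));
            for (String spiName : props.stringPropertyNames()) {
                String className = props.getProperty(spiName).trim();
                // Assumption: implementation classes have a no-argument constructor.
                Object spi = Class.forName(className).getDeclaredConstructor().newInstance();
                registry.instances.put(spiName, spi);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load SPI configuration", e);
        }
        return registry;
    }

    /** Returns the single instance registered under the given SPI name, or null. */
    Object get(String spiName) {
        return instances.get(spiName);
    }

    public static void main(String[] args) {
        // The class name on the right-hand side is the demo stand-in, not a real SPI class.
        String text = "spi.parallel.Parameterizer=" + DemoParameterizer.class.getName() + "\n";
        SpiRegistrySketch registry = SpiRegistrySketch.load(text);
        Object first = registry.get("spi.parallel.Parameterizer");
        Object second = registry.get("spi.parallel.Parameterizer");
        System.out.println("Same instance on every lookup: " + (first == second));
    }
}
```

Keeping the instances in a map keyed by SPI name means that every lookup returns the same object, which mirrors the singleton behavior described above.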

Parameterizer SPI

The parallel job manager (PJM) calls the parameterizer SPI at the start of the top-level job process. The purpose of the parameterizer SPI is to divide the top-level job into multiple subordinate jobs. The parameterizer receives the job step properties specified in the top-level job xJCL. It determines the number of subordinate jobs to create and the input properties to pass to each subordinate job. Typically, the input properties identify which data chunk a particular subordinate job processes. Implementation of the parameterizer SPI is mandatory.

A default parameterizer SPI provides the basic functionality of a generic parameterizer. To enable it, set the spi.parallel.Parameterizer property in the xd.spi.properties file to com.ibm.ws.batch.parallel.BuiltInParameterizer. To specify the number of subordinate jobs, use the input property com.ibm.wsspi.batch.parallel.jobs=N, where N is the number of subordinate jobs. To pass a property to one specific subordinate job instance, use com.ibm.wsspi.batch.parallel.prop.<property_name>.<subjob>=<value>, where subjob is the subordinate job instance (1 ≤ subjob ≤ N). All other properties in <property_name>=<value> format are visible to all subordinate jobs.
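
The property conventions above can be sketched in Java as follows. This is not the built-in parameterizer; it only illustrates how one set of input properties might be resolved into a property map per subordinate job. The class and method names are hypothetical, and three points are assumptions: sub-job instances are numbered from 1 to N, a targeted property is exposed to its sub-job under <property_name>, and the job count property itself is not forwarded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the property conventions; not the built-in implementation. */
final class SubJobPropertiesSketch {
    static final String JOBS_KEY = "com.ibm.wsspi.batch.parallel.jobs";
    static final String PROP_PREFIX = "com.ibm.wsspi.batch.parallel.prop.";

    /** Returns one property map per subordinate job; list index 0 holds sub-job 1. */
    static List<Map<String, String>> split(Map<String, String> stepProps) {
        String jobsValue = stepProps.get(JOBS_KEY);
        if (jobsValue == null) {
            throw new IllegalArgumentException("Missing property " + JOBS_KEY);
        }
        int jobCount = Integer.parseInt(jobsValue.trim());

        List<Map<String, String>> result = new ArrayList<>();
        for (int subJob = 1; subJob <= jobCount; subJob++) {
            Map<String, String> props = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : stepProps.entrySet()) {
                String key = e.getKey();
                if (key.equals(JOBS_KEY)) {
                    continue; // assumption: the job count itself is not forwarded
                }
                if (!key.startsWith(PROP_PREFIX)) {
                    props.put(key, e.getValue()); // visible to all subordinate jobs
                    continue;
                }
                // Targeted property: "<property_name>.<subjob>" follows the prefix.
                String rest = key.substring(PROP_PREFIX.length());
                int lastDot = rest.lastIndexOf('.');
                if (lastDot <= 0) {
                    continue; // malformed key, ignored in this sketch
                }
                String target = rest.substring(lastDot + 1);
                if (target.equals(String.valueOf(subJob))) {
                    props.put(rest.substring(0, lastDot), e.getValue());
                }
            }
            result.add(props);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> stepProps = new LinkedHashMap<>();
        stepProps.put(JOBS_KEY, "2");
        stepProps.put(PROP_PREFIX + "inputFile.1", "part1.csv"); // only for sub-job 1
        stepProps.put(PROP_PREFIX + "inputFile.2", "part2.csv"); // only for sub-job 2
        stepProps.put("region", "EU");                           // shared by both
        List<Map<String, String>> perJob = split(stepProps);
        for (int i = 0; i < perJob.size(); i++) {
            System.out.println("sub-job " + (i + 1) + ": " + perJob.get(i));
        }
    }
}
```

In this example the file names and the region property are invented values; each sub-job sees its own inputFile value plus the shared region value.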

The parameterizer SPI runs in the following environment:
  • PJM class loader scope
  • Global transaction mode

The parameterizer SPI is called when a top-level job starts and also when a top-level job restarts. During a restart, the parameterizer SPI can pass back a “restart instructions” object that tells the PJM which subordinate jobs to restart. By default, the PJM restarts any subordinate job that is in the restartable state.

Synchronization SPI

The PJM calls the synchronization SPI to demarcate the lifecycle of a logical transaction. The logical transaction is not an XA transaction; no actual resources are enlisted in it. The logical transaction is only a set of call backs to allow user code to be invoked by the PJM at key points in the lifecycle of a parallel job. The logical transaction begins before the first subordinate job is created and ends after the last subordinate job has ended.

The PJM can also roll back a logical transaction. A logical transaction is rolled back if:
  • a subordinate job ends in restartable or failed state
  • any SPI called by the PJM throws a rollback exception

The synchronization SPI runs in the following environment:
  • PJM class loader scope
  • Global transaction mode
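
The lifecycle of the logical transaction described in this section can be sketched in Java as follows. The sketch is not the synchronization SPI itself: the Callbacks interface, its begin, commit, and rollback methods, the RollbackRequested exception, and the SubJobState values are all hypothetical names chosen for illustration. It only shows the ordering of the callbacks and the two rollback conditions listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the logical transaction; not the Compute Grid API. */
final class LogicalTransactionSketch {

    enum SubJobState { ENDED, RESTARTABLE, FAILED }

    /** Stand-in for the callbacks that user code would receive. */
    interface Callbacks {
        void begin();
        void commit();
        void rollback();
    }

    /** Stand-in for a rollback exception thrown by any SPI. */
    static final class RollbackRequested extends RuntimeException { }

    /** Records callback order so the sequence can be inspected. */
    static final class RecordingCallbacks implements Callbacks {
        final List<String> events = new ArrayList<>();
        @Override public void begin() { events.add("begin"); }
        @Override public void commit() { events.add("commit"); }
        @Override public void rollback() { events.add("rollback"); }
    }

    /** Drives one logical transaction around a set of subordinate job outcomes. */
    static void run(Callbacks callbacks, List<SubJobState> finalStates, Runnable otherSpiCalls) {
        callbacks.begin(); // before the first subordinate job is created
        boolean rollback = false;
        try {
            otherSpiCalls.run(); // stands in for any other SPI invoked by the PJM
        } catch (RollbackRequested e) {
            rollback = true; // condition 2: an SPI threw a rollback exception
        }
        for (SubJobState state : finalStates) {
            if (state == SubJobState.RESTARTABLE || state == SubJobState.FAILED) {
                rollback = true; // condition 1: a sub-job ended restartable or failed
            }
        }
        // After the last subordinate job has ended, finish the logical transaction.
        if (rollback) {
            callbacks.rollback();
        } else {
            callbacks.commit();
        }
    }

    public static void main(String[] args) {
        RecordingCallbacks recorder = new RecordingCallbacks();
        run(recorder, Arrays.asList(SubJobState.ENDED, SubJobState.FAILED), () -> { });
        System.out.println(recorder.events);
    }
}
```

No resources are enlisted anywhere in the sketch, which matches the point that the logical transaction is only a set of callbacks and not an XA transaction.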

SubJobCollector SPI

The SubJobCollector SPI collects information related to a subordinate job execution. The batch container calls the SubJobCollector SPI after each checkpoint for a subordinate job is taken. The SubJobCollector SPI enables the subordinate job to pass a data object (payload) to its owning PJM instance. The data object is then passed to the SubJobAnalyzer SPI.

The SubJobCollector SPI runs in the following environment:
  • Subordinate job application class loader scope
  • Global transaction mode

SubJobAnalyzer SPI

The SubJobAnalyzer SPI is used to analyze information collected previously by using the SubJobCollector SPI. In a typical implementation, the SubJobAnalyzer SPI is used to aggregate information obtained from all subordinate jobs to determine the consolidated return code for the top-level job. The PJM calls the SubJobAnalyzer SPI each time a SubJobCollector SPI payload is delivered to the PJM or when a subordinate job ends. Whenever a subordinate job ends, the subordinate job return code is presented to the SubJobAnalyzer.

The SubJobAnalyzer SPI runs in the following environment:
  • PJM class loader scope
  • Global transaction mode
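
The interplay between collector payloads and the analyzer can be sketched in Java as follows. This is a hypothetical aggregator, not the SubJobAnalyzer SPI signature: the class name, the onPayload and onSubJobEnd methods, the choice of a record count as payload, and the "highest return code wins" policy are all assumptions made for illustration.

```java
/** Hypothetical analyzer; method names are illustrative, not the SPI signature. */
final class ReturnCodeAnalyzerSketch {
    private int consolidatedReturnCode = 0;
    private long recordsProcessed = 0;

    /** Invoked for each collector payload; here the payload is a record count. */
    synchronized void onPayload(String subJobId, long recordsSinceLastCheckpoint) {
        recordsProcessed += recordsSinceLastCheckpoint;
    }

    /** Invoked when a subordinate job ends and its return code is presented. */
    synchronized void onSubJobEnd(String subJobId, int returnCode) {
        // Assumed policy: the worst (highest) sub-job return code wins.
        consolidatedReturnCode = Math.max(consolidatedReturnCode, returnCode);
    }

    synchronized int consolidatedReturnCode() {
        return consolidatedReturnCode;
    }

    synchronized long recordsProcessed() {
        return recordsProcessed;
    }

    public static void main(String[] args) {
        ReturnCodeAnalyzerSketch analyzer = new ReturnCodeAnalyzerSketch();
        analyzer.onPayload("subjob-1", 500); // payload after a checkpoint
        analyzer.onPayload("subjob-2", 300);
        analyzer.onSubJobEnd("subjob-1", 0);
        analyzer.onSubJobEnd("subjob-2", 4);
        System.out.println("records=" + analyzer.recordsProcessed()
                + " rc=" + analyzer.consolidatedReturnCode());
    }
}
```

The methods are synchronized as a precaution because the sketch does not assume anything about which threads deliver payloads and end-of-job notifications.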

LifeCycle SPI

The job scheduler calls the LifeCycle SPI each time a lifecycle event occurs for a top-level job or a subordinate job. Lifecycle events include job start, job end, job step start, and job step end.

The LifeCycle SPI runs in the following environment:
  • Job Scheduler class loader scope
  • Local transaction mode

Context objects

The Compute Grid runtime provides context objects that offer a common work area among SPIs and batch application programming model artifacts. A context object allows user code to save and retrieve a Java object and share it within the context's scope. The Compute Grid runtime ensures proper cleanup of context objects at the end of their scope. There are two context object types:
  • ParallelJobManagerContext: Exists in the scope of a parallel job. The parameterizer, SubJobAnalyzer, and Synchronization SPIs all have access to this context for a given parallel job instance.
  • SubJobContext: Exists in the scope of a subordinate job. The SubJobCollector SPI and the batch application programming model artifacts (BatchDataStream, BatchJobStepInterface, CheckpointPolicyAlgorithm, and ResultsAlgorithm) all have access to this context for a given subordinate job instance.
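
The idea of a scoped work area can be sketched in Java as follows. WorkAreaSketch and its setUserData, getUserData, and close methods are hypothetical and are not the ParallelJobManagerContext or SubJobContext API; the sketch only shows one component saving an object, another retrieving it within the same scope, and the object being released when the scope ends.

```java
/** Hypothetical scoped work area; not the Compute Grid context API. */
final class WorkAreaSketch implements AutoCloseable {
    private Object userData;

    /** Saves an object so that other components in the same scope can read it. */
    void setUserData(Object data) {
        this.userData = data;
    }

    /** Retrieves the saved object, or null if nothing is stored. */
    Object getUserData() {
        return userData;
    }

    /** Stands in for the cleanup that the runtime performs at end of scope. */
    @Override
    public void close() {
        userData = null;
    }

    public static void main(String[] args) {
        try (WorkAreaSketch context = new WorkAreaSketch()) {
            // For example, a parameterizer might store the planned sub-job count ...
            context.setUserData(Integer.valueOf(4));
            // ... and an analyzer in the same parallel job scope might read it later.
            Object planned = context.getUserData();
            System.out.println("Planned sub-jobs: " + planned);
        } // scope ends here and the stored object is released
    }
}
```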