Tightly coupled multiprocessing means more than one I-stream engine within an ESA configuration is available to process programs held in main storage that is shared among the I-stream engines. This requires synchronizing multiple I-stream engines that are simultaneously processing sequential programs, with the common goal of increasing the number of messages that can be processed. When the system programs share data, also held in the shared main storage, the synchronization is accomplished with a TPF system mechanism called a processor lock. Locks, in general, implement the concept of mutual exclusion.
A processor lock is used to permit system programs, executing in multiple I-stream engines in an ESA configuration, to modify system tables held in the shared main storage. In general, access to a shared system table occurs within the framework of other processing that does not require locking.
A processor lock is intended for the exclusive use of the system programs. A processor lock uses an indicator that system programs reference before entering a critical region where programs executing in any I-stream engine can modify shared variables. If a processor lock indicator is set, the program checking the lock indicator must wait until the lock indicator is released. The TPF system normally uses the test and set (TS) instruction to force an interlock across multiple I-stream engines in order to set the lock indicator. In the TPF system, the activities of testing, setting, resetting, and waiting (or spinning) are all considered to be part of a processor lock.
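The behavior of a processor lock can be sketched in portable terms. The following C sketch is illustrative only and is not TPF code: it uses the C11 atomic_flag type as an analogue of the TS interlock, and the names rht_lock, lock_rht, and unlock_rht are hypothetical.

```c
#include <stdatomic.h>

/* Illustrative processor-lock indicator.  TPF keeps such an indicator
 * in a field of the shared table itself; the name is hypothetical. */
static atomic_flag rht_lock = ATOMIC_FLAG_INIT;

/* Acquire the processor lock: keep testing and setting the indicator
 * until the set succeeds.  The loop is the "spin" described later:
 * a delayed I-stream engine does nothing useful while it waits. */
void lock_rht(void)
{
    while (atomic_flag_test_and_set_explicit(&rht_lock,
                                             memory_order_acquire)) {
        /* spin until the holder releases the lock indicator */
    }
}

/* Release the processor lock so a waiting I-stream engine can proceed. */
void unlock_rht(void)
{
    atomic_flag_clear_explicit(&rht_lock, memory_order_release);
}
```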
The record hold table is an example of a system table where processor locking is performed. The record hold table is a shared table accessed by system programs from multiple I-stream engines (within an ESA configuration) in the course of servicing an I/O request. Data is placed into the record hold table within the critical region of a system program running on a single I-stream engine, because that section of code is serially reusable. Refer to Figure 13 while reading the following description.
Figure 13. Relationship of Locks
The items in the record hold table identify records that are held for the exclusive use of a single process (Entry) during a record update. Through the use of a file read macro (called the FIND and HOLD macro), an Entry identifies one record that is about to be modified. A reentrant application program that services multiple Entries has no need to even be aware of the record hold table, but only needs to know the macros used to hold and release a record.
The file address of a record placed in the record hold table is a lock indicator that shows the exclusive use of the record by one Entry within an ESA configuration (not to be confused with the processor lock on the record hold table itself). As long as the file address of a record is in the record hold table, observation of the application programming protocols prevents any other Entry from updating the record.
When an XLF is installed, the file address is given to the XLF at the time I/O commands are given to the channel subsystem to access the record. Placing a file address in the record hold table does not necessarily mean that the I/O commands are issued immediately; this depends upon the queues to the relevant module CUs. The contents of the XLF table (lock table) located in the module CU or CF may indicate that the record is already being used by some other central processing complex (CPC). The file address of a record (an entry in the lock table) locks the record for one Entry within a loosely coupled complex of several CPCs that are all attached to the same shared module. See Figure 11 for more information.
The TPF system places the identity of records held by all Entries within an ESA configuration into the record hold table. Processor locking is invoked whenever an item is placed into the record hold table. Keep in mind that the record hold table is a single table, located in shared main storage, that can be accessed by the same system code processing in all the I-stream engines within an ESA configuration. A field within the record hold table holds the processor lock indicator. The processor lock prevents corruption of the record hold table, which could occur if two or more I-stream engines simultaneously attempted to execute the single copy of system code used to place the identity of a record into the record hold table.
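To make the two levels of locking concrete, here is a hedged C sketch of placing a record hold under the processor lock. The table layout, the names struct rht_item, record_hold_table, and hold_record, and the fixed table size are assumptions made for illustration; they are not the TPF data structures. The sketch reuses the hypothetical lock_rht and unlock_rht routines from the earlier spin-lock example.

```c
#include <stdbool.h>
#include <stdint.h>

#define RHT_SLOTS 1024              /* illustrative size only */

/* Assumed shape of the shared record hold table: each occupied slot
 * records the file address of a held record and the Entry holding it.
 * A file address of 0 marks a free slot. */
struct rht_item {
    uint64_t file_address;          /* record-level lock indicator  */
    uint32_t entry_id;              /* the Entry holding the record */
};

static struct rht_item record_hold_table[RHT_SLOTS];

/* From the earlier spin-lock sketch. */
void lock_rht(void);
void unlock_rht(void);

/* Place a record hold under the processor lock.  Returns false if the
 * record is already held by another Entry (or the table is full), in
 * which case the caller treats the delay like an I/O wait rather than
 * spinning. */
bool hold_record(uint64_t file_address, uint32_t entry_id)
{
    int free_slot = -1;
    bool held = false;

    lock_rht();                                 /* enter critical region */
    for (int i = 0; i < RHT_SLOTS; i++) {
        if (record_hold_table[i].file_address == file_address)
            goto out;                           /* already held          */
        if (free_slot < 0 && record_hold_table[i].file_address == 0)
            free_slot = i;
    }
    if (free_slot >= 0) {
        record_hold_table[free_slot].file_address = file_address;
        record_hold_table[free_slot].entry_id    = entry_id;
        held = true;
    }
out:
    unlock_rht();                               /* leave critical region */
    return held;
}
```

Note that the processor lock is held only for the brief scan and update of the table; the hold on the record itself persists until the Entry releases it.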
Three separate points of synchronization for locking shared main storage data, such as the record hold table, to note in the TPF system are:
1. The test and set (TS) instruction, which interlocks the setting of a lock indicator across multiple I-stream engines.
2. The processor lock, which keeps more than one I-stream engine from executing the critical region that modifies the shared table.
3. The record hold itself, which keeps more than one Entry from updating the same record.
In the TPF system, the I-stream engine waiting mechanism is called a spin lock because a delayed I-stream engine does nothing but loop, that is, spin, while waiting to access the locked resource. Fortunately, an I-stream engine is seldom delayed by a spin lock. Unfortunately, to use a section of code that accesses shared data, each I-stream engine must incur the overhead of the lock mechanism.
A side comment about the record hold table is instructive. The reason that there is a single record hold table in an ESA configuration is the nature of the application environment: in a TPF system, all applications are assumed to need access to the same underlying database. Although record locking is essential, contention seldom occurs once a lock indicator is placed in the table. The small performance penalty for setting the lock is worth the large performance improvement gained by permitting concurrent accesses to a very large database. Managing a single shared table is simpler and causes less delay than attempting to place a separate table in a private area for each I-stream engine, because separate tables would require complex logic to coordinate.
In the case of the example of the record hold table, there is another level of synchronization: manipulating the data in the locked record. The application programs sharing the data observe a protocol to not update a record if it is locked. The delay for gaining access to a locked record may be relatively long, and the TPF system treats this contention as an I/O delay, whereupon multiprogramming is invoked to switch to a different Entry (process).
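The following C sketch, offered only as an illustration under assumed names, shows that protocol: when the record is already held, the requesting Entry gives up the I-stream engine and other Entries are dispatched instead of spinning. The functions suspend_entry, dispatch_next_entry, and start_record_read are hypothetical stand-ins for TPF dispatcher and I/O services, and hold_record is the hypothetical routine from the previous sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical system services; these are not the TPF interfaces. */
extern bool hold_record(uint64_t file_address, uint32_t entry_id);
extern void suspend_entry(uint32_t entry_id);
extern void dispatch_next_entry(void);
extern void start_record_read(uint64_t file_address);

/* Sketch of a find-and-hold request.  Contention on a held record is
 * treated like an I/O delay: the Entry is suspended and retried later,
 * rather than spinning on the record. */
void find_and_hold(uint64_t file_address, uint32_t entry_id)
{
    while (!hold_record(file_address, entry_id)) {
        suspend_entry(entry_id);      /* treat the contention as an I/O wait  */
        dispatch_next_entry();        /* multiprogramming: run another Entry  */
    }
    start_record_read(file_address);  /* issue the file read for the record   */
}
```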
In a tightly coupled multiprocessing environment, portions of system programs that are shared in main storage are structured as reentrant code. Prior to tightly coupled multiprocessing, the TPF system had little need for reentrant system code because, within a single I-stream engine, system programs depended upon completing a service request before being invoked to process an additional request. Reentrant system code is useful because, in a tightly coupled environment, a single copy of such code can be executed simultaneously by more than one I-stream engine. If all system programs had to finish before being reentered, prohibitive processor contention would occur. Most of the system programs that service application macro requests are reentrant programs.
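A minimal C illustration of the distinction (not TPF code; the function and variable names are invented for the example): a reentrant routine keeps all of its working state in stack or caller-supplied storage, while a serially reusable routine keeps state in static storage and therefore must finish before it can safely be entered again.

```c
#include <stddef.h>

/* Serially reusable (not reentrant): working state lives in static
 * storage, so two I-stream engines executing this single copy at the
 * same time would corrupt each other's totals. */
static size_t running_total;

size_t sum_serially_reusable(const size_t *v, size_t n)
{
    running_total = 0;
    for (size_t i = 0; i < n; i++)
        running_total += v[i];
    return running_total;
}

/* Reentrant: all working state is on the stack of the executing
 * I-stream engine, so one shared copy can be executed simultaneously
 * from several I-stream engines. */
size_t sum_reentrant(const size_t *v, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```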
The choice of a programming structure and related locking mechanisms is usually a compromise between complexity and performance. Restructuring tables and code can become complicated and more time consuming than development schedules permit. However, excessive locking defeats the purpose of multiprocessing.
The decision to handle non-module-related subclasses of I/O interrupts on a single I-stream engine sheds some light on these issues. A relatively large number of heavily accessed tables are used by the system I/O programs. To restructure all these tables and the associated code would be a formidable task, and to lock on each access to a table would be a severe performance penalty. Statistical modeling shows that a good compromise is to restrict the processing of a subclass of I/O interrupts to a single I-stream engine, thereby avoiding locking for a minimal performance penalty. The penalty is the need to transfer each I/O request from the I-stream engine that issued the request, but cannot issue the hardware I/O command, to the I-stream engine that manages I/O.
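One way to picture that transfer is a small shared hand-off queue: only the queue is shared between I-stream engines, so the many heavily used I/O tables remain private to the owning engine and need no locks. The C sketch below is an assumption-laden illustration, not the TPF mechanism; the names io_request, queue_io_request, and take_io_request are invented, and a single small spin lock protects just the queue.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define IOQ_SIZE 256                         /* illustrative size only */

struct io_request { uint64_t file_address; uint32_t entry_id; };

/* Hypothetical cross-I-stream work queue: engines that may not issue
 * the hardware I/O command enqueue requests here; the single
 * I/O-owning engine dequeues them and drives its private, unlocked
 * I/O tables. */
static struct io_request io_queue[IOQ_SIZE];
static unsigned io_head, io_tail;
static atomic_flag io_queue_lock = ATOMIC_FLAG_INIT;

static void ioq_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&io_queue_lock,
                                             memory_order_acquire))
        ;                                    /* brief spin on the small lock */
}

static void ioq_unlock(void)
{
    atomic_flag_clear_explicit(&io_queue_lock, memory_order_release);
}

bool queue_io_request(struct io_request req)    /* any I-stream engine */
{
    bool queued = false;

    ioq_lock();
    if (io_tail - io_head < IOQ_SIZE) {         /* room in the queue   */
        io_queue[io_tail++ % IOQ_SIZE] = req;
        queued = true;
    }
    ioq_unlock();
    return queued;
}

bool take_io_request(struct io_request *req)    /* I/O-owning engine only */
{
    bool taken = false;

    ioq_lock();
    if (io_head != io_tail) {                   /* something is queued */
        *req = io_queue[io_head++ % IOQ_SIZE];
        taken = true;
    }
    ioq_unlock();
    return taken;
}
```

The design point this sketch tries to capture is that the cost of one small lock plus the cross-engine transfer is far less than locking every I/O table on every access.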
Most module-related I/O, however, is handled by all I-stream engines. This requires that the module-related tables be locked. If module I/O were done on only one I-stream engine, that engine would soon become overloaded doing nothing but module I/O processing.
A program restricted to run on a particular I-stream engine is said to have a CPU affinity. A program with a CPU affinity has no need to call locks associated with critical regions if the data modified by the program is not shared by programs running on other I-stream engines. So, a program is given a CPU affinity if it accesses particular main storage tables that are not accessed by any other programs; this eliminates the overhead of locking. Note, however, that only one copy of the program can run at any one time in the I-stream engine. Moreover, it is not appropriate to assign a CPU affinity if there are several programs that access a single main storage table. Of course, CPU affinity does not restrict a program from using locks to access data that is shared. But, incorporating a lock into an existing program represents a modification. A program designed to be serially reusable in a uniprocessor environment operates satisfactorily without modification in a tightly coupled multiprocessor environment if the program is restricted to run on a single I-stream engine, and if none of the data that it modifies is shared with programs running on other I-stream engines.
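As a hedged illustration of the idea (again with invented names, not TPF interfaces), a dispatcher-level affinity check might look like the following: a program with a CPU affinity is dispatched only on its assigned I-stream engine, so the unshared tables it modifies need no processor lock.

```c
#include <stdbool.h>

#define NO_AFFINITY (-1)

/* Hypothetical program descriptor: affinity is the assigned I-stream
 * engine number, or NO_AFFINITY if the program can run anywhere. */
struct program {
    int  affinity;
    void (*run)(void);
};

/* A program with a CPU affinity is run only on its assigned I-stream
 * engine; a program without one can be dispatched on any engine. */
bool dispatchable_here(const struct program *p, int this_istream)
{
    return p->affinity == NO_AFFINITY || p->affinity == this_istream;
}
```

Because only one I-stream engine ever executes such a program, the program's private tables are never accessed concurrently, which is exactly why the locking overhead can be skipped.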
Examples of some TPF system functions that must run with a CPU affinity are non-module-related I/O interrupt processing, timer service, and command processing. (What the TPF system calls commands are called operator messages or commands in other operating systems.)
The fundamental reasons for assigning a CPU affinity to a system program are:
- To eliminate the overhead of locking when the main storage tables that the program modifies are not shared with programs running on other I-stream engines.
- To allow serially reusable programs designed for a uniprocessor environment to run without modification in a tightly coupled environment.
The trade-offs for CPU affinity are:
- Only one copy of the program can run at any one time, on its assigned I-stream engine, and that engine can become overloaded if too much work is restricted to it.
- Work that originates on another I-stream engine must be transferred to the I-stream engine that has the affinity, which adds overhead.
- CPU affinity is not appropriate when several programs running on different I-stream engines access a single main storage table, unless locks are incorporated.
In the TPF system, for the purposes of CPU affinity, I-stream engines are assigned either as the main I-stream engine or as an application I-stream engine (there is only one main I-stream engine in an ESA configuration). The I-stream engine (CPU) to which system programs are assigned CPU affinity is determined during initial program load (IPL). The assignments vary with the number of I-stream engines existing in the ESA configuration being IPLed. Applications (that is, Entries) can be run in either I-stream engine category. In a uniprocessor environment, all programs have CPU affinity to the only available I-stream engine, which serves as both the main I-stream engine and the application I-stream engine.
There are some system programs that must be assigned an affinity to a unique I-stream engine within an ESA configuration, called the main I-stream engine. This I-stream engine is the one that is IPLed from the functional console when the TPF system is first being loaded. The timer service is an example of a function that must be assigned an affinity to the main I-stream engine. The main I-stream engine can be used for running any serially reusable system program that modifies data that is not shared among the other I-stream engines in the ESA configuration. So the main points are:
- Some system programs must be assigned an affinity to the main I-stream engine.
- Any other serially reusable system program that modifies only data not shared among the other I-stream engines can also be given an affinity to the main I-stream engine.
An I-stream engine to which no system programs, or only a restricted set of them, are assigned CPU affinity is called an application I-stream engine. Application I-stream engines are all the I-stream engines in an ESA configuration, if any, other than the main I-stream engine.
The Multi-Processor Interconnect Facility (MPIF) feature is a function that can be assigned CPU affinity. By design, it can be assigned to the application I-stream engine that is designated as I-stream engine 2.
There are some performance implications resulting from the way that tightly coupled multiprocessing is implemented in the TPF system:
- The locking facilities are built into the shared system programs, so some locking overhead is incurred even when the TPF system runs in a uniprocessor environment.
- Each I-stream engine incurs the overhead of the lock mechanism whenever it uses a section of code that accesses shared data.
- Functions that run with a CPU affinity require work to be transferred to the owning I-stream engine from the I-stream engine where the request originated.
Although this performance degradation in a uniprocessor environment is somewhat dependent on the application design, the penalty is considered worthwhile because greater performance and flexibility can be achieved in a tightly coupled environment. Moreover, the underlying speed of current large processor models more than compensates for the overhead of the locking facilities.