Concepts and Structures
Performance sensitivity is part of the fabric of the principal function offered by the TPF system, that is, processing messages that are requests for information from a large centralized database.
The TPF system is designed on the assumption that each of the end users generates messages (the component parts of transactions) that require only small or trivial amounts of CPU processing. CPU processing per message is trivial when compared with the delays inherent in the communication facilities and in accessing information from the large database. The term trivial processing is used in computer science literature to describe work units that require very little processing resource, which certainly characterizes the work units managed by the TPF system. However, because trivial processing is easily confused with the processing of an unimportant request, the phrase expeditious processing is used in this section to suggest prompt and efficient service of presumably important requests, each requiring very little CPU processing power. The meaning of performance within the environments using the TPF system dictates this expeditious processing assumption.
Performance within the TPF system means a designated response to a benchmark message at a designated message rate (given in messages per second). This aspect of performance is based upon the system response time to any end user message and not upon the length of time to complete a transaction. Of course, if the system response time fluctuates greatly, the performance of the end users is affected. Within a given installation, an example of a service level statement can be:
"The system will be capable of providing a three second message response time to 95% of the end users during the intervals when the system is also processing 1500 messages per second. For any message rate, the response time is based upon an average message that causes 50 000 CPU instruction executions and 10 accesses to data on the physical DASD surfaces."
The benchmark message is an important ingredient for making specific performance statements. A business enterprise contemplating installation of a TPF system must identify its benchmark message. This requires familiarity with the envisioned applications as well as with the TPF system structure. The number of file accesses, for example, depends upon an awareness of the TPF database support; the number of instructions executed depends upon application processing as well as upon the instructions executed to provide system services, such as obtaining messages from a communications network. The identification of a benchmark message is nontrivial and is a joint effort of application and system personnel. Performance issues are only indirectly the subject matter of this publication. The subject matter of this publication is the system structure in support of the computing resources, which are influenced by performance issues.
Response time is relative to an individual end user who is not interested in the resource utilization or how many other users happen to be using the system. It turns out that an end user is usually most interested in a response at the very same time that many other users are also interested in their responses. That is, all the end users are busy simultaneously as, for example, can be the case for bank tellers at various branches throughout a city during lunch hour.
A line representing response time in Figure 5 shows the principal delays that contribute to the time needed to respond to a message. This figure is a representation of a message requiring a total elapsed time of 3 seconds from the moment an end user makes a request at a terminal or workstation until the reply begins to appear at the user's terminal or workstation; the total elapsed time is called response time in the TPF vernacular. The measurement of response time begins at the instant the end user enters a request (message) and includes the time that the message travels over communication lines to a communication controller, and arrives through the channel subsystem at a CPU in a central processing complex (CPC). At the CPC, programs are invoked to process the message by accessing a large database, formatting a reply message, and sending the reply. The time required for this to occur is also included in the response time.
Figure 5 shows a CPU occupancy of 1/2 second or 500 milliseconds (ms). CPU occupancy includes the creation of a control block that identifies the programs and data necessary to perform the message processing. Although not all the programs and data need reside in main storage during the processing of the message, the control block does. During this CPU occupancy, most of the time is usually spent waiting for I/O to be completed and very little time executing CPU instructions. The CPU occupancy of any given message consists of several intermixed processing intervals and I/O delays. Because the I/O gaps represent most of the delay while the message is in the CPU, the number of channels to secondary storage, queueing disciplines, and the organization of data are very important in order to maintain fast response times at peak periods; these issues are related to minimizing the length of the I/O gaps.
The contribution of the processing intervals to the response time delay of a single message is very slight. A very slight processing interval, of course, depends upon the power of the CPU doing the processing. In the example response time line of Figure 5, the I/O delays can account for 494 milliseconds and the processing intervals for 6 milliseconds. Clearly, if the system responds to only one user, the CPU can be orders of magnitude slower without any dramatic change in the single user's response. Multiprogramming is employed to utilize a single CPU on behalf of other messages for other transactions during the I/O delays for a given message. Multiprocessing is used to allow multiple CPUs to share the processing load of very high message rates.
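The arithmetic behind this overlap can be sketched with the Figure 5 example numbers. The sketch below is illustrative only: the 500 ms occupancy and 6 ms of instruction execution come from the example in this section, not from measurements of any real configuration.

```python
# Sketch using the Figure 5 example numbers (illustrative, not measured):
# a message occupies the CPU for 500 ms of elapsed time, of which only
# 6 ms is instruction execution and the rest is waiting on I/O.
occupancy_ms = 500.0   # elapsed CPU occupancy per message
cpu_busy_ms = 6.0      # instruction-execution time per message

# Fraction of the occupancy the CPU would sit idle with only one message:
idle_fraction = (occupancy_ms - cpu_busy_ms) / occupancy_ms

# Approximate multiprogramming level needed to fill those I/O gaps
# with processing intervals of other messages:
overlap_level = occupancy_ms / cpu_busy_ms

print(f"idle fraction: {idle_fraction:.1%}")   # 98.8%
print(f"overlap level: {overlap_level:.0f} concurrent messages")
```

With these example values the CPU is idle almost 99% of a single message's occupancy, which is why on the order of dozens of messages must be in flight concurrently before the CPU itself becomes the constraint.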
TPF systems are generally used in environments where economies are realized by an affinity of large volumes of data with sufficiently powerful CPUs to service thousands of users at peak periods. The TPF system is used in some environments to hold shared network data in order to distribute processing function throughout a network of processing centers. For example, the system is used in credit verification applications to process a credit inquiry or to route the inquiry between an agent and an appropriate processing center.
Figure 5. Response Time (per Message)
The number of messages processed over a given interval of time is called system throughput. A business enterprise must identify its projected peak message rate in order to assess whether the TPF system is an appropriate solution or not.
The number of CPU instructions required to process a message is frequently called path length. The instructions include the application processing as well as system services, such as receiving the message, transmitting the response, accessing the online database, and processing interrupts. The path length of the response time line given in Figure 5 is represented by the thin vertical processing interval lines. The path length of a benchmark message is a composite, or average, of the path lengths of different kinds of messages actually processed in the system. The number of instructions executed each second by a CPU over a peak period is obtained by multiplying the path length of the benchmark message by the average number of messages processed each second during a peak period. Note that sufficient resources to deliver messages to the CPU must be available.
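The composite path length is a rate-weighted average over the message mix. The mix below is entirely hypothetical (this section does not give a real breakdown); it is chosen only so that the weighted average works out to the 50 000-instruction benchmark message used in the earlier service level example.

```python
# Hypothetical message mix; the message types, rates, and path lengths
# are invented for illustration and are not taken from any real system.
message_mix = [
    # (messages per second, path length in instructions)
    (900, 40_000),   # e.g., simple inquiries
    (450, 60_000),   # e.g., record retrieval and update
    (150, 80_000),   # e.g., more complex processing
]

total_rate = sum(rate for rate, _ in message_mix)   # 1500 msg/s
composite_path_length = (
    sum(rate * length for rate, length in message_mix) / total_rate
)

print(composite_path_length)   # 50000.0 instructions per benchmark message
```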
The instruction rate required over the peak period is generally given in millions of instructions per second (MIPS), which identifies the processing power required of a CPU to handle the message rate. This, of course, can also be stated in terms of messages per second, a statement of throughput, not response time. The throughput requirement of a single CPU can be shown by the summation of a sufficient number of processing interval lines that add up to the solid dark line in Figure 6. The solid dark line is obtained by summing the maximum number of messages that can be processed by a given CPU during the interval determined by the CPU occupancy of a single message. In the example of Figure 5, this interval is given as 1/2 second. Therefore, the number of response time lines from Figure 5 required to saturate the CPU (shown in Figure 6) represents the maximum number of messages that a given CPU can process in 1/2 second.
The relationship between MIPS and messages per second is simple:
MIPS = (Messages per Second * Instructions per Message) / 1,000,000
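Applied to the service level example given earlier in this section (1500 messages per second, a benchmark path length of 50 000 instructions), the formula yields:

```python
# Numbers taken from the service level example in this section.
messages_per_second = 1500
instructions_per_message = 50_000   # benchmark message path length

mips = messages_per_second * instructions_per_message / 1_000_000
print(mips)   # 75.0 MIPS to sustain the peak message rate
```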
Figure 6. Throughput (Messages per Second)
A well-known result from queueing theory is that if all CPUs are utilized 100% of the time, there are messages waiting to be processed. This does not mean that a CPU at 100% utilization is not capable of delivering its peak capacity. It does mean that a message arriving at a processing center must wait in line for a CPU to become available. (As an example, consider what happens to you during the busy lunch hour at a bank in a large city. In this case, you are the message and the teller is the CPU. Assume that fatigue has not overcome the teller, a fair assumption to make about real CPUs.) The waiting time to get into the CPU is incorporated in the wavy line of Figure 5. This waiting time becomes very large for every message at 100% CPU utilization. Therefore, a rule of thumb when planning a TPF configuration is to design for a CPU utilization that does not exceed a measured 85% during peak periods. This value allows for variations in the message load during peak periods and for growth projections, as well as for deviations from your system plan.
This utilization figure is based upon a little theory and many years of observation. Thus, the MIPS in the previous formula are divided by 0.85, which has the effect of increasing the CPU power necessary to deliver the required response time at the designated peak load. This should provide an idea of the relationship between response time and system throughput; both are related to performance but are in conflict with each other. (For example, you get better service at the bank when not all the tellers are busy; good for you, bad for the bank.)
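Continuing the running example, the 85% planning ceiling turns the 75 MIPS raw requirement into roughly 88 MIPS of installed capacity:

```python
# Numbers carried over from this section's service level example.
raw_mips = 1500 * 50_000 / 1_000_000   # 75.0 MIPS at 100% utilization

# Divide by the 85% planning ceiling to size the installed CPU power:
planned_mips = raw_mips / 0.85
headroom = planned_mips - raw_mips     # capacity held in reserve for peaks

print(f"{planned_mips:.1f} MIPS installed, {headroom:.1f} MIPS headroom")
# → 88.2 MIPS installed, 13.2 MIPS headroom
```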
An accurate path length of a message must include all CPU instructions executed. This translates to mean:
Every instruction executed must count against a message in an environment dedicated to the fast response time of each message during the periods of high volume message processing.
The TPF system architecture is designed to keep the path length for message processing as short as possible. In other operating systems, the path length of a message increases at peak loads while the path length of a message in the TPF system is relatively constant. In other words, the instructions executed for system processing required at peak message rates are not much different than the instructions required at lower rates. The instructions executed for application processing are assumed to be constant. (In some non-TPF systems, the path length of a message increases significantly as the message load increases because of the complexity of handling additional control blocks.)
Within the TPF system, performance means:
Fast response to expeditious processing demands at peak message volumes. Fast, expeditious, and peak must all be identified by the customer with some assistance from a TPF consultant. The TPF system structure in support of a central processing complex (CPC) is designed to keep the processing on behalf of each message as trivial as possible. A TPF system configuration and the system software are influenced by the requirement for consistent response times at peak loads.
Messages per second represent the throughput component of performance. The TPF system incorporates an architectural philosophy that addresses the response time component of performance, as well as system availability. This may differ from operating systems that do not emphasize transaction processing. Architecturally, response time is improved by:
For example, the logic to access data is integrated in the system software that, in other systems, is frequently found in application appendages called access methods.
A user installation must customize this aspect of the system to match its unique application requirements.
The design of the TPF system is influenced by performance and availability requirements, and results in an architecture with the following characteristics:
For example, multiple outages of one minute over a one month period may be acceptable whereas one outage of multiple minutes would be unacceptable.
The system access techniques are structured to improve access paths to shared data and to minimize the amount of time required for system restarts.
A block has roughly the same meaning as the storage management unit called a page in virtual storage systems.
The system collection of most of the variables necessary for performance measurement does not bias normal online operation because the collection is, in essence, analogous to a gauge; that is, it is operative whether or not the measurements are analyzed. For some measurements, statistical samples are periodically taken throughout the interval of interest. Thus, the overhead that the measurement facilities introduce is almost indiscernible.
These tools are used to execute new and modified applications in a simulated environment. The test tools are used to ensure that applications adhere to interfaces in the TPF system. This eliminates redundant and costly system overhead for validity checking on each and every message in the production environment.
The units of work are called messages or Entries (not tasks or job steps). There is no such thing as Job Control Language (JCL) or a job entry subsystem (JES); the closest TPF analogue to JES is communications control.
The difference between the transaction environment in the TPF system and classical batch-oriented operating systems can usually be traced to the following statements about expeditious processing:
The TPF expeditious processing statement, statement #1, is based on the assumption that all work (individual messages) arriving at a CPU requires small or trivial amounts of the CPU resource. The non-trivial processing statement, statement #2, is based on the assumption that at least some of the individual requests require large amounts of CPU resources. As processing power becomes less expensive, many of the reasons for making the second assumption become less important; for example, a separate processor can be provided for each of the users involved in complex calculations. On the other hand, an increasing number of data processing installations are used to allow business agents to access shared data resources, in which case the need for systems designed to the first assumption becomes more important. The required power of the processor, under the first assumption, is a matter not of computational complexity but of the sheer volume of requests, each requiring only a small amount of the processor resources. A request that requires expeditious processing could be, "Get my powerful processor some shared data," or "Get me to a powerful processor."
Finally, although a TPF system has the capability to communicate with an external system, this discussion of performance applies only to messages that are processed by a CPU within a TPF system. There are other performance considerations when processing is performed by a system external to a central processing complex (CPC) even if the external system is TPF-based. A discussion of performance involving an external system is beyond the scope of this publication.