CICS® recovery manager uses the type-of-restart indicator in its domain state record from the global catalog to determine which type of restart it is to perform. This indicator operates as follows:
You can modify the operation of the recovery manager’s control record by running the recovery manager utility program, DFHRMUTL. The utility can set an autostart record that determines the type of start CICS is to perform, effectively overriding the type-of-start indicator in the control record. See the CICS Operations and Utilities Guide for information about using DFHRMUTL to modify the type of start performed by START=AUTO.
If you shut down a CICS region normally, CICS restarts with a warm restart if you specify START=AUTO. For a warm start to succeed, CICS needs the information stored in the CICS catalogs at the previous shutdown, and the information stored in the system log.
In a warm restart, CICS:
CICS also uses information from the warm keypoint in the system log.
For more information about the warm restart process, see CICS warm restart.
If a CICS region fails, CICS restarts with an emergency restart if you specify START=AUTO. An emergency restart is similar to a warm start but with additional recovery processing, for example to back out any transactions that were in-flight at the time of failure, and thus free any locks protecting resources.
If the failed CICS region was running with VSAM record-level sharing, SMSVSAM converts into retained locks any active exclusive locks held by the failed system, pending the CICS restart. This means that the records are protected from being updated by any other CICS region in the sysplex. Retained locks also ensure that other regions trying to access the protected records do not wait on the locks until the failed region restarts. See the CICS Application Programming Guide for information about active and retained locks.
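The difference between active and retained locks can be illustrated with a small model. The following Python sketch is illustrative only: the `LockTable` class, its method names, and the `GRANTED`/`WAIT`/`LOCKED` return strings are hypothetical and do not correspond to any SMSVSAM or CICS interface. It assumes only what is described above: a failed region's active exclusive locks become retained locks, and a requester that meets a retained lock is rejected immediately rather than made to wait.

```python
from enum import Enum


class LockState(Enum):
    ACTIVE = "active"      # held by a running region; other requesters wait
    RETAINED = "retained"  # held for a failed region; other requesters are rejected


class LockTable:
    """Toy model of exclusive record locks (hypothetical, not an SMSVSAM API)."""

    def __init__(self):
        # record key -> (owning region, lock state)
        self._locks = {}

    def acquire(self, region, key):
        """Return 'GRANTED', 'WAIT', or 'LOCKED' for an exclusive lock request."""
        held = self._locks.get(key)
        if held is None:
            self._locks[key] = (region, LockState.ACTIVE)
            return "GRANTED"
        owner, state = held
        if owner == region:
            return "GRANTED"  # the owner already holds the lock
        if state is LockState.RETAINED:
            return "LOCKED"   # rejected at once; the requester does not wait
        return "WAIT"         # active lock: the requester would queue

    def region_failed(self, region):
        """Convert every active lock owned by the failed region to retained."""
        for key, (owner, state) in list(self._locks.items()):
            if owner == region and state is LockState.ACTIVE:
                self._locks[key] = (owner, LockState.RETAINED)

    def release_all(self, region):
        """Release the region's locks, for example once restart backout completes."""
        self._locks = {k: v for k, v in self._locks.items() if v[0] != region}
```

In this model, a second region that requests a record held by a running region is told to wait, but after `region_failed` is called for the owner the same request returns `LOCKED` straight away, and succeeds only after `release_all`.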
For non-RLS data sets (including BDAM data sets), any locks (ENQUEUES) that were held before the CICS failure are re-acquired.
Most of CICS initialization following an emergency restart is the same as for a warm restart, and CICS uses the catalogs and the system log to restore the state of the CICS region. Then, after the normal initialization process, emergency restart performs the recovery process for work that was in-flight when the previous run of CICS was abnormally terminated.
During the final stage of emergency restart, the recovery manager uses the system log data to drive backout processing for any UOWs that were in-flight at the time of the failure. The backout of UOWs during emergency restart is the same as a dynamic backout; there is no distinction between the backout that takes place at emergency restart and that which takes place at any other time.
The recovery manager also drives:
The recovery manager drives these backout and commit processes because the condition that caused them to fail may be resolved by the time CICS restarts. If the condition that caused a failure has not been resolved, the UOW remains in backout- or commit-failed state. See Backout-failed recovery and Commit-failed recovery for more information.
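As a rough illustration of this retry behavior, the following Python sketch models the outcome for each unit of work at restart. It is a simplification under stated assumptions: the `Uow` class, the state strings, and the `unavailable` set of resources are all hypothetical, and the sketch assumes that an in-flight UOW whose backout cannot complete ends up in backout-failed state.

```python
from dataclasses import dataclass

IN_FLIGHT = "in-flight"
BACKOUT_FAILED = "backout-failed"
COMMIT_FAILED = "commit-failed"
BACKED_OUT = "backed-out"
COMMITTED = "committed"


@dataclass
class Uow:
    uow_id: str
    state: str
    resource: str  # the resource whose availability decides the outcome


def recover_at_restart(uows, unavailable):
    """Drive backout or commit for each UOW (hypothetical model).

    `unavailable` is the set of resources whose failure condition
    has not been resolved by the time of the restart.
    """
    for uow in uows:
        blocked = uow.resource in unavailable
        if uow.state in (IN_FLIGHT, BACKOUT_FAILED):
            # Backout is driven; it fails again if the resource is still unavailable.
            uow.state = BACKOUT_FAILED if blocked else BACKED_OUT
        elif uow.state == COMMIT_FAILED:
            # Commit is retried; the UOW stays commit-failed if still blocked.
            uow.state = COMMIT_FAILED if blocked else COMMITTED
    return uows
```

For example, with `unavailable={"FILEB"}`, a backout-failed UOW on `FILEB` stays backout-failed, while a commit-failed UOW on `FILEA` is committed.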
For more information about the emergency restart process, see CICS emergency restart.
On a cold start, CICS reconstructs the state of the region from the previous run for remote resources only. For all resources, the region is built from resource definitions specified on the GRPLIST system initialization parameter and those resources defined in control tables.
The following is a summary of how CICS uses information stored in the global catalog and the system log on a cold start:
Generally, to perform a cold start you specify START=COLD, but CICS can also force a cold start in some circumstances when START=AUTO is specified. See the CICS System Definition Guide for details of the effect of the START parameter in conjunction with various states of the global catalog and the system log.
If you want to initialize a CICS region without reference to the global catalog from a previous run, perform an initial start. You can do this by specifying START=INITIAL as a system initialization parameter, or by running the recovery manager’s utility program (DFHRMUTL) to override the type of start indicator to force an initial start.
See the CICS Operations and Utilities Guide for information about the DFHRMUTL utility program.
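The way the START parameter, the type-of-restart indicator, and a DFHRMUTL autostart override combine can be summarized in a short sketch. The Python function below is a simplified model, not CICS code: the function name, the argument names, and the string encodings of the indicator and the override are assumptions made for illustration. It deliberately does not model the circumstances in which CICS forces a cold start under START=AUTO; see the CICS System Definition Guide for those.

```python
def resolve_start_type(start_param, indicator, autostart_override=None):
    """Return the kind of start for a toy model of the rules described above.

    start_param:        'AUTO', 'COLD', or 'INITIAL' (the START parameter)
    indicator:          'warm' after a normal shutdown, 'emergency' after a
                        failure (hypothetical encoding of the indicator)
    autostart_override: None, or a start type set by DFHRMUTL
                        (hypothetical encoding of the autostart record)
    """
    if start_param == "COLD":
        return "cold"
    if start_param == "INITIAL":
        return "initial"
    if start_param != "AUTO":
        raise ValueError(f"unsupported START value: {start_param!r}")
    # START=AUTO: an autostart override takes precedence over the indicator.
    if autostart_override is not None:
        return autostart_override
    return indicator
```

In this model, `resolve_start_type("AUTO", "emergency")` returns `"emergency"`, and `resolve_start_type("AUTO", "warm", "initial")` returns `"initial"` because the override takes precedence.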
If a CICS region is connected to an SMSVSAM server when the server fails, CICS continues running, and recovers using a process known as dynamic RLS restart. An SMSVSAM server failure does not cause CICS to fail, and does not affect any resource other than data sets opened in RLS mode.
When an SMSVSAM server fails, any locks for which it was responsible are converted to retained locks by another SMSVSAM server within the sysplex, thus preventing access to the records until the situation has been recovered. CICS detects that the SMSVSAM server has failed the next time it tries to perform an RLS access after the failure, and issues message DFHFC0153. The CICS regions that were using the failed SMSVSAM server defer in-flight transactions by abending UOWs that attempt to access RLS, and shunt them when the backouts fail with "RLS is disabled" responses. If a unit of work is attempting to commit its changes and release RLS locks, commit failure processing is invoked when CICS first detects that the SMSVSAM server is not available (see Commit-failed recovery).
RLS mode open requests and RLS mode record access requests issued by new units of work receive error responses from VSAM when the server has failed. The SMSVSAM server normally restarts itself without any manual intervention. After the SMSVSAM server has restarted, it uses the MVS™ event notification facility (ENF) to notify all the CICS regions within its MVS image that the SMSVSAM server is available again.
CICS performs a dynamic equivalent of emergency restart for the RLS component, and drives backout of the deferred work.
Recovery after the failure of an SMSVSAM server is usually performed automatically by CICS. CICS retries any backout-failed and commit-failed UOWs. In addition to retrying those failed as a result of the SMSVSAM server failure, this also provides an opportunity to retry any backout failures for which the cause has now been resolved. Manual intervention is required only if there are units of work which, due to the timing of their failure, were not retried when CICS received the ENF signal. This situation is extremely unlikely, and such units of work can be detected using the INQUIRE UOWDSNFAIL command.
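The sequence of events in a dynamic RLS restart can be sketched as a small event-driven model. The `Region` class and its method names below are hypothetical and greatly simplified: the sketch collapses abend, failed backout, and shunting into a single step, and treats the ENF notification as a plain method call that retries every shunted unit of work successfully.

```python
class Region:
    """Toy model of one CICS region's view of its SMSVSAM server (hypothetical)."""

    def __init__(self):
        self.server_available = True
        self.shunted = []    # UOW ids deferred after a failed backout or commit
        self.completed = []  # UOW ids whose backout or commit was later completed

    def server_failed(self):
        """The SMSVSAM server fails; the region itself keeps running."""
        self.server_available = False

    def rls_access(self, uow_id):
        """An RLS request from a UOW; the UOW is shunted if the server is down."""
        if self.server_available:
            return "OK"
        if uow_id not in self.shunted:
            self.shunted.append(uow_id)
        return "SHUNTED"

    def server_restarted(self):
        """Stands in for the ENF notification: retry all shunted UOWs."""
        self.server_available = True
        self.completed.extend(self.shunted)
        self.shunted.clear()
```

The point of the model is that the region never stops: work that touches RLS while the server is down is set aside, and the notification that the server is back is what triggers the retry.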
Note that an SMSVSAM server failure causes commit-failed or backout-failed units of work only in the CICS regions registered with the server in the same MVS image. Transactions running in CICS regions in other MVS images within the sysplex are affected only to the extent that they receive LOCKED responses if they try to access records protected by retained locks owned by any CICS regions that were using the failed SMSVSAM server.