OMEGAMON XE on z/OS provides full support for IBM Tivoli Monitoring V6.2.1 or later and provides compatibility with, and exploitation of, z/OS V1.9 and V1.10. In addition, V4.2.0 incorporates new features, enhances existing functionality, and continues the migration of functionality from OMEGAMON classic and OMEGAMON II for MVS.
OMEGAMON XE on z/OS V4.2.0 is designed to run on Tivoli Monitoring V6.2.1 or later and provides support for all Tivoli Monitoring V6.2.1 features. Read about these features in IBM Tivoli Monitoring: User's Guide.
OMEGAMON XE on z/OS supports a mixture of V3.1.0, V4.1.0, and V4.2.0 monitoring agents in your environment during the migration period, so that you can deploy new V4.2.0 monitoring agents to z/OS systems and subsystems along with older monitoring agents of the same product. For more information about running in this mixed environment, refer to the IBM Tivoli OMEGAMON XE and Tivoli Management Services on z/OS: Upgrade Guide.
In V4.2.0, the function modification identifiers (FMIDs) for OMEGAMON XE on z/OS and OMEGAMON II for MVS have been merged to reflect the fact that OMEGAMON II for MVS is a component, not an independent product. The FMIDs for the OMEGAMON II (CUA and "Classic") 3270 components have been merged into the same FMID as the OMEGAMON XE base product to simplify product installation and configuration. All XE agents and 3270 interface modules are now consistently delivered in a single FMID per product across the entire OMEGAMON V4.2.0 family of products. OMEGAMON II components have been reversioned to V4.2.0 for consistency with the base product.
The following common components and SMP/E FMIDs have been reversioned to reflect the current version of Tivoli Management Services:
In the Tivoli Enterprise Portal, all components are identified by the version number in the Managed System Status table view. For example, the Version column for rows in this table view might display the following entries:
The new version numbering feature helps you to quickly identify the version of each monitoring agent in an environment that comprises multiple versions.
In addition, the version number appears in the description of queries in the Situation editor. Multiple queries with the same name and description are required to support upgrade scenarios. The addition of a version number enables you to identify which query is appropriate for a particular agent at a particular version.
A portion of the OMEGAMON XE on z/OS DASD data collection processing is redirected to IBM System z Integrated Information Processors (zIIPs) where available. This frees up the standard processors for other work and can reduce software licensing costs.
Redirection of processing occurs by default, but you can disable the offloading by adding a KM5ZIIPOFFLOAD=NO statement to the KDSENV file, using the new nonstandard parameters editing facility in the Configuration Tool.
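For example, to disable the offload, the following statement is added to KDSENV (in most installations, KDSENV is a member of the RKANPARU runtime library; the exact data set name depends on your configuration):

   KM5ZIIPOFFLOAD=NO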
Suspend and spin locks are part of the serialization functionality in z/OS. Occasionally, tasks can be significantly delayed if a lock holder fails to release the resource in a timely manner. For z/OS 1.10 and higher, OMEGAMON XE on z/OS now provides monitoring services for system locks. RMF data collection must be enabled for lock data to be available.
The suspend and spin lock data has been added to the Enqueue and Reserve Summary workspace. The name of this workspace has been changed to Enqueue, Reserve, and Lock Summary to reflect this addition.
This enhancement provides the capability to monitor Workload Manager (WLM) service and report classes that ran work units at a promoted dispatching priority.
It is no longer unusual for users to run zSeries systems at full capacity for extended periods. As a result, workloads may wait for extended time periods. In many cases this is not a problem: work of lower importance must wait until resources become available. However, in some cases this can cause a priority inversion: for example, low-priority work obtains a resource that higher-importance work is waiting on, but is itself blocked by a large CPU consumer of medium importance. The high-priority work is then, in effect, blocked by the medium-priority work.
With z/OS 1.9, WLM addresses this issue by granting limited, or trickle, CPU access to work units that cannot get hold of a CPU for an extended period of time. WLM periodically examines the IN queue and identifies work units that have been CPU starved. Then, the dispatching priority of these work units is temporarily raised (promoted) to allow execution of a small number of instructions (so-called trickle support). The assumption is that such short periods of CPU access do not harm high importance work and could help low importance work to release locks and other critical resources.
OMEGAMON XE on z/OS V4.2.0 allows users to identify and monitor workloads that may be exploiting the blocked workload enhancement. The WLM Service Class Resources workspace now identifies workloads that ran work units at a promoted dispatching priority and indicates the percentage of time that the workload ran at a promoted priority.
Resource Management Facility (RMF) data is now available for system lock and cross-coupling facility (XCF) data, as well as for coupling facility data. A configuration variable allows you to configure collection for all RMF-supplied data, for lock data only, or for coupling facility and cross-coupling facility data only. The default is no RMF-supplied data.
The z/OS Distributed File Service (DFS™) zSeries File System (zFS) is a z/OS UNIX file system that can be used in addition to the Hierarchical File System (HFS). zFS provides significant performance gains in accessing files approaching 8K in size that are frequently accessed and updated. The access performance of smaller files is equivalent to that of HFS.
OMEGAMON XE on z/OS V4.2.0 introduces monitoring of statistics for zFS kernel data, storage, metadata cache, directory cache, and user cache.
OMEGAMON XE on z/OS V4.2.0 also introduces monitoring of UNIX System Services socket usage:
Capacity provisioning management (CPM), introduced on the System z10 platform, provides the ability to automatically "provision" physical processors to and from a Central Processor Complex (CPC). As processors are provisioned, the capacity of the CPC changes, and two new capacity indicators, Model Temporary Capacity and Model Permanent Capacity, come into play.
Currently, a single Model Capacity is represented by the model number associated with the full capacity rating of an individual machine (for example, 2097-718), as well as the maximum millions of service units (MSU) capacity that can be delivered. With the introduction of On/Off Capacity On Demand (OOCoD) and Capacity Backup (CBU), the effective Model Number associated with different numbers of provisioned CPUs changes, based on the new aggregate MSU capacity that can be delivered. These changes affect software licensing charges based on the capacity increases and decreases.
OMEGAMON XE on z/OS V4.2.0 reports on all three model capacity indicators and their associated MSU capacity ratings, instead of the single Model Capacity reported in previous versions. Reporting of the new model capacities and associated capacity ratings provides information that can be used to determine what CPU resource is available within a CPC for delivery to workloads at any point in time, or within a historical timeframe. The identifiers reflect manual actions performed to increase or decrease CPC capacity, as well as the effects of automatic provisioning as determined by a CPM policy.
To address increasing workload demand for processor cycles and high-speed memory access, z/OS on the IBM System z10 implements a new approach to dispatching work. HiperDispatch throughput improvements have been achieved by making z/OS aware of the underlying physical topology of configured processors. z/OS can use this awareness to attempt to re-dispatch a unit of work repeatedly on the same physical CPU, or on a collection of physically adjacent CPUs (an affinity node), to increase the chances of obtaining data from the processor L1, L1.5, or L2 cache, instead of incurring a time delay by going to main memory.
Support for HiperDispatch was introduced as an interim feature in OMEGAMON XE on z/OS V4.1.0, to coincide with the launch of System z10. A new HiperDispatch Management attribute has been added to the System CPU Utilization attribute group and is displayed in the System CPU Utilization workspace. A link from the workspace connects to a new HiperDispatch Details workspace, which displays data from two new attribute groups: the HiperDispatch Logical Processors attributes, which show processor type and HiperDispatch statistics such as priority, share percentage, and status for each processor configured in an LPAR; and the HiperDispatch Management attributes, which provide information about the weight and status of a logical processor when the LPAR is in HiperDispatch mode.
In addition, the OMEGAMON for MVS HDSP command now displays HiperDispatch management metrics.
Extended Address Volumes (EAV) is a new concept, introduced in z/OS V1R10 (1.10), designed to deal with the continued growth in z/OS disk storage requirements and the constraints that users currently face in that area. EAV increases the amount of addressable DASD storage per volume beyond the 65,520 cylinder limit to an architectural maximum of 268,434,453 cylinders per volume by changing how tracks on Extended Count Key Data (ECKD) volumes are addressed. This allows the z/OS operating system, which is limited to 65,535 devices, to satisfy future disk storage requirements by increasing the individual capacity, rather than the total number, of DASD volumes.
Support for EAV has also been added to OMEGAMON for MVS. The SEEK, SVOL, DSN, and DSNV OMEGAMON commands have been updated.
z/OS V1.9 introduced the ability to reuse address space identifiers (ASIDs). ASID reuse provides relief for z/OS users who previously had to schedule IPLs to reclaim ASIDs that had become non-reusable (that is, address spaces that produce message "IEF352I ADDRESS SPACE UNAVAILABLE" on termination), by allowing them to create address spaces with reusable ASIDs.
OMEGAMON XE on z/OS V4.2.0 introduces support for this function by allowing OMEGAMON XE on z/OS started tasks to be started with the new z/OS REUSASID=YES start command parameter.
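For example, assuming a monitoring agent started task named CANSM2 (started task names vary by installation), the operator might issue:

   S CANSM2,REUSASID=YES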
PTF UA37434 added support to the OMEGAMON for MVS interface for the following data:
For detailed information on new commands, see IBM Tivoli OMEGAMON XE on z/OS: OMEGAMON for MVS User's Guide and the online help.
A new z/OS System Overview workspace summarizes key performance aspects of the LPAR. This workspace is the default workspace for each managed system item in the Navigator.
This attribute, added to the Address Space Bottlenecks attribute group, helps you identify looping jobs.
The CPU Loop Index is a percentage value representing the sum of all CPU, zIIP, zIIP on CP, zAAP, and zAAP on CP using and waiting counts, divided by total sample count. For CPU looping jobs, this value is usually above 98%.
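As an illustration, consider the following hypothetical sample counts:

   CPU using and waiting counts                    = 96
   zIIP, zIIP on CP, zAAP, and zAAP on CP counts   = 3
   Total sample count                              = 100

   CPU Loop Index = (96 + 3) / 100 = 99%

Because 99% is above the typical 98% threshold, this address space is a candidate for loop investigation.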
The CPU Loop Index appears in the Address Space Bottlenecks Summary workspace, the Address Space Bottlenecks Detail workspace, and the Address Space Bottlenecks and Impact Analysis workspace. A new situation, KM5_CPU_Loop_Warn, alerts you to potentially looping address spaces. (Note that very CPU-intensive jobs may show a high value without being in a loop, so this value is a guide, not a guarantee.)
If you are using IBM Tivoli Enterprise Console (TEC) or IBM Tivoli Netcool/OMNIbus, in addition to IBM Tivoli Monitoring, to manage events in your enterprise, you can now forward events reported by OMEGAMON XE on z/OS monitoring agents to these event management products. The benefits of these products and the details of how they can be integrated with IBM Tivoli Monitoring are described in the "Event integration scenarios" section of IBM Tivoli Monitoring: Installation and Setup Guide.
Before events can be forwarded, event forwarding must be enabled on the hub monitoring server, and a default destination server must be defined. In addition, the TEC or Netcool/OMNIbus server (the event server) must be configured to receive the events, a situation update forwarding process must be installed on the event server, and, for events forwarded to TEC, a baroc file for the agent must be installed and imported on the event server. The IBM Tivoli Monitoring: Installation and Setup Guide provides detailed instructions for enabling event forwarding from a distributed Tivoli Enterprise Monitoring Server and for configuring TEC and OMNIbus to receive the events, including installing the event synchronization component and the baroc files. IBM Tivoli Monitoring: Configuring IBM® Tivoli Enterprise Monitoring Server on z/OS® provides instructions for configuring a hub monitoring server on z/OS and locating the agent baroc files.
After situation forwarding is enabled, by default all situation events are forwarded to the specified event server. However, you can customize which situation events are forwarded and to which event server, using the Situation editor in the Tivoli Enterprise Portal. You may also need to assign an event status compatible with the target event server. For information on specifying which situation events to forward, see the Tivoli Enterprise Portal online help and the IBM Tivoli Monitoring: User's Guide.
OMEGAMON XE on z/OS now provides the following address space resource and storage information, previously available only in OMEGAMON for MVS:
The corresponding attributes have been added to the Address Space Real Storage attribute group.
The OMEGAMON XE on z/OS Address Space Overview workspace has been restructured to consolidate and update workspace content, and a new Address Space Details for Job workspace has been introduced.
The Selected Execution States view that shows bottleneck data has been eliminated to improve the overall workspace performance. Bottleneck data can still be accessed through a link from the Address Space Counts view.
The CPU Usage view has been expanded to include enclave CPU usage.
The data in the Address Space CPU Utilization Summary view is now returned by the agent in descending CPU percentage order. You can quickly see which address spaces are using the most CPU, even when the result set is split across multiple pages within the view.
Several of the views in this workspace have been rearranged.
Dynamic linking from OMEGAMON XE to its OMEGAMON for MVS interface is made possible by the support for dynamic terminal integration available with IBM Tivoli Monitoring V6.2.1. Dynamic terminal integration is an extension to the Tivoli Enterprise Portal that provides seamless access to TN3270-based applications through context-sensitive links.
A Tivoli Enterprise Portal terminal view enables you to connect to any TN3270, TN5250, or VT100 host system with TCP/IP from inside a Tivoli Enterprise Portal workspace. For 3270 or 5250 terminal views, you also have scripting capability with record, playback, and authoring of entire scripts. By associating a terminal view with a connection script and a query that returns appropriate values, you can configure a view that opens to a specific panel of a 3270 application. This feature is useful for creating contextual workspace links for investigating issues.
OMEGAMON XE on z/OS has taken advantage of this capability to create predefined links from several workspaces to target workspaces that contain a related OMEGAMON for MVS screen in a Terminal Emulator view. The data used to connect to the target screens is retrieved from environmental variables specified during configuration of OMEGAMON XE on z/OS monitoring agents using the Configuration Tool.
OMEGAMON XE on z/OS provides seven launch points to four target workspaces:
Like the predefined situations provided with the product, these predefined links are intended as examples that you can build on to create your own links, using instructions found in the Tivoli Enterprise Portal help.
Because of the large DASD volume counts that have become common in recent years, monitoring DASD devices without a filter that eliminates some of the devices can lead to high CPU usage or storage problems, and can even cause the monitoring server to fail. Consequently, the behavior of OMEGAMON XE on z/OS has been modified so that it does not collect DASD device data unless a DASD filter is active. An auto-started warning situation (KM5_No_Sysplex_DASD_Filter_Warn) notifies you if no filtering situation is in place and no devices are being monitored.
You can turn DASD data collection on by running a DASD filter situation. OMEGAMON XE on z/OS includes a model filter situation (KM5_Model_Sysplex_DASD_Filter), which uses the DASD Device Collection Filtering attributes Average Response Time and I/O Rate. By customizing this filter to exclude well-behaved devices, you can enable monitoring of devices of particular interest and avoid being overwhelmed with unwanted data. A third product-provided situation, KM5_Weak_Plex_DASD_Filter_Warn, alerts you when too many devices are being monitored and the filter criteria should be strengthened.
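For illustration only, a customized filter situation might monitor only devices that are slow to respond or heavily used, with predicates along these lines (the attribute names reflect the Situation editor display, and the thresholds are hypothetical, not recommendations):

   *IF *VALUE DASD_Device_Collection_Filtering.Average_Response_Time *GE 10
   *OR *VALUE DASD_Device_Collection_Filtering.I/O_Rate *GE 100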
See IBM Tivoli OMEGAMON XE on z/OS: User's Guide for instructions on creating a DASD filter situation.
OMEGAMON XE on z/OS includes reports that run under Tivoli Common Reporting, a reporting tool and strategy common across Tivoli products. Tivoli Common Reporting provides a consistent approach to viewing and administering reports. This reporting environment runs on Windows, Linux, and UNIX. For more information about Tivoli Common Reporting platforms, refer to the Tivoli Common Reporting: User's Guide.
For more information about the OMEGAMON XE on z/OS reports, see IBM Tivoli OMEGAMON XE on z/OS: User's Guide.
The product provides integration with Log Analyzer. Log Analyzer is a diagnostic tool included in the IBM Support Assistant (ISA) that links messages that are displayed in logs to information regarding workarounds and solutions that are available as APARs (Authorized Program Analysis Reports). See the IBM Tivoli OMEGAMON XE on z/OS: Troubleshooting Guide for more information regarding ISA and the Log Analyzer.
Interim Feature 1 (IF1) offers enhanced reporting of common storage and storage shortages at the system and address space levels. In addition, the calculation of the CPU Loop Index attribute, introduced in V4.2.0, has been adjusted to reduce or eliminate false positive indications; new statistics have been introduced for HiperDispatch Management and LPAR group unused capacity; and reporting of work-dependent enclaves has been added. Altogether, 7 new workspaces, 3 new attribute groups, and 9 new situations are introduced in IF1, and 17 attributes have been added to 7 existing attribute groups.
A new attribute group, KM5 Common Storage SubKey (KM5COMSTGSK), provides the common storage area information, which includes the number of bytes used and allocated for subpools that are not in fetch-protected memory. The information collected by these attributes is reported in a new Common Storage - Subpools workspace. The Storage Used Percent Area column in the Totals view of the workspace displays a Warning indicator when the percentage used exceeds 90% and a Critical indicator when the percentage exceeds 95%. A new situation, KM5_Avail_CSA_Warning, can alert you when free storage is low.
A new attribute group, KM5 Address Space Storage SubKey (KM5ASSTGSK), provides LSQA (Local System Queue Area) and ELSQA (Extended Local System Queue Area) storage usage information. The information collected by these attributes is reported in the Address Space Storage - Subpools and LSQA workspace.
Because reporting is resource intensive, history recording for this data is available only for address spaces that are being monitored by a running situation. Monitored address spaces are displayed in the Address Space Storage - Subpools and LSQA: Monitored Address Spaces workspace. A new situation, KM5_Job_Subp_Key_Use_Warning, can alert you when utilization of a specified storage subpool and key by any of a similarly named group of address spaces exceeds a target threshold. KM5_Job_Avail_LSQA_Warning can alert you when available LSQA is limited. You can use wildcards in these situations.
See the IBM Tivoli OMEGAMON XE on z/OS: OMEGAMON for MVS User's Guide for information on the new LSQA minor command.
The value reported for Total Size of CSA (Common Storage Area) and ECSA (Extended Common Storage Area) reflects reductions due to SQA and ESQA overflow. A new attribute, Initial Size, provides the initial size of CSA and ECSA at IPL. This attribute has been added to the Common Storage (COMSTOR) attribute group and data from it is reported in the Common Storage workspace.
A new Frames Used Percent attribute reports real storage frame usage as a percentage of all frames. This attribute, added to the Real Storage (REALSTOR) attribute group, is reported in the Real Storage workspace. In the Summary view, the Frames Used Percent column displays a Warning indicator when the percentage of used frames exceeds 90%, and a Critical indicator when the percentage exceeds 95%.
A new attribute group, KM5 Storage Shortage Status (KM5STGSTAT), provides information about storage shortage alerts raised on a given system or LPAR. Three new workspaces report information collected by these attributes: Storage Shortage Alerts, Storage Shortage Alerts Details, and Storage Shortage Alerts Trends. In the Summary view of the Storage Shortage Alerts workspace, the Storage Shortage Level column displays a Warning indicator for warning levels and a Critical indicator for critical levels.
Two new situations, KM5_Storage_Shortage_Warning and KM5_Storage_Shortage_Critical, alert you to storage shortages.
A key performance measurement, parked time, has been added to HiperDispatch reporting. This HiperDispatch-specific measurement is the time that discretionary CPU resources (low-share processors) in an LPAR spend not being considered for CPU dispatching.
In analyzing the performance of an LPAR in HiperDispatch mode, two views of CPU resource consumption are important: the LPAR view and the z/OS view. The z/OS view takes the parked time of each CPU into account when calculating the CPU Busy percentage.
The LPAR CPU utilization information that is currently displayed by OMEGAMON XE on z/OS provides the LPAR view of consumption. Using this information, you can evaluate how much of the LPAR's share of the total Central Processor Complex (CPC) physical CPU resource is being consumed.
The statistics introduced in IF1 provide the z/OS view of CPU consumption by subtracting the CPU's parked time from both the numerator and the denominator of the CPU Percent Busy calculation. A resulting high percentage is usually indicative of latent demand from the LPAR, an important indicator for performance analysis purposes. In addition, the percentage of time during a reporting interval that the CPU was parked is displayed as Parked Pct, and reporting now includes the LPAR-wide LPAR Busy and MVS Busy percentages.
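Expressed schematically, if B is the CPU busy time, T is the measurement interval, and P is the parked time, the two views compare as follows (a simplified sketch of the calculation described above):

   LPAR view:  CPU Busy % = B / T * 100
   z/OS view:  CPU Busy % = (B - P) / (T - P) * 100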
Two new attributes, System MVS Pct and System PCPD Pct, have been added to the HiperDispatch Management (VCMLPAR) attribute group. Two additional attributes, MVS Pct and Parked Pct, have been added to the HiperDispatch Logical Processors (VCMLCPU) attribute group. In the HiperDispatch workspace, two new bar chart views have been added for each processor type, graphically reporting high and low percentages.
A new situation, KM5_HDSP_MVS_Pct_Busy_Warning, issues a warning alert when overall System MVS Percent utilization exceeds the specified threshold on a system running with HiperDispatch Management enabled. On such a system, with Logical Processor Parked Time accounted for in the calculation of MVS Percent utilization, high overall System MVS Percent utilization can be an indication of latent demand.
See the IBM Tivoli OMEGAMON XE on z/OS: OMEGAMON for MVS User's Guide for information on the changes to the HDSP command, which now provides the new HiperDispatch statistics.
Two strategies have been used in IF1 to reduce the possibility of false positive indications.
The first strategy applies to address spaces. The loop index for each address space is now calculated over a longer time base; calculating the loop index over a longer period smooths out spiky indications. In addition, you can extend the period of calculation for any workload by using a longer refresh interval in your situations.
The second strategy applies to low-importance workloads. When the LPAR is very busy, low-importance work can be left waiting for CPU for relatively long periods. These long wait times can lead to false positive loop indications. The strategy taken in this case is to identify low-importance work and increase its period of calculation, so that the work must exhibit looping behavior for a longer period before a warning is raised.
Work is determined to be of low importance if the CPU wait count exceeds the CPU using count. Two levels of importance are considered. At level one, the CPU wait counts are between 50% and 75% of the total loop counts; approximately 1 hour of history is used to calculate the CPU Loop Index for this work, and if the work appears to be looping for that much time, an alert is raised. At level two, the CPU wait counts are more than 75% of the total loop counts; this work is even less important, and therefore more likely to have significant periods of CPU wait without being in a true loop, so approximately 2 hours of history are used to calculate the CPU Loop Index.
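For example, with hypothetical counts:

   CPU wait count = 700 of 1000 total loop counts (70%)  ->  level one, ~1 hour of history
   CPU wait count = 800 of 1000 total loop counts (80%)  ->  level two, ~2 hours of history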
The net effect of these changes is that:
Five new attributes have been added to the System CPU Utilization attribute group:
Values for Group LPAR MSU Limit and Average Unused Group MSUs are available only for LPARs running z/OS V1.11 or above. Only Percent LPAR MSU Capacity is available for LPARs that are not members of an LPAR group. If the current LPAR is not a member of a group, Unavailable is displayed for the remaining attributes.
These new attributes make it possible to determine how close an LPAR group is to implementing capping on its member LPARs, and to identify periods when LPAR Group Capping was in effect.
For example, the Average Unused Group MSUs attribute provides an indication of how close an LPAR group is to implementing capping: the lower the value, the closer LPAR Group Capping is to being implemented. A value of zero or a negative value indicates that LPAR Group Capping is currently in effect.
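As an illustration, assuming the value is derived from the group limit less the combined 4-hour rolling average consumption of the member LPARs:

   Group LPAR MSU Limit                           = 400 MSU
   Combined 4-hour rolling average for the group  = 390 MSU
   Average Unused Group MSUs                      = 400 - 390 = 10 MSU

A value this low indicates that the group is approaching its limit; a value of zero or below would mean that LPAR Group Capping is in effect.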
Two new product-provided situations, KM5_LPAR_MSU_Warn and KM5_LPAR_Cap_Warn, can alert you to capping of an LPAR when its 4-Hour Rolling Average exceeds the defined capacity, or when the LPAR Group Capacity Limit is exceeded by the LPAR and the other members of its LPAR group.
See the IBM Tivoli OMEGAMON XE on z/OS: OMEGAMON for MVS User's Guide for information on the changes to the RMSU minor command, which now provides the capacity statistics.
Work-dependent enclaves are an extension of independent enclaves. An independent enclave can create one or more work-dependent enclaves, all of which inherit the WLM (Workload Manager) characteristics of the independent enclave, including the independent enclave's owning address space as their owner.
Work-dependent enclaves can be created directly by a task or a dependent enclave. In both cases, the work-dependent enclave is converted to a dependent enclave.
Two new attributes, Independent Enclave Token and Number of WorkDependent Enclaves, have been added to the Enclave Detail and Enclave Table attribute groups. Data collected by these attributes is reported in the enclave-related workspaces. A new type, WorkDep, has been added to the Type attribute in both groups.
Two additional attributes, WorkDependent Active Enclave Count and WorkDependent Inactive Enclave Count, have been added to the Address Space CPU Utilization attribute group. Data collected by these attributes is reported in the following workspaces: Address Space CPU Usage Class and Period, Address Space Owning Selected Enclave, z/OS System Overview, Address Space CPU Usage Enclaves, and Enclave Information.
Wildcards can be used in situations that drive the Address Space Storage - Subpools and LSQA data collection. For example, an address space name of abc* collects data for any job whose name begins with abc, while *abc matches any job name containing the abc string.
In z/OS V1.11, the UNIX kernel exposes the zFS address space name. OMEGAMON XE on z/OS now uses the assigned name in reporting USS statistics instead of defaulting to zFS.
z/OS V1.11 now supports up to 99 CPs on an LPAR, and OMEGAMON XE on z/OS has been enhanced to provide base support for more than 64 CPs.