This chapter describes the predefined situations of the product.
A situation is a logical expression involving one or more system conditions. Tivoli OMEGAMON XE for Storage on z/OS uses situations to monitor the systems in your network. To help you begin using Tivoli OMEGAMON XE for Storage on z/OS quickly, the product provides situations that check for system conditions common to many enterprises. You can examine and, if necessary, change the conditions or values being monitored to those best suited to your enterprise. Be sure to start the situations that you want to run in your environment.
You manage situations from the Tivoli management portal using the Situation editor. Using the Situation editor you can perform the following tasks:
When you open the Situation editor, the left frame initially lists the situations associated with the Navigator item you selected. When you click a situation name or create a new situation, the right frame of the Situation editor opens to provide the following information about the situation and allow you to further define that situation:
You can also enter Take Action commands, add a Take Action view to a workspace, select Take Action from the pop-up menu for an item in the Navigator's physical view, or create take action commands and save them for later use.
You can also specify a Storage Toolkit request to be run when a situation becomes true, provided that OMEGAMON Storage for z/OS is installed and a storage table is enabled for Storage Toolkit commands.
The following predefined situations are included in the Tivoli OMEGAMON XE for Storage on z/OS product.
If VALUE S3_Application_Monitoring.High_Dataset_MSR GE 50
Monitors the response time components to determine the reason for a poor response time when an application is accessing a data set and the response time is greater than the critical threshold. Also examine the volume for over-utilization, cache settings, and the response time components at the volume level.
If VALUE S3_Application_Monitoring.High_Dataset_MSR GE 40 AND VALUE S3_Application_Monitoring.High_Dataset_MSR LT 50
Monitors the response time components to determine the reason for a poor response time when an application is accessing a data set and the response time is greater than the warning threshold. Also examine the volume for over-utilization, cache settings, and the response time components at the volume level.
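The two formulas above split the value range so that at most one of them is true for any given value. The following minimal Python sketch illustrates that banding, using the thresholds shown in the formulas. The function name and the "ok" label are illustrative assumptions and are not part of the product.

```python
def classify_dataset_msr(msr):
    """Map a High_Dataset_MSR value to a severity band (hypothetical helper)."""
    if msr >= 50:            # critical formula: GE 50
        return "critical"
    if 40 <= msr < 50:       # warning formula: GE 40 AND LT 50
        return "warning"
    return "ok"              # neither formula is true


for value in (38, 40, 49, 50):
    print(value, classify_dataset_msr(value))
```

If you change one threshold in the Situation editor, adjust the matching bound in the other situation so that the bands stay contiguous.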
If VALUE S3_Cache_Control_Unit.Cache_Status NE Active
Monitors for the condition where caching is not active for the control unit. Use the SETCACHE command to activate caching, if appropriate.
If VALUE S3_Cache_Control_Unit.DFW_Retry_Percent GE 2
Monitors for the condition where the percent of DASD fast write attempts that cannot be satisfied because of a shortage of available nonvolatile storage (NVS) space exceeds the critical threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, move a volume or data set to another cache control unit or add NVS to this control unit.
If VALUE S3_Cache_Control_Unit.DFW_Retry_Percent GE 1 AND VALUE S3_Cache_Control_Unit.DFW_Retry_Percent LT 2
Monitors for the condition where the percent of DASD fast write attempts that cannot be satisfied because of a shortage of available nonvolatile storage (NVS) space exceeds the warning threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, move a volume or data set to another cache control unit or add NVS to this control unit.
If VALUE S3_Cache_Control_Unit.Deactivated_Volumes GE 15
Monitors for the condition where the number of deactivated volumes on the control unit exceeds the critical threshold. You can use the SETCACHE command to activate caching on the volumes, if necessary.
If VALUE S3_Cache_Control_Unit.Deactivated_Volumes GE 10 AND VALUE S3_Cache_Control_Unit.Deactivated_Volumes LT 15
Monitors for the condition where the number of deactivated volumes on the control unit exceeds the warning threshold. You can use the SETCACHE command to activate caching on the volumes, if necessary.
If VALUE S3_Cache_Control_Unit.NVS_Status NE Active
Monitors for the condition where nonvolatile storage (NVS) is not active for the control unit. All writes to volumes on the control unit are written directly to the hard disk drive. Use the SETCACHE command to activate NVS, if appropriate.
If VALUE S3_Cache_Control_Unit.Read_Hit_Percent LE 50 AND VALUE S3_Cache_Control_Unit.Read_Hit_Percent GT 0
Monitors for the condition where the percent of read I/O requests resolved from cache has fallen below the critical threshold. If performance is a problem, look for volumes with a low read hit percent and consider moving them to another control unit to balance the load. This condition can be caused by cache-unfriendly applications or a shortage of cache.
If VALUE S3_Cache_Control_Unit.Read_Hit_Percent LE 60 AND VALUE S3_Cache_Control_Unit.Read_Hit_Percent GT 50
Monitors for the condition where the percent of read I/O requests resolved from cache has fallen below the warning threshold. If performance is a problem, look for volumes with a low read hit percent and consider moving them to another control unit to balance the load. This condition can be caused by cache-unfriendly applications or a shortage of cache.
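For hit percentages, lower values are worse, so the bands run in the opposite direction from the retry and rate situations. The critical formula also excludes a value of exactly 0 through its GT 0 clause. The following Python sketch shows the read hit bands under those formulas. The function name and the "ok" label are illustrative assumptions only.

```python
def classify_read_hit_percent(pct):
    """Severity band for a control unit Read_Hit_Percent value (hypothetical helper)."""
    if 0 < pct <= 50:        # critical formula: LE 50 AND GT 0
        return "critical"
    if 50 < pct <= 60:       # warning formula: LE 60 AND GT 50
        return "warning"
    return "ok"              # above 60, or exactly 0 (excluded by GT 0)


for value in (0, 25, 50, 55, 60, 75):
    print(value, classify_read_hit_percent(value))
```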
If VALUE S3_Cache_Control_Unit.Track_Destaging_Rate GE 70
Monitors for the condition where the rate at which tracks are being removed from cache and written to DASD exceeds the critical threshold. If performance is being impacted, you need to migrate data sets or volumes to another cache control unit. An alternative is to increase the cache capacity.
If VALUE S3_Cache_Control_Unit.Track_Destaging_Rate GE 50 AND VALUE S3_Cache_Control_Unit.Track_Destaging_Rate LT 70
Monitors for the condition where the rate at which tracks are being removed from cache and written to DASD exceeds the warning threshold. If performance is being impacted, you need to migrate data sets or volumes to another cache control unit. An alternative is to increase the cache capacity.
If VALUE S3_Cache_Control_Unit.Track_Staging_Rate GE 70
Monitors for the condition where the movement of tracks from the physical device to cache has exceeded the critical threshold. If performance is impacted, you might need to move the logical volume that is causing the excessive activity or to move data sets on the logical volume.
If VALUE S3_Cache_Control_Unit.Track_Staging_Rate GE 50 AND VALUE S3_Cache_Control_Unit.Track_Staging_Rate LT 70
Monitors for the condition where the movement of tracks from the physical device to cache has exceeded the warning threshold. If performance is impacted, you might need to move the logical volume that is causing the excessive activity or to move data sets on the logical volume.
If VALUE S3_Cache_Control_Unit.Write_Hit_Percent LE 45 AND VALUE S3_Cache_Control_Unit.Write_Hit_Percent GE 0
Monitors for the condition where the percent of DASD/Cache fast write commands that were successfully processed without accessing the volume is below the critical threshold. If performance is impacted, you might need to move a volume or data set to another control unit to balance the workload.
If VALUE S3_Cache_Control_Unit.Write_Hit_Percent LE 50 AND VALUE S3_Cache_Control_Unit.Write_Hit_Percent GT 45
Monitors for the condition where the percent of DASD/Cache fast write commands that were successfully processed without accessing the volume is below the warning threshold. If performance is impacted, you might need to move a volume or data set to another control unit to balance the workload.
If VALUE S3_Channel_Path.Complex_Percent_Utilized GE 85
Monitors high response time for I/O requests to volumes being serviced by the channel due to over-utilization of that channel. You might need to balance the workload between channels by moving volumes or data sets.
If VALUE S3_Channel_Path.Complex_Percent_Utilized GE 70 AND VALUE S3_Channel_Path.Complex_Percent_Utilized LT 85
Monitors high response time for I/O requests to volumes being serviced by the channel due to over-utilization of that channel. You might need to balance the workload between channels by moving volumes or data sets.
If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND VALUE S3_HSM_Function_Summary.Function EQ Backup
Monitors the HSM backup function to see if it is being held. If the hold is inadvertent, issue the HSM RELEASE BACKUP command to allow the backup function to continue processing.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND VALUE S3_HSM_Function_Summary.Function EQ Backup
Monitors the HSM backup queue for a condition where the number of backup requests waiting exceeds the critical threshold. If the number of backup tasks is not at the maximum, issue the HSM SETSYS MAXBACKUPTASKS command to increase the number of backup tasks, thus increasing the processing rate. Keep in mind that the number of available backup volumes serves as a constraint on the number of active backup tasks.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND VALUE S3_HSM_Function_Summary.Function EQ Backup
Monitors the HSM backup queue for a condition where the number of backup requests waiting exceeds the warning threshold. If the number of backup tasks is not at the maximum, issue the HSM SETSYS MAXBACKUPTASKS command to increase the number of backup tasks, thus increasing the processing rate. Keep in mind that the number of available backup volumes serves as a constraint on the number of active backup tasks.
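The backup queue formulas combine a filter on the Function attribute with a threshold on Waiting_Requests, so a row for another HSM function never triggers them, no matter how deep its queue is. The following Python sketch shows that combination. The dictionary layout and the function name are assumptions made for illustration, and only the attribute names and thresholds come from the formulas above.

```python
def backup_queue_severity(row):
    """Evaluate the backup queue formulas against one function summary row.

    `row` is a hypothetical dict keyed by attribute name.
    Returns None when the row is not for the Backup function.
    """
    if row["Function"] != "Backup":      # Function EQ Backup
        return None
    waiting = row["Waiting_Requests"]
    if waiting >= 50:                    # critical: GE 50
        return "critical"
    if 15 <= waiting < 50:               # warning: GE 15 AND LT 50
        return "warning"
    return "ok"


rows = [
    {"Function": "Backup", "Waiting_Requests": 62},
    {"Function": "Backup", "Waiting_Requests": 20},
    {"Function": "Recall", "Waiting_Requests": 80},
]
print([backup_queue_severity(r) for r in rows])  # ['critical', 'warning', None]
```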
If VALUE S3_HSM_CRQplex.Element_Percent_Full GT 80
Monitors the percentage of elements on the Common Recall Queue (CRQ) that are currently in use. HSM throttles the use of the CRQ when the percent utilized reaches 95%. To expand the CRQ structure, issue the SETXCF START,ALTER command.
If VALUE S3_HSM_Cross_System_CRQplex.Entry_Percent_Full GT 80
Monitors the percentage of entries on the Common Recall Queue that are currently in use. HSM throttles the use of the CRQ when the percent utilized reaches 95%. To expand the CRQ structure, issue the SETXCF START,ALTER command.
If VALUE S3_HSM_Cross_System_CRQ_Hosts.HSM_Host_CRQ_State NE Connected AND VALUE S3_HSM_Cross_System_CRQ_Hosts.CRQplex_Base_Name NE n/a
Monitors the state of the host with regard to the Common Recall Queue. To connect an HSM host to the CRQ, issue the HSM SETSYS command.
If VALUE S3_HSM_Cross_System_CRQplex.HSM_Hosts_Not_Connected GT 0
Monitors the number of HSM hosts currently not connected to the Common Recall Queue.
If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Held EQ Yes
Monitors the commonqueue status for this host. This condition can occur if the HOLD COMMONQUEUE command has been issued. To resolve this condition, issue a RELEASE COMMONQUEUE command.
If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Place_Held EQ Internal OR VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Place_Held EQ External OR VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Place_Held EQ Both
Monitors the commonqueue status for this host and whether requests can be placed on the common recall queue. This condition can occur if the HOLD COMMONQUEUE(RECALL(PLACEMENT)) command has been issued or inferred because a HOLD COMMONQUEUE or HOLD COMMONQUEUE(RECALL) was issued. To resolve this condition, issue a RELEASE COMMONQUEUE(RECALL(PLACEMENT)) command.
If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Held EQ Internal OR VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Held EQ External OR VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Held EQ Both
Monitors the commonqueue status for this host and whether requests can be recalled from the common recall queue. This condition can occur if the HOLD COMMONQUEUE(RECALL) command has been issued or inferred because a HOLD COMMONQUEUE was issued. To resolve this condition, issue a RELEASE COMMONQUEUE(RECALL) command.
If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Select_Held EQ Internal OR VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Select_Held EQ External OR VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Select_Held EQ Both
Monitors the commonqueue status for this host and whether requests can be pulled from the common recall queue. This condition can occur if the HOLD COMMONQUEUE(RECALL(SELECT)) command has been issued or inferred because a HOLD COMMONQUEUE or HOLD COMMONQUEUE(RECALL) was issued. To resolve this condition, issue a RELEASE COMMONQUEUE(RECALL(SELECT)) command.
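The three recall hold formulas each test one attribute against the same set of values (Internal, External, or Both), while the general hold formula tests for Yes. The following Python sketch lists which of the four hold attributes are in a held state for one host. The dictionary layout and the function name are assumptions for illustration, and any value outside the ones named in the formulas is treated as not held.

```python
# Values that the recall hold formulas treat as "held".
HELD_VALUES = {"Internal", "External", "Both"}

RECALL_HOLD_ATTRIBUTES = (
    "Host_CRQ_Recall_Place_Held",
    "Host_CRQ_Recall_Held",
    "Host_CRQ_Recall_Select_Held",
)


def crq_holds(host):
    """Return the hold attributes that are true for one host row (hypothetical dict)."""
    held = []
    if host.get("Host_CRQ_Held") == "Yes":        # Host_CRQ_Held EQ Yes
        held.append("Host_CRQ_Held")
    for attribute in RECALL_HOLD_ATTRIBUTES:
        if host.get(attribute) in HELD_VALUES:    # EQ Internal OR External OR Both
            held.append(attribute)
    return held


print(crq_holds({"Host_CRQ_Held": "Yes", "Host_CRQ_Recall_Held": "Both"}))
```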
If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND VALUE S3_HSM_Function_Summary.Function EQ Dump
Monitors the HSM dump function to see if it is being held. If the hold is inadvertent, issue the HSM RELEASE DUMP command to allow dump processing to continue.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND VALUE S3_HSM_Function_Summary.Function EQ Dump
Monitors the HSM dump queue for a condition where the number of dump requests waiting exceeds the critical threshold. If the number of dump tasks is not at the maximum, use the HSM SETSYS MAXDUMPTASKS command to increase the number of dump tasks, thus increasing the processing rate. Keep in mind that the number of available tape drives serves as a constraint on the number of active dump tasks.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND VALUE S3_HSM_Function_Summary.Function EQ Dump AND VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50
Monitors the HSM dump queue for a condition where the number of dump requests waiting exceeds the warning threshold. If the number of dump tasks is not at the maximum, use the HSM SETSYS MAXDUMPTASKS command to increase the number of dump tasks, thus increasing the processing rate. Keep in mind that the number of available tape drives serves as a constraint on the number of active dump tasks.
If VALUE S3_HSM_Status.Inactive_HSM_Hosts GT 0
Monitors when an inactive HSM host has been detected. The event workspace for this situation has a link to the DFSMShsm Host Details workspace.
If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND VALUE S3_HSM_Function_Summary.Function EQ Migration
Monitors the migrate function to see if it is being held. If the hold on the function is inadvertent, issue the HSM RELEASE MIGRATION command to allow migration to continue.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND VALUE S3_HSM_Function_Summary.Function EQ Migration
Monitors the HSM migration queue for a condition where the number of migration requests waiting exceeds the critical threshold. If the number of migrate tasks is not at the maximum, use the HSM SETSYS MAXMIGRATIONTASKS command to increase the number of migration tasks, thus increasing the processing rate. Note that this affects only those migrations requested by automatic functions. Only one task is available to process command migration requests.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND VALUE S3_HSM_Function_Summary.Function EQ Migration
Monitors the HSM migration queue for a condition where the number of migration requests waiting exceeds the warning threshold. If the number of migrate tasks is not at the maximum, use the HSM SETSYS MAXMIGRATIONTASKS command to increase the number of migration tasks, thus increasing the processing rate. Note that this affects only those migrations requested by automatic functions. Only one task is available to process command migration requests.
If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND VALUE S3_HSM_Function_Summary.Function EQ Recall
Monitors the recall function to see if it is being held. If the hold on the function is inadvertent, issue the HSM RELEASE RECALL command to allow recalls to resume.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND VALUE S3_HSM_Function_Summary.Function EQ Recall
Monitors the HSM recall queue for a condition where the number of recall requests waiting exceeds the critical threshold. If the number of recall tasks is not at the maximum, use the HSM SETSYS MAXRECALLTASKS command to increase the number of recall tasks, thus increasing the processing rate.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND VALUE S3_HSM_Function_Summary.Function EQ Recall
Monitors the HSM recall queue for a condition where the number of recall requests waiting exceeds the warning threshold. If the number of recall tasks is not at the maximum, use the HSM SETSYS MAXRECALLTASKS command to increase the number of recall tasks, thus increasing the processing rate.
If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND VALUE S3_HSM_Function_Summary.Function EQ Recovery
Monitors the recovery function to see if it is being held. If the hold on the function is inadvertent, issue the HSM RELEASE RECOVER command to allow the recovery function to resume.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND VALUE S3_HSM_Function_Summary.Function EQ Recovery
Monitors the HSM recovery queue for a condition where the number of recover requests waiting exceeds the critical threshold. If the number of recovery tasks is not at the maximum, use the HSM SETSYS MAXDSRECOVERTASKS command to increase the number of recover tasks, thus increasing the processing rate. Keep in mind that the number of backup tape cartridges serves as a constraint on the number of active recovery tasks.
If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND VALUE S3_HSM_Function_Summary.Function EQ Recovery
Monitors the HSM recovery queue for a condition where the number of recover requests waiting exceeds the warning threshold. If the number of recovery tasks is not at the maximum, use the HSM SETSYS MAXDSRECOVERTASKS command to increase the number of recover tasks, thus increasing the processing rate. Keep in mind that the number of backup tape cartridges serves as a constraint on the number of active recovery tasks.
If VALUE S3_HSM_Status.HSM_Status EQ InActive
Monitors the status of the HSM. If status is not active, restart HSM.
If VALUE S3_Logical_Control_Unit.Average_Delay_Queue GE 0.500
Monitors for the condition where the average number of requests queued to devices assigned to a logical control unit due to busy conditions on physical paths has exceeded the critical threshold. If performance is impacted, you might be able to balance the workload across multiple LCUs by moving a volume or data set. Otherwise, you need to add physical paths to the LCU.
If VALUE S3_Logical_Control_Unit.Average_Delay_Queue GE 0.2 AND VALUE S3_Logical_Control_Unit.Average_Delay_Queue LT 0.500
Monitors for the condition where the average number of requests queued to devices assigned to a logical control unit due to busy conditions on physical paths has exceeded the warning threshold. If performance is impacted, you might be able to balance the workload across multiple LCUs by moving a volume or data set. Otherwise, you need to add physical paths to the LCU.
If VALUE S3_Logical_Control_Unit.Contention_Rate GE 1.001
Monitors for the condition where the rate at which I/O requests are being queued to devices on a logical control unit (LCU) due to busy conditions on physical paths has exceeded the critical threshold. If performance is impacted, you need to migrate volumes or data sets to another LCU. Otherwise, you need to add physical paths to the LCU.
If VALUE S3_Logical_Control_Unit.Contention_Rate GE 0.2 AND VALUE S3_Logical_Control_Unit.Contention_Rate LT 1.001
Monitors for the condition where the rate at which I/O requests are being queued to devices on a logical control unit (LCU) due to busy conditions on physical paths has exceeded the warning threshold. If performance is impacted, you need to migrate volumes or data sets to another LCU. Otherwise, you need to add physical paths to the LCU.
If VALUE S3_Logical_Control_Unit.Channel_Path_I/O_Rate GE 600
Monitors for the condition where the I/O rate per second to volumes in the logical control unit (LCU) has exceeded the critical threshold. If performance is impacted, you need to balance the workload across multiple LCUs by moving volumes or data sets.
If VALUE S3_Logical_Control_Unit.Channel_Path_I/O_Rate GE 200 AND VALUE S3_Logical_Control_Unit.Channel_Path_I/O_Rate LT 600
Monitors for the condition where the I/O rate per second to volumes in the logical control unit (LCU) has exceeded the warning threshold. If performance is impacted, you need to balance the workload across multiple LCUs by moving volumes or data sets.
If VALUE S3_RMM_Control_Dataset.Days_Since_Last_Backup GT 3
The number of days since the last backup of the DFSMSrmm CDS or Journal exceeded the critical threshold.
If VALUE S3_RMM_Control_Dataset.Days_Since_Last_Backup GT 1 AND VALUE S3_RMM_Control_Dataset.Days_Since_Last_Backup LE 3
The number of days since the last backup of the DFSMSrmm CDS or Journal exceeded the warning threshold.
If VALUE S3_RMM_Control_Dataset.RMM_Percent_Used GT 90
The percentage of space used by the DFSMSrmm CDS or Journal is greater than the critical threshold.
If VALUE S3_RMM_Control_Dataset.RMM_Percent_Used GE 80 AND VALUE S3_RMM_Control_Dataset.RMM_Percent_Used LE 90
The percentage of space used by the DFSMSrmm CDS or Journal is greater than the warning threshold.
If ( ( VALUE S3_RMM_Config.EDGUX200_Status NE Enabled ) OR ( VALUE S3_RMM_Config.EDGUX100_Status NE Enabled ) )
The DFSMSrmm EDGUX100 or EDGUX200 exit is not Enabled.
If VALUE S3_RMM_Config.Journal_Status NE Enabled
The DFSMSrmm Journal is either Disabled or Locked. DFSMSrmm does not allow further updates to the journal until BACKUP is run to back up the DFSMSrmm control data set and to clear the journal. If the Journal is Locked, DFSMSrmm fails any requests that result in an update to the DFSMSrmm control data set. Message EDG2103D might also have been issued to the DFSMSrmm operator console.
If VALUE S3_RMM_Config.Operating_Mode NE Protect
DFSMSrmm is not operating in Protect mode. Certain actions that should be rejected are permitted if DFSMSrmm is not operating in Protect mode, for example, attempting to read a scratch tape volume.
If VALUE S3_RMM_Summary.Type EQ 0 AND VALUE S3_RMM_Summary.Scratch_Volumes LT 100
The number of Scratch volumes is below the critical threshold.
If VALUE S3_RMM_Summary.Type EQ 0 AND VALUE S3_RMM_Summary.Scratch_Volumes LT 200 AND VALUE S3_RMM_Summary.Scratch_Volumes GE 100
The number of Scratch volumes is below the warning threshold.
If VALUE S3_RMM_Config.Subsystem_Status EQ Inactive
The DFSMSrmm subsystem is inactive.
If VALUE S3_Storage_Toolkit_Result_Summary.Return_Code GT 4
The batch job submitted by the Storage Toolkit to execute a command or user-defined JCL returns a value greater than 4, or the Storage Toolkit encountered an error while attempting to process a command or user-defined JCL. A value that is greater than 4, and is not specific to the Storage Toolkit, typically denotes that a command failed to complete. If you elected to save the results of the batch job, go to the Storage Toolkit Result Detail workspace to determine whether the error requires further attention.
Values set by the Storage Toolkit when it detects an error while processing a command or user-defined JCL are described in the "Storage Toolkit limitations and hints" topic of the OMEGAMON XE for Storage on z/OS User's Guide (see the description of return codes). If one of these values is returned, also consult the RKLVLOG for additional messages to help determine the cause of the failure.
If VALUE S3_Storage_Toolkit_Result_Summary.Return_Code EQ 4
The batch job submitted by the Storage Toolkit to execute a command or user-defined JCL returns the value 4. A value of 4 typically denotes a warning. If you elected to save the results of the batch job, go to the Storage Toolkit Result Detail workspace to determine whether the warning requires further attention.
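The two Storage Toolkit formulas key on the return code alone: a value greater than 4 typically denotes a failure, and a value of exactly 4 typically denotes a warning. The following Python sketch shows that mapping. The function name and the labels are illustrative assumptions, and the source does not describe how return codes below 4 are interpreted, so the sketch reports only that neither formula is true for them.

```python
def classify_toolkit_return_code(return_code):
    """Map a Storage Toolkit Return_Code to the formula it satisfies (hypothetical helper)."""
    if return_code > 4:      # Return_Code GT 4: typically a failed command or Toolkit error
        return "error"
    if return_code == 4:     # Return_Code EQ 4: typically a warning
        return "warning"
    return "no situation"    # neither formula is true


for rc in (0, 4, 8, 12):
    print(rc, classify_toolkit_return_code(rc))
```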
If VALUE S3_Volume_Group_Summary.Free_Space_Percent LT 5.0 AND VALUE S3_Volume_Group_Summary.Group_Type EQ SMSGROUP AND VALUE S3_Volume_Group_Summary.Free_Space_Percent GE 0.0
Monitors the percentage of free space available for allocation in the storage group and detects when free space has dropped below the critical threshold. To prevent allocation failures, you might have to either add one or more logical volumes to the storage group or move data sets off the logical volumes in the storage group.
If VALUE S3_Volume_Group_Summary.Free_Space_Percent LT 10.0 AND VALUE S3_Volume_Group_Summary.Group_Type EQ SMSGROUP AND VALUE S3_Volume_Group_Summary.Free_Space_Percent GE 5.0
Monitors the percentage of free space available for allocation in the storage group and detects when free space has dropped below the warning threshold. To prevent allocation failures, you might have to either add one or more logical volumes to the storage group or migrate data sets off the logical volumes in the storage group.
If VALUE S3_TotalStorageDS_Array.RAID_Degraded EQ Yes
Monitors the arrays in a TotalStorageDS storage facility for a degraded condition where one or more arrays need rebuilding.
If VALUE S3_TotalStorageDS_Configuration.Number_of_arrays_with_problems GT 0
Monitors for the condition where the number of arrays in the TotalStorageDS storage facility running degraded, throttled, or with an RPM exception exceeds the threshold. The RAID Degraded condition indicates that one or more DDMs in the array need rebuilding. The DDM Throttling condition indicates that a near-line DDM in the array is throttling performance due to temperature or workload. The RPM Exception condition indicates that a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.
If VALUE S3_TotalStorageDS_Array.RPM_Exception EQ Yes
Monitors the arrays in a TotalStorageDS for a condition where a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.
If VALUE S3_TotalStorageDS_Array.DDM_Throttling EQ Yes
Monitors the arrays in a TotalStorageDS for a condition where the array is throttling performance due to overload or temperature.
If VALUE S3_TotalStorageDS_Extent_Pool.Number_of_arrays_with_problems GT 0
Monitors for the condition where the number of arrays in the extent pool running degraded, throttled, or with an RPM exception exceeds the threshold. The RAID Degraded condition indicates that one or more DDMs in the array need rebuilding. The DDM Throttling condition indicates that a near-line DDM in the array is throttling performance due to temperature or workload. The RPM Exception condition indicates that a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.
If VALUE S3_TotalStorageDS_Rank.Number_of_arrays_with_problems GT 0
Monitors for the condition where the number of arrays in the rank running degraded, throttled, or with an RPM exception exceeds the threshold. The RAID Degraded condition indicates that one or more DDMs in the array need rebuilding. The DDM Throttling condition indicates that a near-line DDM in the array is throttling performance due to temperature or workload. The RPM Exception condition indicates that a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.
If VALUE S3_Cache_Devices.DFW_Retry_Percent GE 2 AND VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the percentage of DASD fast write attempts for a volume that cannot be satisfied due to a shortage of available nonvolatile storage (NVS) space exceeded the critical threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, move a volume or data set to another cache control unit or add NVS to this control unit.
If VALUE S3_Cache_Devices.DFW_Retry_Percent GE 1 AND VALUE S3_Cache_Devices.DFW_Retry_Percent LT 2 AND VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the percentage of DASD fast write attempts for a volume that cannot be satisfied due to a shortage of available nonvolatile storage (NVS) space exceeded the warning threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, move a volume or data set to another cache control unit or add NVS to this control unit.
If VALUE S3_Cache_Devices.Read_Hit_Percent LE 45 AND VALUE S3_Cache_Devices.Read_Hit_Percent GE 0 AND VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache read hit percent is below the critical threshold. If performance is impacted, determine the reason for the low read hit percent. Common problems are cache-unfriendly applications and over-utilization of the control unit.
If VALUE S3_Cache_Devices.Read_Hit_Percent LE 55 AND VALUE S3_Cache_Devices.Read_Hit_Percent GT 45 AND VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache read hit percent is below the warning threshold. If performance is impacted, determine the reason for the low read hit percent. Common problems are cache-unfriendly applications and over-utilization of the control unit.
If VALUE S3_Cache_Devices.Write_Hit_Percent LE 20 AND VALUE S3_Cache_Devices.Write_Hit_Percent GE 0 AND VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache write hit percent for a volume is below the critical threshold. Check the status of the nonvolatile storage in the cache control unit. You can move volumes or data sets to balance the workload.
If VALUE S3_Cache_Devices.Write_Hit_Percent LE 30 AND VALUE S3_Cache_Devices.Write_Hit_Percent GT 20 AND VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache write hit percent for a volume is below the warning threshold. Check the status of the nonvolatile storage in the cache control unit. You can move volumes or data sets to balance the workload.
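The volume-level cache formulas add an I/O_Count GE 25 clause, so a volume with little activity in the interval cannot raise the situation even when its hit percentage is low. The following Python sketch shows the write hit bands with that gate applied first. The function name, parameter names, and "ok" label are illustrative assumptions only.

```python
def volume_write_hit_severity(write_hit_percent, io_count):
    """Evaluate the volume write hit formulas for one volume (hypothetical helper)."""
    if io_count < 25:                    # both formulas require I/O_Count GE 25
        return "ok"
    if 0 <= write_hit_percent <= 20:     # critical: LE 20 AND GE 0
        return "critical"
    if 20 < write_hit_percent <= 30:     # warning: LE 30 AND GT 20
        return "warning"
    return "ok"


samples = [(10, 5), (10, 400), (25, 400), (80, 400)]
for pct, count in samples:
    print(pct, count, volume_write_hit_severity(pct, count))
```

Evaluating the activity gate first mirrors the intent of the formulas: the percentage is only meaningful when enough I/O occurred to make it representative.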
If VALUE S3_DASD_Volume_Space.VTOC_Index_Status EQ Disabled
Monitors for the condition where a VTOC index has been disabled. This condition can degrade performance on the volume. Enable the VTOC index.
If VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes AND VALUE S3_DASD_Volume_Space.Track_Managed_Fragmentation_Index GE 850
The fragmentation index in the track managed area of an Extended Address Volume exceeds the critical threshold.
If VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes AND VALUE S3_DASD_Volume_Space.Track_Managed_Fragmentation_Index GE 650 AND VALUE S3_DASD_Volume_Space.Track_Managed_Fragmentation_Index LT 850
The fragmentation index in the track managed area of an Extended Address Volume exceeds the warning threshold.
If VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free LE 5.0 AND VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free GE 0.0 AND VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes
The percentage of free space in the track managed area of an Extended Address Volume is below the critical threshold.
If VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free LE 10.0 AND VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free GT 5.0 AND VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes
The percentage of free space in the track managed area of an Extended Address Volume is below the warning threshold.
If VALUE S3_DASD_Volume_Space.Fragmentation_Index GE 850
Monitors for the condition where a volume has a fragmentation index that exceeds the critical threshold. Defragment the volume so that free extents are combined to help prevent data set allocation failures.
If VALUE S3_DASD_Volume_Space.Fragmentation_Index GE 650 AND VALUE S3_DASD_Volume_Space.Fragmentation_Index LT 850
Monitors for the condition where a volume has a fragmentation index that exceeds the warning threshold. Defragment the volume so that free extents are combined to help prevent data set allocation failures.
If VALUE S3_DASD_Volume_Space.Percent_Free_Space LE 5 AND VALUE S3_DASD_Volume_Space.Percent_Free_Space GE 0
Monitors for the condition where the percentage of free space on a volume is below the critical threshold. If data sets on the volume require more space, then either migrate some data sets to another volume or release space from data sets that might be over-allocated.
If VALUE S3_DASD_Volume_Space.Percent_Free_Space LE 10 AND VALUE S3_DASD_Volume_Space.Percent_Free_Space GT 5
Monitors for the condition where the percentage of free space on a volume is below the warning threshold. If data sets on the volume require more space, then either migrate some data sets to another volume or release space from data sets that might be over-allocated.
If VALUE S3_DASD_Volume_Performance.Response_Time GE 55 AND VALUE S3_DASD_Volume_Performance.I/O_Count GE 25
Monitors for the condition where response time for the volume exceeds the critical threshold. Look at the volume to see if high utilization is a problem. If so, it might be necessary to migrate data sets from the volume to reduce utilization. Also check the cache status of the volume. Look at the components of I/O to determine where the time is being spent and address the problem accordingly.
If VALUE S3_DASD_Volume_Performance.Response_Time GE 35 AND VALUE S3_DASD_Volume_Performance.Response_Time LT 55 AND VALUE S3_DASD_Volume_Performance.I/O_Count GE 25
Monitors for the condition where response time for the volume exceeds the warning threshold. Look at the volume to see whether high utilization is a problem. If so, you can migrate data sets from the volume to reduce utilization. Also check the cache status of the volume. Look at the components of I/O to determine where the time is being spent and address the problem accordingly.
If VALUE S3_VTS_Overview.Virtual_Disconnect_Time GE 500
Monitors for the condition where the logical control unit disconnect time for the virtual tape server exceeds the critical threshold. This condition is often an indication that the tape volume cache capacity is being exceeded.
If VALUE S3_VTS_Overview.Host_Channel_Activity_GB GE 18
Monitors for the condition where the activity between the MVS™ system and the virtual tape server on the host channels reaches or exceeds 18 GB over the hour interval. This condition can be an indication that the virtual tape server is being overloaded.
If VTSTPVOLC.PCTCPT GT 50
Monitors for the condition where copy is the predominant reason for throttling.
If VTSTPVOLC.PCTWROT GT 50
Monitors for the condition where write overrun is the predominant reason for throttling.
If VALUE S3_VTS_Overview.Volume_Recall_Percent GE 20
Monitors for the condition where the percent of virtual tape mounts that required a physical tape mount to be satisfied exceeded the warning threshold. This condition can lead to unacceptably large virtual mount times. If so, then investigate the reason for the recalls. If rescheduling or removing the application workload is not possible, you need to increase the cache capacity of the VTS.
If VALUE S3_VTS_Overview.Average_Virtual_Mount_Pend_Time GE 300
Monitors for the condition where the average seconds required to satisfy a virtual mount in the virtual tape subsystem exceeded the warning threshold. If this condition persists, then further study is required to determine the cause for the elongated mount times. The condition might be due to VTS-hostile applications or to a shortage of VTS resources.
If VALUE S3_VTS_Overview.Maximum_Virtual_Mount_Pend_Time GE 900
Monitors for the condition where the maximum seconds required to satisfy a virtual mount in the virtual tape subsystem exceeded the warning threshold. If this condition persists, then further study is required to determine the cause for the elongated mount times. The condition might be due to VTS-hostile applications or to a shortage of VTS resources.