A layered product on the DYNIX/ptx® operating system, ptx/SESMON enables basic remote monitoring of peripheral devices that support the SCSI Enclosure Services (SES) protocol. Currently, the only device monitored by ptx/SESMON is the CLARiiON Disk Array Enclosure (DAE) subsystem, a ten-slot JBOD ("just a bunch of disks") array directly connected to a Fibre Channel Host Adapter of a NUMA-Q quad.
ptx/SESMON allows you to display the status of your DAE unit and its field-replaceable components by entering the sesmonid command at the DYNIX/ptx command line. In addition, ptx/SESMON provides a daemon, sesmond, that polls all SES enclosures attached to your NUMA-Q system at configurable intervals. When sesmond detects an error or its correction, it writes a message to the DYNIX/ptx Error Event Subsystem (EES) and/or to ktlog.
ptx/SESMON V1.1.1 supports multi-port connection between the DAE unit and the NUMA-Q host system.
ptx/SESMON supports only remote monitoring, not remote control, of the DAE subsystem.
ptx/SESMON V1.1.1 requires the following system environment:
DYNIX/ptx V4.5.1 or greater
A one- or two-quad NUMA-Q system with FC direct connection between the DAE storage subsystem and an Emulex® Firefly or Superfly Fibre Channel host adapter.
Emulex FC host adapter LP6000 (Firefly) (V2, V3) with firmware FF2.23, or LP7000 (Superfly) (V4) with firmware SF2.23.
In addition, we recommend that you install ptx/EES V1.1.0 or greater, in order to obtain the ptx/SESMON event messages in their most usable form.
The DAE storage subsystem and ptx/SESMON are not supported on Symmetry® systems.
The following problems have been fixed for ptx/SESMON V1.1.1.
Install ptx/SESMON with the procedures described in the DYNIX/ptx® V4.5.1 and Layered Products Software Installation Release Notes. For information on installing the DAE units that ptx/SESMON monitors, see the NUMA-Q Installation Guide for CLARiiON® Disk Arrays.
ptx/SESMON provides two categories of monitoring features. You can display the status of selected devices within selected DAE units by using the sesmonid command at the DYNIX/ptx system prompt. Independently of the sesmonid command, the sesmond daemon polls, by default, all DAE units on your system and their component devices. If you have installed ptx/EES, the results of this polling are written to the /var/ees/eeslog file on your system, as well as to ktlog. You can configure this polling process by editing the startup script for ptx/SESMON, /etc/init.d/sesmon (see Section 1.2.2, "The sesmond Daemon").
ATTENTION Be sure to stop all SES monitoring activity before configuring or deconfiguring a device on the host system with devctl, or before downloading disk firmware to the DAE unit with fwdl or another utility with the same purpose. After these activities have been completed, restart the sesmond daemon by invoking the startup script:
# sh /etc/init.d/sesmon start
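The stop, maintain, restart sequence described above can be wrapped in a small shell function so that the daemon is restarted even when the maintenance command fails. This is only a minimal sketch and is not part of ptx/SESMON: the function name with_sesmon_stopped and the SESMON_INIT variable are hypothetical, and the sketch assumes that the startup script accepts the stop and start arguments and returns zero on success.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ptx/SESMON): run one maintenance
# command with SES monitoring stopped, then restart the sesmond daemon.
# SESMON_INIT is an assumed override; it defaults to the startup script
# named in this document.

with_sesmon_stopped() {
    init=${SESMON_INIT:-/etc/init.d/sesmon}

    # Stop all SES monitoring before touching the device.
    # Assumption: the script returns zero when the stop succeeds.
    sh "$init" stop || return 1

    # Run the maintenance command given as arguments; remember its status.
    "$@"
    status=$?

    # Restart the daemon whether or not the maintenance command succeeded.
    sh "$init" start || return 1
    return "$status"
}
```

To use it, type with_sesmon_stopped followed by the devctl or fwdl command line that you would otherwise have entered directly. The function returns the exit status of that command.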
The sesmonid command displays the status of the DAE units connected to your host system, as well as the status of each of their components. sesmonid also indicates whether a component (such as a fan or disk drive) is installed.
Table 1-1 shows the possible sesmonid options and describes their effects:
sesmonid Argument | Behavior of sesmonid |
No argument | Displays status of all components within all DAE units connected to the host system. |
-d device_name | Displays status of all components within the DAE unit that contains the component called device_name. |
-x device_name | Does not display or log status for DAE unit containing device_name. |
-X device_name | Does not use device_name to retrieve data or monitoring information from the DAE unit. |
For example, suppose that you run sesmonid on a system with two DAE units connected over two Fibre Channel loops. You want to monitor all devices within both units (the default sesmonid behavior), so you run sesmonid without arguments. You will see the following output if the DAE unit with enclosure address 0 on Fibre Channel fabric 0 has only one of its two power supplies installed and contains a faulty disk drive in slot 6.
# sesmonid
Status of DAE on fabric0 with address 0 LCC version 003:
disk sd25 in slot 0 is installed
disk sd24 in slot 1 is installed
disk sd23 in slot 2 is installed
disk sd22 in slot 3 is installed
disk sd21 in slot 4 is installed
disk sd20 in slot 5 is installed
disk sd19 in slot 6 is installed disk fault
disk sd18 in slot 7 is installed
disk sd17 in slot 8 is installed
disk sd16 in slot 9 is installed
power supply A is installed
power supply B is not installed
cooling element is installed
LCC A is installed
LCC B is installed
Status of DAE on fabric1 with address 1 LCC version 003:
disk sd12 in slot 0 is installed
disk sd11 in slot 1 is installed
disk sd10 in slot 2 is installed
disk sd9 in slot 3 is installed
disk sd8 in slot 4 is installed
disk sd7 in slot 5 is installed
disk sd6 in slot 6 is installed
disk sd5 in slot 7 is installed
disk sd4 in slot 8 is installed
disk sd3 in slot 9 is installed
power supply A is installed
power supply B is installed
cooling element is installed
LCC A is installed
LCC B is installed
If the DAE with address 1 were dual-ported, its first status line might look like this:
Status of DAE on: fabric2 and fabric3 with address 1 LCC version 003:
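On a system with many enclosures, you may want to see only the components that need attention. The following is a minimal sketch of such a filter and is not part of ptx/SESMON: the function name sesmon_problems is hypothetical, and the patterns it matches ("Status of DAE", "is not installed", "fault") are assumptions taken from the example output above rather than from a documented output format.

```shell
#!/bin/sh
# Hypothetical filter (not part of ptx/SESMON): read sesmonid output on
# standard input and print only components that are missing or faulty,
# each prefixed with the "Status of DAE ..." line of its enclosure.
# The patterns are assumptions based on the example output in this document.

sesmon_problems() {
    awk '
        # Remember the most recent enclosure header line.
        /^[ \t]*Status of DAE/ {
            sub(/^[ \t]+/, "")
            header = $0
            next
        }
        # Report components that are not installed or that show a fault.
        /is not installed/ || /fault/ {
            sub(/^[ \t]+/, "")
            print header " " $0
        }
    '
}
```

Run as sesmonid | sesmon_problems against the two-enclosure example, the filter would print two lines, both for the DAE on fabric0: one for the faulty disk sd19 in slot 6 and one for the missing power supply B.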
By default, every 30 seconds the sesmond daemon polls all devices within all DAE units connected to your host system. You can modify the polling and reporting intervals by editing the ptx/SESMON startup script, /etc/init.d/sesmon.
ATTENTION You must stop the sesmond daemon and restart it after you have modified the /etc/init.d/sesmon file.
The following lines of this script, edited as shown, change the polling interval from the default 30 seconds to 60 seconds (-p 60). The daemon first logs an event only after it has been detected for three polling periods (-t 3). From that point on, it logs critical messages at every polling period and logs noncritical messages every 120 polling periods (-i 120).
/bin/echo "Starting $SESNAME"
if [ -f $SESMOND ]; then
        $SESMOND -p 60 -i 120 -t 3
fi
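The arithmetic behind these settings can be sketched as follows. This is an illustration only: the function name sesmon_timing is hypothetical, and the figures assume that each polling period lasts exactly the number of seconds given with -p, so treat the results as approximate.

```shell
#!/bin/sh
# Hypothetical calculator (not part of ptx/SESMON): turn sesmond's -p, -t
# and -i values into approximate wall-clock times, assuming each polling
# period lasts exactly poll_interval seconds.

sesmon_timing() {
    poll=$1        # -p: seconds between polls
    threshold=$2   # -t: polling periods before an event is first logged
    interval=$3    # -i: polling periods between noncritical messages, or -1

    echo "first log after about $((poll * threshold)) seconds"
    if [ "$interval" -eq -1 ]; then
        echo "noncritical messages logged once"
    else
        echo "noncritical messages every $((poll * interval)) seconds"
    fi
}

# Values from the edited startup script: -p 60 -t 3 -i 120
sesmon_timing 60 3 120
```

With these values the sketch reports about 180 seconds before a fault is first logged and 7200 seconds (two hours) between repeated noncritical messages.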
To modify this polling arrangement so that noncritical messages are logged only once, edit the code as follows:
/bin/echo "Starting $SESNAME"
if [ -f $SESMOND ]; then
        $SESMOND -p 60 -i -1 -t 3
fi
While it is possible to edit the /etc/init.d/sesmon script to restrict the DAE units and devices that ptx/SESMON monitors, you are usually better off not doing so. You probably want these exclusions only temporarily, whereas /etc/init.d/sesmon, once it is edited to exclude certain devices, will continue to make ptx/SESMON ignore them until you re-edit the script to undo the exclusions. Instead of editing the script, first halt the daemon:
# sh /etc/init.d/sesmon stop
Then run /usr/bin/sesmond -X from the command line, as follows:
# /usr/bin/sesmond -X device_n -X device_m
in which device_n and device_m are the names of the devices you wish to exclude from polling. The next time /etc/init.d/sesmon starts up the daemon, sesmond will ignore these temporary exclusions and run according to the arguments contained in the startup script.
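When several devices must be excluded, the command line can be assembled from a list of device names. The following is a minimal sketch, not part of ptx/SESMON: the function name sesmond_exclude_cmd is hypothetical, and sd19 and sd18 are sample device names borrowed from the sesmonid example. The sketch prints the command instead of running it, so that you can review it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of ptx/SESMON): build the command line for
# a temporary sesmond run that excludes the named devices with -X.
# It prints the command instead of running it, so it can be reviewed first.

# Full pathname, so that the shutdown script can later find the process.
SESMOND=/usr/bin/sesmond

sesmond_exclude_cmd() {
    cmd=$SESMOND
    # Append one -X option per device name given as an argument.
    for device in "$@"; do
        cmd="$cmd -X $device"
    done
    echo "$cmd"
}

# Sample device names only; substitute the devices you need to exclude.
sesmond_exclude_cmd sd19 sd18
```

The sample call prints /usr/bin/sesmond -X sd19 -X sd18. Remember to stop the running daemon with the startup script before you start sesmond by hand.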
Table 1-2 summarizes all the possible sesmond arguments and gives recommendations as to whether each should be implemented in the startup script or at the command line.
ATTENTION When you start the sesmond daemon manually at the command line, type the full pathname /usr/bin/sesmond; otherwise, the daemon cannot be killed by the shutdown script, which only recognizes the process by the name /usr/bin/sesmond.
Command Argument with sesmond | Effect of the Command Argument | Recommended Implementation |
-d device | Only monitor DAE units that include the disk named device. Devices specified using -d are selected before any exclusions. By default, all DAE units are monitored. | Command line |
-x device | Do not monitor the DAE unit that contains the disk named device. Do not monitor any other device within that DAE unit. | Command line |
-X device | Do not monitor the disk named device or include it in the path to monitor the DAE unit's status. Continue to monitor other devices within the same DAE unit. | Command line |
-i message_interval | Log noncritical faults each time that the number of polling periods designated by message_interval passes. For example, if the polling period is 30 seconds and the message_interval is 20, noncritical messages will be logged every 20 polling periods, that is, every 10 minutes. | Edit /etc/init.d/sesmon |
-i -1 | Log noncritical faults only once. | Edit /etc/init.d/sesmon |
-p poll_interval | Poll for status of the DAE and its devices every poll_interval seconds. The default value is 30 seconds. | Edit /etc/init.d/sesmon |
-t threshold | Log an event only after a failure has been seen for threshold polling periods. | Edit /etc/init.d/sesmon |