Chapter 1
ptx/SESMON V1.0.1 Release Notes


1.1 About This Release


1.1.1 Introduction

A layered product on the DYNIX/ptx® operating system, ptx®/SESMON enables basic remote monitoring of peripheral devices that support the SCSI Enclosure Services (SES) protocol. Currently, the only device monitored by ptx/SESMON is the CLARiiON® Disk Array Enclosure (DAE) subsystem, a ten-slot JBOD ("just a bunch of disks") array directly connected to the Fibre Channel via an arbitrated loop.

ptx/SESMON allows you to display the status of your DAE unit and its field-replaceable components by entering the sesmonid command at the DYNIX/ptx command line. In addition, ptx/SESMON provides a daemon, sesmond, that polls all SES enclosures attached to your NUMA-Q™ system at configurable intervals. When sesmond detects an error or its correction, it writes a message to the DYNIX/ptx Error Event Subsystem (EES) and/or to ktlog.

ptx/SESMON supports only remote monitoring, not remote control, of the DAE subsystem.



1.1.2 Changes Since the Last Release

ptx/SESMON V1.0.1 supports the following disk drive models:


1.1.3 Product Compatibility

ptx/SESMON V1.0.1 requires the following system environment:

In addition, we recommend that you install ptx/EES V1.0.1 or greater to obtain the ptx/SESMON event messages in their most usable form.

The DAE storage subsystem and ptx/SESMON are not supported on Symmetry® systems.


1.1.4 Installing ptx/SESMON

Install ptx/SESMON with the procedures described in the DYNIX/ptx V4.4.8 and Layered Products Software Installation Release Notes. For information on installing the DAE units that ptx/SESMON monitors, see the NUMA-Q Installation Guide for CLARiiON Disk Arrays.


1.2 Using ptx/SESMON to Monitor DAE Units

ptx/SESMON provides two categories of monitoring features. You can display the status of selected devices within selected DAE units by using the sesmonid command at the DYNIX/ptx system prompt. Independently of the sesmonid command, the sesmond daemon by default polls all DAE units on your system and their component devices. If you have installed ptx/EES, the results of this polling are written to the /var/ees/eeslog file on your system, as well as to ktlog. You can configure this polling process by editing the startup script for ptx/SESMON, /etc/initd/sesmon (see Section 1.2.2, "The sesmond Daemon").


ATTENTION

Be sure to stop all SES monitoring activity before configuring or deconfiguring a device on the host system with devctl, or before downloading disk firmware to the DAE unit with fwdl or another utility with the same purpose. After these activities have been completed, restart the sesmond daemon by invoking the startup script:

# sh /etc/initd/sesmon start
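
The stop, maintain, and restart sequence described in this notice can be wrapped in a small shell function so that the restart is not forgotten. This is only a sketch: the function name with_sesmon_stopped and the SESMON_INIT variable are hypothetical, and the sketch assumes that the startup script accepts a stop argument in the same way that it accepts start (Section 1.2.2 uses it that way).

```shell
# Hypothetical helper. SESMON_INIT defaults to the startup script named above.
SESMON_INIT=${SESMON_INIT:-/etc/initd/sesmon}

with_sesmon_stopped() {
    # Stop SES monitoring first; give up if the script reports failure.
    sh "$SESMON_INIT" stop || return 1
    # Run the maintenance command that was passed as arguments.
    "$@"
    status=$?
    # Restart the daemon whether or not the maintenance command succeeded.
    sh "$SESMON_INIT" start
    # Report the exit status of the maintenance command to the caller.
    return $status
}
```

To use it, pass the maintenance command and its own arguments after the function name, for example the devctl or fwdl invocation you would otherwise have typed directly.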


1.2.1 The sesmonid Command

The sesmonid command displays the status of the DAE units connected to your host system, as well as the status of each of their components. sesmonid also indicates whether a component (such as a fan or disk drive) is installed.

Table 1-1 shows the possible sesmonid options and describes their effects:

Table 1-1. sesmonid Command Options

sesmonid Argument    Behavior of sesmonid
-----------------    ----------------------------------------------------------
No argument          Displays status of all components within all DAE units
                     connected to the host system.

-d device_name       Displays status of all components within the DAE unit
                     that contains the component called device_name.

-x device_name       Does not display or log status for DAE unit containing
                     device_name.

-X device_name       Does not use device_name to retrieve data or monitoring
                     information from the DAE unit.


For example, suppose that you run sesmonid on a system with two DAE units connected to it over two arbitrated loops. You want to monitor all devices within both units (the default sesmonid behavior), so you run sesmonid without arguments. If the DAE unit with enclosure address 0 on Fibre Channel fabric 0 has only one of its two power supplies installed and contains a faulty disk drive in slot 6, you will see the following output:

# sesmonid 
Status of DAE on fabric0 with address 0 version 003:
disk sd25 in slot 0 is installed
disk sd24 in slot 1 is installed
disk sd23 in slot 2 is installed
disk sd22 in slot 3 is installed
disk sd21 in slot 4 is installed
disk sd20 in slot 5 is installed
disk sd19 in slot 6 is installed disk fault
disk sd18 in slot 7 is installed
disk sd17 in slot 8 is installed
disk sd16 in slot 9 is installed
power supply A is installed
power supply B is not installed
cooling element is installed
LCC A is installed
LCC B is installed
Status of DAE on fabric1 with address 1 version 003:
disk sd12 in slot 0 is installed
disk sd11 in slot 1 is installed
disk sd10 in slot 2 is installed
disk sd9 in slot 3 is installed
disk sd8 in slot 4 is installed
disk sd7 in slot 5 is installed
disk sd6 in slot 6 is installed
disk sd5 in slot 7 is installed
disk sd4 in slot 8 is installed
disk sd3 in slot 9 is installed
power supply A is installed
power supply B is installed
cooling element is installed
LCC A is installed
LCC B is installed
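
On a system with many enclosures, it can help to reduce this listing to the lines that need attention. The following filter is a sketch that assumes the output format is exactly as in the example above: each enclosure begins with a "Status of DAE" line, missing components contain the words "not installed", and faulty components contain the word "fault". The function name report_ses_problems is hypothetical, and other fault wordings, if any exist, would not be caught.

```shell
# Hypothetical filter: reads sesmonid output on standard input.
report_ses_problems() {
    awk '
        # Remember the most recent enclosure header line.
        /^Status of DAE/ { header = $0; next }
        # Select components that are missing or that report a fault.
        /not installed/ || /fault/ {
            # Print the enclosure header once, before its first problem line.
            if (header != "") { print header; header = "" }
            print "  " $0
        }
    '
}
```

Run as `sesmonid | report_ses_problems`. For the example output above, the filter would keep the fabric0 header, the sd19 line, and the "power supply B is not installed" line, and would print nothing for the second enclosure.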

1.2.2 The sesmond Daemon

By default, every 30 seconds the sesmond daemon polls all devices within all DAE units connected to your host system. You can modify the polling and reporting intervals by editing the ptx/SESMON startup script, /etc/initd/sesmon.


ATTENTION

You must stop the sesmond daemon and restart it after you have modified the /etc/initd/sesmon file.


The following lines of the script have been edited to change the polling interval from the default 30 seconds to 60 seconds (-p 60). With these settings, sesmond first logs an event only after it has been detected for three polling periods (-t 3). From that point on, it logs critical messages at every polling period and logs noncritical messages every 120 polling periods (-i 120).

        /bin/echo "Starting $SESNAME"
        if [ -f $SESMOND ]; then
                $SESMOND -p 60 -i 120 -t 3
        fi
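
Because -t and -i are expressed in polling periods rather than seconds, it can be useful to convert them to elapsed time. The following arithmetic is a sketch using the values from the example above; the variable names are illustrative only, and the results are approximate because a fault can arise at any point within a polling period.

```shell
# Values taken from the example line: -p 60 -i 120 -t 3
poll=60        # seconds between polls (-p)
interval=120   # polling periods between repeated noncritical messages (-i)
threshold=3    # polling periods a fault must persist before it is first logged (-t)

# Convert polling periods to seconds.
first_log=$((poll * threshold))
repeat=$((poll * interval))

echo "first logged after about $first_log seconds"
echo "noncritical messages repeat every $repeat seconds ($((repeat / 60)) minutes)"
```

With these values, a fault is first logged after about 180 seconds, and a noncritical message is repeated every 7200 seconds (120 minutes).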

To modify this polling arrangement so that noncritical messages are logged only once, edit the code as follows:

        /bin/echo "Starting $SESNAME"
        if [ -f $SESMOND ]; then
                $SESMOND -p 60 -i -1 -t 3
        fi

While it is possible to edit the /etc/initd/sesmon script to restrict the DAE units and devices that ptx/SESMON monitors, you are usually better off not doing so. Such exclusions are usually wanted only temporarily, whereas exclusions written into /etc/initd/sesmon remain in effect until you re-edit the script to remove them. Instead of editing the script, first halt the daemon:

# sh /etc/initd/sesmon stop

Then run /usr/bin/sesmond with the -X argument from the command line, as follows:

# /usr/bin/sesmond -X device_n -X device_x

Here, device_n and device_x are the names of the devices you wish to exclude from polling. The next time /etc/initd/sesmon starts the daemon, sesmond will ignore these temporary exclusions and run according to the arguments contained in the startup script.
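
When several devices must be excluded, the repeated -X arguments can be generated from a list of device names. The following sketch only prints the command line so that you can inspect it before running it yourself. The function name sesmond_exclude_cmd is hypothetical, and the device names in the usage note are placeholders.

```shell
# Full pathname of the daemon, as used in the command above.
SESMOND=${SESMOND:-/usr/bin/sesmond}

# Hypothetical helper: print a sesmond command line with one -X per device.
sesmond_exclude_cmd() {
    cmd=$SESMOND
    for dev in "$@"; do
        cmd="$cmd -X $dev"
    done
    echo "$cmd"
}
```

For example, `sesmond_exclude_cmd sd19 sd18` prints `/usr/bin/sesmond -X sd19 -X sd18`.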

Table 1-2 summarizes all the possible sesmond arguments and gives recommendations as to whether each should be implemented in the startup script or at the command line.


ATTENTION

When you start the sesmond daemon manually at the command line, type the full pathname /usr/bin/sesmond; otherwise, the daemon cannot be killed by the shutdown script, which only recognizes the process by the name /usr/bin/sesmond.
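
To check whether a manually started daemon is visible under its full pathname, you can filter a process listing. This is a sketch under stated assumptions: it assumes that your ps command (for example, ps -ef) prints the command line with which each process was started, and the function name find_sesmond_by_fullpath is hypothetical. These notes do not describe how the shutdown script itself locates the process.

```shell
# Hypothetical filter: reads a process listing on standard input and keeps
# only the lines that contain the full pathname /usr/bin/sesmond.
find_sesmond_by_fullpath() {
    # The [/] bracket expression keeps the pattern from matching
    # the grep command's own entry in the process listing.
    grep '[/]usr/bin/sesmond'
}
```

Run as `ps -ef | find_sesmond_by_fullpath`. A daemon that was started simply as sesmond, without the pathname, would not appear in the result.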


Table 1-2. Implementations of sesmond Command Arguments

Command Argument       Effect of the Command Argument                     Recommended
with sesmond                                                              Implementation
--------------------   ------------------------------------------------   ----------------------
-d device              Only monitor DAE units that include the disk       Command line
                       named device. Devices specified using -d are
                       selected before any exclusions. By default, all
                       DAE units are monitored.

-x device              Do not monitor the DAE unit that contains the      Command line
                       disk named device. Do not monitor any other
                       device within that DAE unit.

-X device              Do not monitor the disk named device or include    Command line
                       it in the path to monitor the DAE unit's status.
                       Continue to monitor other devices within the
                       same DAE unit.

-i message_interval    Log noncritical faults each time that the number   Edit /etc/initd/sesmon
                       of polling periods designated by
                       message_interval passes. For example, if the
                       polling period is 30 seconds and
                       message_interval is 20, noncritical messages
                       will be logged every 20 polling periods, that
                       is, every 10 minutes.

-i -1                  Log noncritical faults only once.                  Edit /etc/initd/sesmon

-p poll_interval       Poll for status of the DAE and its devices every   Edit /etc/initd/sesmon
                       poll_interval seconds. The default value is 30
                       seconds.

-t threshold           Log an event only after a failure has been seen    Edit /etc/initd/sesmon
                       for threshold polling periods.