These release notes support ptx®/SVM V2.3.1, which is a powerful tool for managing disk-drive usage and disk space. Read this document before you install or run this release of ptx/SVM.
This version of ptx/SVM can be used with the following products:
See the DYNIX/ptx and Layered Products Software Installation Release Notes for further software product compatibility information.
ptx/SVM V2.3.x includes the following new features:
This release supports the use of ptx/SVM to manage shared storage on clusters containing more than two nodes. All previous restrictions on using ptx/SVM in clusters of three or four nodes no longer apply.
This release supports a new command, vxolr, that performs most of the steps needed to remove and replace a disk that is under ptx/SVM control and to restore the ptx/SVM configuration to the replacement disk. This procedure applies to ptx/SVM disks on clustered and non-clustered systems. See "Using vxolr to Perform Online Disk Replacement" later in these release notes for more information on this new feature.
You cannot directly upgrade to DYNIX/ptx V4.6.x and ptx/SVM V2.3.x if you are currently running DYNIX/ptx V4.2.x or lower or are currently running on Symmetry hardware. If you are currently running DYNIX/ptx V4.2.x or lower, contact IBM Professional Services for help upgrading your system.
If you have already upgraded to ptx/SVM V2.0.x, V2.1.x, V2.2.x, or V2.3.0, you can install this release of ptx/SVM along with DYNIX/ptx and any other layered software products through a single installation process. The DYNIX/ptx and Layered Products Software Installation Release Notes tell how to use this process to install all DYNIX/ptx and layered products software.
You do not need to deinstall ptx/SVM before installing a new version of the software. Should you wish to deinstall ptx/SVM, however, note that the automatic deinstallation process through the ptx/ADMIN menu system will fail because the root and swap volumes are in use. To ensure that the deinstallation succeeds, follow these steps before deinstalling ptx/SVM through ptx/ADMIN:
ATTENTION Issuing the following command will disable all volumes. Make sure your system does not depend on any volumes for filesystems, swap space, or other uses that your system requires to boot.
Prevent vxconfigd from starting during the boot process:
# touch /etc/vx/reconfig.d/state.d/install-db
Disable the use of early-access volumes on the next reboot:
# bp /unix vol_disable_early_access_volumes 1
Reboot the system.
The following sections discuss important usage considerations with this release of ptx/SVM.
Because the ptx/SVM menu system, vxdiskadm, currently fails to work with shared disks (Problem Report #236943) and does not properly perform OLR of sliced and simple disks (Problem Report #242863), IBM recommends that you not use vxdiskadm for shared or private operations. Specifically, do not use vxdiskadm options (5) Remove a disk for replacement, (6) Replace a failed or removed disk, and (15) OLR. To replace a failed or failing disk, use instead either the manual disk-replacement procedures, which are described in the DYNIX/ptx System Recovery and Troubleshooting Guide, or the vxolr utility, which is described later in these release notes in "Using vxolr to Perform Online Disk Replacement" and in the DYNIX/ptx System Recovery and Troubleshooting Guide.
ptx/SVM V2.x requires that every shareable spindle placed in a private disk group have at least one private region. A shareable disk is one that is on the Fibre Channel and could be shared among nodes. Because nopriv disks contain no private areas, shareable spindles of type nopriv cannot be added to private disk groups.
In order to place a shareable nopriv spindle under ptx/SVM control, you will have to re-partition the disk to create a type-8 slice on it for the private configuration database area. To determine if a disk is shareable, examine the output of the dumpconf -d command. Look for an "S" in the fifth column of the output. "S" signifies that a device is shareable.
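As an illustration only (a sketch assuming the five-column dumpconf -d output layout described above), you could list the shareable devices by filtering on the fifth column:
# dumpconf -d | awk '$5 == "S"'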
DYNIX/ptx V4.6.x and later no longer use the /etc/devtab file to store information that devbuild uses to build virtual devices on disks. Instead, the information is stored in the device naming database; the CMPT virtual device is obsolete. In ptx/CLUSTERS V2.3.x and later and ptx/SVM V2.3.x and later, you need only build the VTOC driver for a disk on one node; the naming database will propagate the information to the remaining nodes and the ptx/SVM shared disk groups will match across the cluster.
Object manipulation performance in private disk groups in a cluster is significantly better than in shared disk groups. Use shared disk groups only when absolutely necessary (for data that must be shared among cluster nodes) to avoid incurring a performance penalty.
ptx/SVM lets you use the vxdg -k rmdisk medianame command to remove a disk media record even if the volume containing the disk is active. This operation is not recommended, especially if the data to be removed is not mirrored elsewhere. Should you use the vxdg -k rmdisk medianame command, use extra caution if the disk belongs to the root volume, the primary swap volume, or any secondary swap volume. (The root volume and primary swap volume are named ROOTVOL and SWAPVOL by default but may be renamed; secondary swap volumes have no default names and thus are not easily recognized by name alone.) If you issue the command on a disk media record that is part of non-mirrored swap or a secondary swap volume, the machine will crash or fail over.
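Before removing a disk media record, it is prudent to confirm which objects depend on it. A minimal sketch, using a hypothetical media name disk01 in rootdg:
# vxprint -g rootdg -ht | grep disk01
# vxdg -g rootdg -k rmdisk disk01
The vxprint output shows the subdisks carved from disk01 and the plexes and volumes above them; run the rmdisk command only after confirming that the affected data is mirrored elsewhere.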
The vxolr tool automates much of the OLR process for failing and failed disks that are under ptx/SVM control. vxolr prompts you throughout the OLR process for information and, if you specify verbose mode, reports to the screen each command that it runs to perform the OLR. Syntax for vxolr is:
vxolr [-l logfilename] [-R] [-v] disk
ATTENTION The vxolr command can only be run by the root user.
The -l option logs the OLR process to the file you specify. Otherwise, a default log file, /etc/vx/olr.log, is created automatically.
The -R option specifies online removal of the disk only, which cleanly removes ptx/SVM references to the disk before it is taken offline; it does not perform the online insertion operations.
The -v option specifies verbose mode, where all commands vxolr executes during the procedure are displayed on the screen (this option is recommended).
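For example, a hypothetical invocation that replaces disk sd3 in verbose mode and logs to a file of your choosing (both the disk name and the log path are placeholders):
# vxolr -v -l /var/tmp/olr_sd3.log sd3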
The vxolr OLR process pertains only to disks that are under ptx/SVM control. (Disks or slices of disks that are not under ptx/SVM control cannot be replaced through vxolr because they may contain objects, such as raw databases or filesystems mounted on raw devices, that ptx/SVM cannot locate; vxolr therefore cannot guarantee that removing the disk will not result in data loss.)
The following restrictions apply when using vxolr to remove and restore a ptx/SVM disk:
vxolr will not work on disks or disk slices that are not under ptx/SVM control.
You can use vxolr to OLR disks in private disk groups, shared disk groups, failover disk groups, and the root disk group; however, you cannot OLR a bootable, mirrored disk. Before performing OLR on a bootable disk you must change the bootflags to point to an alternate boot disk.
You can perform OLR on disks containing the root volume and the primary swap volume, but only if the disks are mirrored.
If a volume located on a disk that you wish to OLR is in a recovery state or is opened by another ptx/SVM utility, vxolr will report that volumes are busy and will exit. You will need to restart vxolr later, once the volumes are no longer busy (a pre-check sketch follows this list).
If a volume located on a disk that you wish to OLR contains a mounted filesystem, then vxolr will report that the volume is busy. You will need to unmount the filesystem before proceeding.
You can issue only one vxolr operation at a time on a system.
In a cluster, you cannot perform two concurrent OLR operations on shared disk groups; however, you can concurrently OLR a shared disk and a private disk if the operations run on different nodes (not on the same node).
All cluster nodes must be running DYNIX/ptx V4.6.0 or later, ptx/SVM V2.3.0 or later, and ptx/CLUSTERS V2.3.0 or later in order for vxolr to work properly. The vxolr OLR process will not work during cluster rolling upgrades.
vxolr will not let you use a replacement disk that is smaller than the original disk.
If the disk to be replaced is the last disk in a disk group, prepare to put another sliced or simple disk into the disk group before the OLR operation so that there will be at least one copy of the configuration database remaining in the disk group.
The cluster membership should not change during the OLR process. A cluster membership change during an OLR can considerably lengthen the time the OLR takes to complete.
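Before starting vxolr, you can check for the busy-volume conditions described above. A rough sketch (diskgroup is a placeholder, and /dev/vx is an assumption about where ptx/SVM volume devices reside):
# vxprint -g diskgroup -ht
# mount | grep /dev/vx
The vxprint output shows volume states (look for recovery activity); the mount check reveals filesystems that must be unmounted before the OLR can proceed.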
The following list identifies the possible ways in which objects can be affected by disk removal during the vxolr process:
A volume will lose data. The volume that resides on the disk to be removed is not mirrored.
A volume will no longer be mirrored. Once the failing disk is removed, a volume will lose its only mirror.
Dirty-region logging will be disabled. The failing disk contains the system's only dirty-region log, so dirty-region logging will be disabled while the failing disk is removed.
A hot-spare disk will not be mirrored. The failing disk contains the only mirror of a hot-spare disk, so the hot-spare disk will be unmirrored while the failing disk is removed.
The last complete active plex will be lost. Data will be lost when the failing disk containing the last complete active plex is replaced.
The vxolr script warns you if any of these scenarios will occur and gives you the option to exit the OLR procedure so that you can, for example, create a mirror for an unmirrored volume on the failing disk.
The vxolr procedure will exit and not proceed with the disk OLR in the following scenarios:
When the disk to be replaced is not under ptx/SVM control.
When the disk to be replaced contains the root volume and the primary swap volume and is not mirrored.
When the disk to be replaced contains the root volume and the primary swap volume and will lose its only mirror during the OLR procedure.
When the disk to be replaced is the last disk in a disk group.
When the disk to be replaced is a shared disk in a cluster that is being used by one of the nodes.
When vxconfigd fails (intentionally or unintentionally) during the OLR of a shared disk.
When another OLR operation is in progress on the system or on another system in the cluster.
When the disk to be replaced is owned by another cluster.
When the system containing the disk to be replaced is in the midst of a rolling upgrade.
When you run vxolr with the verbose option (-v), all the commands the script executes are reported to the screen so that you can follow the OLR process and more easily diagnose any problems vxolr encounters. When you first issue vxolr -v disk, the command determines how the ptx/SVM configuration will be affected by the disk OLR. The following table lists the warnings and errors vxolr may encounter and the action required:
vxolr Message | Action/Explanation | Can You Choose to Ignore? |
vxolr: WARNING: The above volume will lose a mirror during processing. | The volume's only mirror will be lost during the OLR procedure. | Yes, but the volume will be unmirrored while vxolr runs and you run the risk of losing data should the disk containing the existing plex fail. |
vxolr: WARNING: DRL feature will be lost during processing. | The disk to be replaced contains the system's only dirty-region log. | Yes, but no dirty-region logging will be available until the OLR procedure is complete. |
vxolr: WARNING: Volume SWAPVOL is ROOTVOL/SWAPVOL. You cannot replace disk disk without mirroring ROOTVOL/SWAPVOL | The disk containing ROOTVOL and SWAPVOL is unmirrored; vxolr will not function unless it is mirrored. | No. |
Diskgroup diskgroup hot spare will not be available during processing | The disk to be replaced is the only hot-spare disk for the disk group listed. No hot spare will be available for the disk group during the OLR procedure. | Yes, if you choose to risk not having a hot-spare disk available for the duration of the OLR process. |
vxolr: WARNING: Volume volume will lose last complete active plex. | The volume listed will lose data because only a partial plex will remain active during the OLR procedure. | Yes, if you choose to risk losing data. |
vxolr: WARNING: The disk disk is the last disk in diskgroup diskgroup. | The disk you are attempting to replace is the last disk in the disk group and cannot be removed. | Yes, but you will have to deport the disk group first, or add another disk to the disk group and rerun vxolr. |
vxolr: ERROR: This disk is owned by another cluster. | Another cluster owns the disk you are attempting to replace. Only disks owned by the current cluster can be removed from the current system and cluster. | No. |
vxolr: ERROR: Another vxolr is in progress in the cluster. | Only one vxolr process on a shared disk group can be executed at a time in a cluster. | No. vxolr will exit and not let you continue until the cluster's current vxolr process is complete. |
vxolr: ERROR: The system is in a rolling upgrade. | A rolling upgrade is in progress on the system. | No. vxolr will exit and not let you continue until the rolling upgrade is complete. |
vxolr: ERROR: There is another vxolr running. | Only one vxolr operation can be running per node at one time. | No. vxolr will exit and not let you continue until the other vxolr operation completes. |
vxolr: ERROR: This device is not a valid device. | The vxolr procedure works only on valid devices known to DYNIX/ptx and ptx/SVM. | No. |
vxolr: ERROR: This disk is not under SVM control. | vxolr can be used only on disks that are under ptx/SVM control. | No. |
vxolr: WARNING: Last database configuration copy in diskgroup. | The only configuration database copy in the disk group will be lost during the OLR procedure. | Yes, or you can exit, add another sliced or simple disk to the disk group, and start again so that at least one configuration database copy remains during the OLR procedure. |
vxolr provides the following error-handling options if the OLR operation fails.
In the online removal phase, you can back out of the completed operations, resulting in the same set of devices and states that were present when the OLR operation was started. Doing so makes vxolr restartable immediately because all previous changes have been backed out.
If vxolr encounters an error while trying to back out the changes, then it does not attempt any further recovery. If the error left the operation in an unrestartable state, the recovery will need to be performed manually, possibly with the assistance of Customer Support. Although this is expected to be a rare occurrence, it is a possibility.
In the online insertion phase, you can complete the procedure despite an encountered error. vxolr will not allow the previous steps to be backed out (on the assumption that you would not want to remove the good replacement disk and reinsert the original bad disk).
In any phase, you can choose to open a shell, without exiting vxolr, to fix the problem immediately. When you exit the shell, the vxolr operation that failed will be attempted again.
In any phase, you can choose to exit vxolr without attempting to remedy the error.
A non-zero exit code is not a complete indicator of the problems encountered; instead, it denotes the first condition that prevented further execution of the utility.
The following documentation is available on the online documentation CD and at http://webdocs.numaq.ibm.com/:
ptx/SVM Administration Guide
ptx/SVM Error Messages
The ptx/SVM Quick Reference Card is available in hard copy only and is shipped with each order of ptx/SVM.
ATTENTION All ptx/SVM troubleshooting information, including manual OLR procedures, is now located in the DYNIX/ptx System Recovery and Troubleshooting Guide, which discusses all aspects of DYNIX/ptx-related troubleshooting. The ptx/SVM Administration Guide no longer contains ptx/SVM troubleshooting information.
This section lists the following problem report summaries:
The numbers in parentheses identify the problems in the problem-tracking system.
The following problems have been fixed in ptx/SVM V2.3.1.
(253796, 245708) The vxdg man page was not clear about what happens if the -h option of vxdg deport was used when disk group names were in conflict on a new host.
(253644) Memory for the svm_info structure was not released in some cases, causing memory leaks.
(250616) The volkprint routine output an incorrect device list if a shared device was open.
(250522) The function client_add() always returned the same value (0), yet the comments for the function indicated failure should return -1 and success should return 0.
(250456) ptx/SVM did not properly handle a master takeover during detach processing. Although serialization requests were reissued, detach requests sent to the new master returned errors because the new master did not have enough state to handle them.
(250436) Deadlock between vxstat and a vxconfigd operation occurred when vxstat was running on a shared volume and the cluster aborted.
(250412) vxconfigd hung in a msghead call.
(250270) vxconfigd deadlocked itself by holding the read copy of the volop_rwsleep lock while requesting the write copy of the lock during a transaction abort operation.
(247365) A missing disk media record caused the kernel to erroneously disable a shared disk group after the master transition.
(246974) klog update failures were ignored on shared disk groups.
(246898) The vxplex mv command could cause data loss, because the command did not check the lengths of the source and destination plexes.
(246525) When a klog error of any kind occurred during a master klog update, it was not reported in the kmsg response packet.
(243268) The same volume device could be opened multiple times exclusively because the volume device driver did not check the FEXCL flag for opens.
(242846) The system booted to multiuser mode with root and swap on the same partition. ptx/SVM no longer permits this.
(236450) An incorrect error message appeared when vxmake vol with type root failed.
(234848) vxassist created a volume with a length that is shorter than the length of the plex.
(231971) vxdisk gave the wrong usage message when vxdisk -f check was entered.
The following problems were fixed in ptx/SVM V2.3.0.
(252972) A join failed if vxconfigd died and was started after a transition.
(252481) Opening and closing of devices was inefficient and slow.
(251919) Examples in the vxmake man page were wrong; they referred to volmake instead of vxmake.
(251471) Unneeded calls were made to volsio_stabilize().
(251225) A locking problem was detected during shutdown.
(251187) Massive resilvering occurred on the master when the slave rebooted and failed to join the cluster. This resulted in failure of the master vxconfigd.
(250959) On a three-node cluster, when a shared disk was added to a shared disk group from one of the slaves, all three nodes hung.
(250813) The vxconfigd man page did not specify the default logfile name.
(250558) When an I/O error occurred on a DRL plex, the logging plex should have been detached; however, this was not happening.
(250465) A hang occurred in the dirty-region log, due to memory corruption.
(250401) The vxassist man page was not explicit enough in explaining how to remove a dirty-region log.
(249607) Read policies were incorrectly set to plex=0 after a reboot.
(248773) A confusing error message was returned after the volume usage type was changed to "root".
(244168, 254736) Volumes were left in SYNC state if plexes were offlined.
(241205) ptx/SVM did not realize a disk had changed when a private area on a spindle was reinitialized.
(240316) The vxdg free command issued on a slave node did not provide private disk group information.
(237191) vxvol start sometimes failed and returned no error message.
(227112) The maxsize and maxgrow keywords of vxassist were not listed in the man page.
(221620) Invalid values for loglen were ignored.
(221607) A volume, plex, or subdisk in rootdg prevented the autoimport of a disk group with the same name.
This section lists open problems in this release of ptx/SVM.
By default, vxconfigd sends errors to the console. It should also send errors to a log file or through syslog.
Workaround. None.
The vxmake command limits the number of objects that can be created in a single invocation. The formula for determining the limits depends on the names of the objects involved as well as the kinds of objects and the values of some other fields in the objects. If the request fails with the message Configuration too large, you should try splitting the description file into two files and running vxmake twice.
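A sketch of that workaround, using hypothetical file names: split the description file into two halves and run vxmake on each in turn.
# vxmake -d desc_file.part1
# vxmake -d desc_file.part2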
When determining the size of the configuration database (private area) of a disk, a good rule of thumb is to allow about 400 bytes for each object. A 20,000-object database, for example, would require a minimum of about 7,800 1024-byte blocks (20,000 × 400 bytes = 8,000,000 bytes, or roughly 7,813 blocks).
Workaround. If the configuration database is undersized and you receive the message Not enough space in database, the problem affects only the disk group containing the database. To enlarge the database, make sure that all copies of the database are enlarged: use the OLR procedure to remove the disks with small configuration areas and replace them with disks that have larger configuration areas. This may mean moving some data off the disk to make room for the larger private area.
If, while vxrecover is running, you attempt to deport disk groups on which vxrecover still intends to perform recovery, vxrecover returns messages saying that it cannot find volumes in the disk groups you are attempting to deport.
Workaround. Although the messages are harmless, you can avoid receiving them by not deporting disk groups until recovery is complete (when no volumes are in the SYNC or NEEDSYNC state).
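One way to confirm that recovery is complete before deporting (a sketch; diskgroup is a placeholder) is to check the volume state field reported by vxprint:
# vxprint -g diskgroup -v | egrep 'SYNC|NEEDSYNC'
If the command produces no output, no volumes in the disk group are still being recovered.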
The vxstat command is restricted for use by the root user only.
Workaround. None.
Mirrored volumes cannot detect tampering done to them outside of ptx/SVM control. Such tampering can occur when the underlying devices are modified before ptx/SVM is started, for example from the stand-alone kernel on NUMA systems with Direct-Connect interconnects. In the case of the root volume, remounting the root filesystem read/write from either the stand-alone kernel or single-user mode will undetectably put the volume out of sync, so that successive reads of the same area may return different results.
Workaround. Before tampering with any component of a ptx/SVM volume, modify the volume so that it is not mirrored. Never use the /stand/sci/mrootsau script (or any other method for remounting the root read/write) on a system with a mirrored root.
Automatic deinstallation of ptx/SVM (through the ptx/ADMIN® menu system) fails because root and swap volumes are in use.
Workaround. Before attempting to deinstall ptx/SVM, execute the commands touch /etc/vx/reconfig.d/state.d/install-db and bp /unix vol_disable_early_access_volumes 1 and reboot the system.
ATTENTION Issuing these commands will disable all volumes. Make sure your system does not depend on any volumes for filesystems, swap space, or other uses that your system requires to boot.
ptx/SVM allows you to attach a plex containing more than one subdisk to the root volume.
Workaround. Attach only one subdisk to a plex in the root volume.
The vxdiskadm option to offline a disk does not work.
Workaround. Use the devctl -d -D command to offline a disk.
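For example (sd5 is a placeholder device name, and the argument placement is an assumption; check the devctl man page for exact usage):
# devctl -d -D sd5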
When ptx/SVM is installed but not enabled, vxiod processes are started anyway.
Workaround. This is expected behavior and there is no workaround.
ptx/SVM support for dumpcfg (/etc/dumpcfg.d/vx) does not correctly handle disk groups other than rootdg.
Workaround. None.
The vxconfigd -k command may fail to reimport all disk groups.
Workaround. Import the disk groups manually.
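For example (diskgroup is a placeholder):
# vxdg import diskgroup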
If the disk underlying a subdisk fails and you try to create a plex with that subdisk using vxmake, vxconfigd will terminate and vxmake will hang.
Workaround. None.
The vxdiskadm OLR option fails on shared disks.
Workaround. None.
The following message may appear after a node goes down while a vxplex att operation is in progress, or if a vxplex or vxrecover operation is killed locally:
vxplex -g diskgroup det plex
vxvm:vxplex: ERROR: Volume volume is locked by another utility
Workaround. Clear the "tutil0" field of the volume and plex with the following commands:
# vxmend clear tutil0 volume
# vxmend clear tutil0 plex
When a mirrored volume is stopped with vxvol -o force stop, a resynchronization is started on the volume. If the volume is forced to stop again, the offset counter remains partway through the volume when the volume is restarted. The offset counter should be reset to zero.
Workaround. Do not use the -o force option to stop mirrored volumes. If you use the -o force option and the offset is not returned to zero, reboot the system to return the offset to zero.
Problems can arise when the vxconfigd -k command is issued on the slave node and, while the command is processing, a shared disk group is deported from the master. When the slave finishes with the new vxconfigd, it reports the shared disk group is disabled but imported. The master does not show the shared disk group as imported. You cannot deport the disk group because the master does not see it. You cannot import it again because the slave cannot see one or more of the disks in the disk group.
Workaround. Restart vxconfigd on the slave again.
If you force a disk into a disk group with vxdisk -f, but the disk previously belonged to a different disk group, the disk can end up belonging to both disk groups.
Workaround. Do not use the -f (force) option when adding a disk to a disk group.
Asynchronous reads at 32 KB I/O size cause performance problems because the system is set up for reads at 128 KB I/O size.
Workaround. None.
It is possible to create a shared disk group on the master node that has the same base minor number as a private disk group on the slave node.
Workaround. Deport the shared disk group with the same base minor number as the private disk group and remake it.
When both cluster nodes have a private disk group with the same name that contains shared disks, the nodes receive a warning message at boot because, although the group name matches, the disk IDs point to the wrong side.
Workaround. If using shared disks in private disk groups, you should not have the same disk group name on both nodes. Otherwise, you can just ignore the error message.
If quorum is lost and then regained, resilvering seems to hang.
Workaround. Stop the resilvering and restart it manually.
The S03reconfig script outputs the following incorrect error message:
vxvm:vxdisk: TO FIX: Usage:
vxdisk [-f] [-g diskgroup] [-qs] list [disk ...]
Workaround. Ignore the message.
If you change the name of a shared disk group and the master node panics, the new master may only partially pick up the name change. This may make the output of vxdisk list and vxdg list confusing.
Workaround. Deport the disk group from the new master and reimport it before bringing the node that is down back into the cluster.
vxdiskadm fails to execute an OLR procedure on a sliced disk or on a simple disk that is in a state of NODEVICE, and gives no indication of the failure.
Workaround. Do not use vxdiskadm to perform OLR. Use the manual procedure described in the ptx/SVM Administration Guide.
When you issue the vxdisk -f init disk command to create a disk access record for a disk (if one does not already exist on the disk), the following error is returned:
vxvm:vxdisk: ERROR: Device disk: No DA record exists in configuration
Workaround. Use vxdctl enable to create a disk access record for the disk.
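A minimal sketch of the workaround: rescan so that the disk access record is created, then verify that the disk appears.
# vxdctl enable
# vxdisk list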
If a mirrored volume is in read-writeback mode and n-1 plexes are offlined before the resynchronization operation is completed, the volume is left in either the NEEDSYNC or the SYNC state.
Workaround. Stop and restart the volume.
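For example (diskgroup and volume are placeholders):
# vxvol -g diskgroup stop volume
# vxvol -g diskgroup start volume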
If you create a dirty-region log (DRL) subdisk on a shared volume and the master crashes, the system will hang in S03svm-awaitjoin when it comes back up.
Workaround. Do not use dirty-region logging on shared volumes.
The vxtrace command only tracks local I/O; it does not work clusterwide.
Workaround. Use vxtrace on each individual node, or just for local I/O traffic on one node.
If you create the rootdg prior to setting the node name on a system, ptx/SVM will have problems importing disk groups.
Workaround. Set the system's node name correctly before creating the root disk group. See the section entitled "Start the ptx/SVM Configuration Daemon" in Chapter 2 of the ptx/SVM Administration Guide for information on how to set up the root disk group.
If a klog update fails for any reason on all private areas belonging to a disk group, the system panics.
Workaround. Design the object configuration for each disk group such that if any piece of hardware fails, you can still access at least one private area on a disk belonging to the shared disk group.
When a plex detach state change cannot be recorded in the klog area (the klog flush fails), the volume is detached to prevent data corruption. This behavior is the same with private and shared disk groups.
A shared disk group will not be disabled when there is an error only in the klog area of the disk (a klog flush error); however, any disk group will be disabled when none of the configuration copies on the disk can be updated. A shared disk group is disabled only on the master node; there is no need to communicate the disable to the slaves, since slaves are not allowed to perform transactions. I/O can still go through the volumes.
Workaround. None.
When you use disks that are not shareable among all cluster nodes to create shared disk groups in a cluster where only one node is active, the other nodes will be prevented from subsequently joining the cluster because they cannot see those disks when they attempt to join.
Workaround. Try removing the local disks from the disk group. If that is not possible, then deport the disk group from the master so that the remaining nodes can join the cluster.
Manually resynching shared volumes from the slave is possible but not recommended, for two reasons. First, there is a slight performance cost, since the master must still be involved in some aspects of the resynching. Second, if the master dies while resynching, the resynching will hang until the slave becomes the master; if the slave also dies during this time, the volumes may be left in read/writeback mode, incurring no data loss but a moderate performance penalty.
Workaround. Rebooting both nodes can clear the condition in some cases, but can be disruptive. In other cases, it may still be necessary to manually remove and recreate the volumes to clear this state. Removing and recreating these volumes without disrupting normal operation may be difficult or in some cases impossible; customers may need to call Customer Support for assistance.
Do not use vxassist to create a mirror on the same disk as the first plex. vxassist will return the following error:
# vxassist mirror ROOTVOL disk
vxvm:vxassist: ERROR: Cannot allocate space to mirror nnnnnn block volume
Workaround. To use vxassist, specify a different spindle. Otherwise, use vxmake.
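For example, to place the mirror on a different spindle (disk02 is a hypothetical disk media name):
# vxassist mirror ROOTVOL disk02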
The vxassist man page should state that the snapstart and snapshot options do not work for alternate root partitions.
Workaround. None. Mirror the root volume instead.
A plex cannot contain subdisks that have punctuation (such as plus signs or apostrophes) in their names. (Underscores are acceptable.)
Workaround. None.
A disk with no VTOC can be specifically defined as type "nopriv" (for example, vxdisk define sd5 type=nopriv). This disk could then be added to a shared disk group (where sd5 is on a shared spindle) with vxdg -g shareddg adddisk sd5, where shareddg is a previously defined shared disk group. However, a "nopriv" ptx/SVM disk has no private area to store the configuration database, and this could lead to data corruption since the disk can be added to different disk groups at different points in time.
ATTENTION The default for a vxdisk define operation is to create a disk of type "simple". So a vxdisk define sd5 command will create sd5 to be a "simple" ptx/SVM disk.
Another manifestation of this problem is that a "nopriv" ptx/SVM disk (on a disk that has a VTOC) on a non-shared spindle can be added to different private disk groups at different points in time when no other partition from the same disk is added to the disk group. This could also lead to data corruption.
Workaround. None. Avoid defining disks on shared spindles (with no VTOCs) as "nopriv" ptx/SVM disks and then adding these disks to different disk groups at different points in time. Also avoid adding "nopriv" ptx/SVM disks (with VTOCs on non-shared spindles) to different disk groups at different points in time when no other partition from the same disk is added to the disk group. Either of these actions could lead to data corruption.
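For reference, the commands discussed above, consolidated (sd5 is the example device used in this section):
# vxdisk define sd5
# vxdisk define sd5 type=nopriv
# vxdg -g shareddg adddisk sd5
The first command creates a "simple" disk (the safe default); the second creates a "nopriv" disk, which should be avoided on shared spindles; the third adds the disk to the previously defined shared disk group shareddg, which for a "nopriv" disk risks the data corruption described above.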
The disk OLR procedure documented in Appendix C of the ptx/SVM Administration Guide fails when the replacement disk already contains a ptx/SVM private area, that is, when the replacement disk was previously used in a private disk group.
Workaround. During the vxdctl enable step of the procedure, issue vxdctl enable twice.
ptx/SVM volumes, except for the root volume and the primary swap volume, are not available in single-user mode during the boot process.
The S09logicalroot startup script attempts to mount /var/ees on the device specified in the /etc/vfstab file. If that device is a volume, the mount will fail. This can lead to problems during an upgrade of the ptx/SVM software if the system is configured to mount /var/ees on a volume.
Workaround. Do not specify a volume as the mount device for /var/ees in the /etc/vfstab file; disk partition devices can be specified.
Before a software upgrade, move all data in the volume to a contiguous partition, and then change the mount entry to point to that partition.
Disks of type "nopriv" cannot be used for hot sparing. When a disk encounters a persistent I/O failure, ptx/SVM confirms that the disk has failed by trying to update the private area of the disk with the failure information; if the update fails, ptx/SVM marks the disk as failed and hot-spares it. Since "nopriv" disks have no private area, the hot-sparing protocol has no way of validating that the disk has failed.
Additionally, since ptx/SVM requires that all spare disks contain a private area onto which any lost private data can be restored, "nopriv" disks cannot be designated as hot-spare disks.
Workaround. None. Do not designate "nopriv" disks as hot-spare disks.
The vxdg adddisk command does not check the public area length of a sliced disk that was previously under ptx/SVM control on another system, so ptx/SVM may believe a partition is a different size than it really is.
Workaround. Issue the vxdisk init command on the sliced partition before moving the disk from one ptx/SVM system to another.
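A sketch of the workaround (sd5 is a placeholder; substitute the actual sliced partition name):
# vxdisk init sd5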
If a cluster node goes down and remains down for a significant period of time, and ptx/CLUSTERS does not notify ptx/SVM that the node has gone down, ptx/SVM may generate a debug message every five seconds.
Workaround. Reboot the node that is down, or restart vxconfigd.
When vxmake -d is given a description file generated by vxprint -m, it fails with the following error:
vxmake -d desc_file
vxvm:vxmake: ERROR: TRANSACTION FAILED: Association count is incorrect
vxvm:vxmake: ERROR: associating disk-media disk with disk:
Association count is incorrect
Workaround. In the description file, either comment out the sd_num parameter or set it to 1 (instead of 4).
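A hypothetical before-and-after fragment of the description file (the exact record layout produced by vxprint -m may differ, and the '#' comment syntax is an assumption):
#sd_num=4
sd_num=1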
It is possible to boot from a node-owned disk that is part of a private disk group other than the root disk group (rootdg). After it is booted, the system still shows the ROOTVOL and SWAPVOL volumes as part of the rootdg.
Workaround. Do not try to boot from a disk that is not part of the root disk group (rootdg).
On master takeover, the new master will try to create a temporary database for each shared disk group in the /etc/vx/tempdb directory. If there are errors in creating the temporary database for any shared disk group, that disk group will be disabled.
Workaround. After the master goes to multiuser mode, the system administrator should resolve the problem that caused the creation of the temporary database file to fail. Then the disk groups can be enabled by restarting vxconfigd.
Resilvering performance on four nodes is not optimal, although significant performance degradation has not been observed.
Workaround. None.
Failing "nopriv" disks may not be replaced by hot spare disks, even when spare disks are configured in the disk group.
Workaround. Configure ptx/SVM objects on "nopriv" disks only if a "nopriv" disk failure is tolerable. We recommend that you do not use "nopriv" disks to configure ptx/SVM objects.