IBM(R) TotalStorage Multipath Subsystem Device Driver
Version 1.6.0.1-10
README for Linux 2.6
September 06, 2005

-------------------------------------------------------------------------------
CONTENTS

1.0 About this README file
    1.1 Who should read this README file
    1.2 How to get help
2.0 Prerequisites for SDD
3.0 SDD Change History
    3.1 Defects Fixed
        3.1.1 Common
        3.1.2 ESS/DS8000/DS6000 Defects
        3.1.3 SVC Defects
        3.1.4 SVCCISCO Defects
    3.2 New Features
    3.3 Feature Details
    3.4 Known Issues
    3.5 Correction to User's Guide
4.0 User License Agreement for IBM Device Drivers
5.0 Notices
6.0 Trademarks and Service Marks

-------------------------------------------------------------------------------
1.0 About this README file

Welcome to the IBM TotalStorage Multipath Subsystem Device Driver (SDD). This
README file contains the most recent information about the IBM TotalStorage
Multipath Subsystem Device Driver, Version 1.6.0.1-10 for Linux.

IBM recommends that you go to the following Web site to get the most current
information about this release of SDD:

http://www.ibm.com/servers/storage/support/software/sdd.html

You should carefully review the following information available through the
Web site:

1. The most current README file. This file will contain corrections to this
   README file, corrections to the SDD User's Guide, and other documentation
   updates discovered since this copy of the README was prepared.

2. The Multipath SDD User's Guide. Because SDD can be installed in many
   different environments and configurations, detailed information about each
   environment is placed in the appropriate chapter of the Multipath SDD
   User's Guide. The "Summary of Changes" section of the SDD User's Guide can
   help you quickly determine whether the latest changes affect you.

3. The Flashes. As we become aware of any information that is likely to
   impact a broad set of our customers, Flashes are prepared and posted on
   this site.
You should review this section periodically to see any new Flashes that have
been posted since your last review.

For prerequisite information, be sure to look in the Multipath SDD User's
Guide as well as the prerequisites section of this README file for the latest
updates.

1.1 Who should read this README file

This README file is intended for storage administrators, system programmers,
and performance and capacity analysts. The information in this file applies
only to customers who run:

1. ESS, DS8000, DS6000, SAN Volume Controller, or SAN Volume Controller for
   Cisco MDS 9000
2. ESS and SAN Volume Controller
3. ESS and SAN Volume Controller for Cisco MDS 9000

1.2 How to get help

Go to the following Web site for SDD technical support and for the most
current SDD documentation and support information:

http://www.ibm.com/servers/storage/support/software/sdd.html

Go to the following Web site for IBM ESS Open Systems support:

http://www.storage.ibm.com/hardsoft/products/ess/supserver.htm

Go to the following Web site for IBM TotalStorage DS8000 support:

http://www.ibm.com/servers/storage/disk/ds8000/index.html

Go to the following Web site for IBM TotalStorage DS6000 support:

http://www.ibm.com/servers/storage/disk/ds6000/index.html

Go to the following Web site for IBM TotalStorage SAN Volume Controller
support:

http://www.ibm.com/servers/storage/support/virtual/2145.html

Go to the following Web site for IBM TotalStorage SAN Volume Controller for
Cisco MDS 9000 support:

http://www.ibm.com/servers/storage/support/virtual/2062-2300.html

Call one of the following numbers to obtain nontechnical or administrative
support, such as hardware and software orders, hardware maintenance, service
contract entitlement, and invoices:

. For commercial or state and local support operations: 1-877-426-6006
  (listen to the voice prompts)
. For business partner support operations: 1-800-426-9990
. For federal government support operations: 1-800-333-6705
-------------------------------------------------------------------------------
2.0 Prerequisites for SDD

One of the following combinations of a Linux distribution and associated
vendor-compiled binary kernel packages:

SuSE SLES 9 (x86) with one of the following kernels:
    kernel-smp-2.6.5-7.139
    kernel-bigsmp-2.6.5-7.139
    kernel-smp-2.6.5-7.145
    kernel-bigsmp-2.6.5-7.145
    kernel-smp-2.6.5-7.147
    kernel-bigsmp-2.6.5-7.147
    kernel-smp-2.6.5-7.151
    kernel-bigsmp-2.6.5-7.151
    kernel-smp-2.6.5-7.155.29
    kernel-bigsmp-2.6.5-7.155.29
    kernel-smp-2.6.5-7.191
    kernel-bigsmp-2.6.5-7.191
    kernel-smp-2.6.5-7.193 *
    kernel-bigsmp-2.6.5-7.193 *

SuSE SLES 9 (ppc64) with one of the following kernels:
    kernel-pseries64-2.6.5-7.139
    kernel-pseries64-2.6.5-7.145
    kernel-pseries64-2.6.5-7.151
    kernel-pseries64-2.6.5-7.155.29
    kernel-pseries64-2.6.5-7.191
    kernel-pseries64-2.6.5-7.193 *

SuSE SLES 9 (x86_64) with one of the following kernels:
    kernel-smp-2.6.5-7.191 *
    kernel-smp-2.6.5-7.193 *

Red Hat Enterprise Linux 4 (x86) with one of the following kernels:
    kernel-smp-2.6.9-5.EL
    kernel-hugemem-2.6.9-5.EL
    kernel-smp-2.6.9-5.0.3.EL
    kernel-hugemem-2.6.9-5.0.3.EL
    kernel-smp-2.6.9-5.0.5.EL
    kernel-hugemem-2.6.9-5.0.5.EL
    kernel-smp-2.6.9-11.EL
    kernel-hugemem-2.6.9-11.EL

Red Hat Enterprise Linux 4 (x86_64) with the following kernel:
    kernel-smp-2.6.9-11.EL *

* Newly supported in this release.

-------------------------------------------------------------------------------
3.0 SDD Change History
===============================================================================
3.1 Defects Fixed

3.1.1 Common

1.6.0.1-10
o 3366 SDDSRV stops logging when the log file is removed.
o 3484 Enable system error logging for SDDSRV.
o 3489 Fix errors in adapter state and mode transitions during a fatal write
       error, i.e. an I/O to a LUN fails across all available paths,
       triggering SDD to permanently disable the vpath by putting its paths
       into DEAD state and OFFLINE mode.
1.6.0.1-8
o 3473 Ensure that the command-line 'debug' options, i.e. 'cfgvpath debug'
       and 'rmvpath debug', have consistent descriptions and behavior,
       printing the debug messages to stdout.
o 3475 Fix incomplete solution to 3461 (see below). Pass through Linux
       HDIO_GETGEO ioctl calls correctly down to the SCSI disk device for
       vpath partitions instead of just for the whole device.
o 3450 Fix problems in path state transition.

1.6.0.1-6
o 3407 Fix incomplete request queue handling fix from defect 3364.

1.6.0.1-5
o 3365 Fix cfgvpath to wait a maximum of one minute for vpath device nodes
       to appear before exiting and printing an error.
o 3364 Fix to allow vpath to handle bigger I/O requests, matching sd device
       behavior.
o 3341 Fix problem where /etc/vpath.conf was not updated with debugging
       enabled, i.e. 'rmvpath debug'.
o 3340 Fix 'datapath query wwpn' to handle more than ten adapter WWPNs.

1.6.0.1-3
o 3156 Fix potential system hang during sdd.log trace collection.

1.6.0.1-2
o 3125 Fix incorrect creation and handling of the exclusion list used by SDD
       during configuration.

3.1.2 ESS/DS8000/DS6000 defects
None

3.1.3 SVC defects
None

3.1.4 SVCCISCO defects
None

===============================================================================
3.2 New Features

1.6.0.1-10
o PCR 2889 Add SDD trace collection script for problem determination
  (sddgetdata).
o OAR 1003 Add x86_64 support for Red Hat Enterprise Linux 4.
o OAR 1012 Add x86_64 support for SuSE Linux Enterprise Server 9.

1.6.0.1-8
o sw 3474 Add support for SuSE Linux Enterprise Server 9 Service Pack 2
  (Linux 2.6 kernel).
o sw 3456 Add "makenodes" argument to the cfgvpath command to enable
  cfgvpath to create device nodes similar to what is done in Linux 2.4.

1.6.0.1-4
o PCR 2911 Add support for Red Hat Enterprise Linux 4 (Linux 2.6 kernel).

1.6.0.1-3
o PCR 2886 Add "-l" option to the datapath query device command to indicate
  paths to a non-preferred controller.
1.6.0.1-2
o PCR 2447 Add support for SuSE Linux Enterprise Server 9 Service Pack 1
  (Linux 2.6 kernel).

===============================================================================
3.3 Feature Details

1.6.0.1-10
o PCR 2889 Add SDD trace collection script (located at
  /opt/IBMsdd/bin/sddgetdata). Run the script "sddgetdata" to collect SDD
  trace data and host logs to support SDD problem determination and
  resolution. The resulting file (sdd_data_$date_time$.tar.gz) will be saved
  to the current working directory.
o OAR 1003, 1012 Add x86_64 support for Red Hat Enterprise Linux 4 and SuSE
  Linux Enterprise Server 9. See the list above for supported kernel levels.

1.6.0.1-8
o sw 3474 Add support for SuSE Linux Enterprise Server 9 Service Pack 2
  (Linux 2.6 kernel). See above for supported kernel levels.
o sw 3456 Add "makenodes" argument to the cfgvpath command to enable
  cfgvpath to create device nodes similar to what is done in Linux 2.4.
  Usually, the creation of device nodes in Linux 2.6 is done automatically
  using a combination of the udev and hotplug systems. However, it is
  sometimes desirable to have cfgvpath create the nodes manually for the
  user (one good example is during the remote boot process, when the udev
  and hotplug systems are not yet loaded). The device nodes reside at
  /dev/vpathXX and are one of the primary ways the user can access an SDD
  vpath device.

1.6.0.1-4
o PCR 2911 Add support for Red Hat Enterprise Linux (RHEL) 4 running the
  Linux 2.6 kernel. See above for supported kernel levels.

1.6.0.1-3
o PCR 2886 A new feature has been introduced to datapath to aid users in
  verifying their SAN configurations in a controller environment (such as
  with the SAN Volume Controller). The datapath query device command now has
  a new option, '-l', to display paths to non-preferred controllers.
  For example, if users have 4 paths per SDD vpath device and would like to
  configure an equal distribution between the preferred controller and the
  non-preferred controller, they will configure their environment with 2
  paths from the preferred controller and another 2 paths from the
  non-preferred one. This feature helps users verify their configuration by
  indicating which paths are from the non-preferred controller before the
  device starts operation.

  For 'datapath query device', the new '-l' option marks the non-preferred
  paths with an asterisk. This option can be used in addition to the
  existing datapath query device command. For example:

      datapath query device -l

  In the datapath query device output, the non-preferred paths are marked
  with a *, as below:

  DEV#: 35  DEVICE NAME: vpathbd  TYPE: 2145  POLICY: Optimized Sequential
  SERIAL: 60050768018b800a800000000000008c
  LUN IDENTIFIER: 60050768018b800a800000000000008c
  ============================================================================
  Path#    Adapter/Hard Disk    State    Mode     Select    Errors
     0*   Host3Channel0/sdcu    CLOSE    NORMAL        0         0
     1    Host3Channel0/sddc    CLOSE    NORMAL    22985         0
     2     Host2Channel0/sds    CLOSE    NORMAL    26398         0
     3*    Host2Channel0/sdk    CLOSE    NORMAL        0         0

1.6.0.1-2
o PCR 2447 Add support for SuSE Linux Enterprise Server (SLES) 9 Service
  Pack 1 running on the Linux 2.6 kernel. Note that the Linux 2.6 kernel is
  significantly different from the Linux 2.4 kernel supported in our
  previous releases. See above for supported kernel levels.

===============================================================================
3.4 Known Issues

o Setting the HBA queue depth

  For IBM DS6000 using SDD multipathing, it is recommended that the queue
  depth for the HBA be lowered from the default. The actual value that
  should be used is calculated based on the number of active paths to the
  storage device and the default queue depth of the adapter.
  For an example queue_depth calculation using a Qlogic HBA with 4 active
  paths, refer to the "Timeouts with Qlogic qla2x00 8.00.00 host bus adapter
  driver" note below.

o Setting SCSI midlayer timeout values due to loaded storage targets

  IBM storage devices require a longer time period to retire an I/O command
  issued by an initiator under heavy load. By default, the SCSI midlayer
  allots only 30 seconds per SCSI command before issuing an abort on the I/O
  to the initiator. We suggest setting the value to 60 seconds by default.
  Should you see SCSI errors of value 0x6000000, LUN reset messages, or
  abort I/O messages, a new timeout setting may help alleviate that
  situation. It may also be necessary to stop all I/O and allow the target
  to retire all outstanding I/O before starting I/O again with the new
  timeout.

  To set the timeout value to 60 seconds instead of 30 seconds, you can use
  one of two methods:

  (1) Emulex tool: You can acquire a script called set_timeout_target.sh at
      the Emulex Web site under the Linux tools page. Because this script
      deals with SCSI disk devices, it works equally well in environments
      that use Qlogic host bus adapters rather than Emulex HBAs. Details on
      how to use the tool are available on the Emulex Web site.

  (2) Manual process: You can manually set the timeout value through the
      sysfs interface. Execute the following command:

      echo 60 > /sys/class/scsi_device/<host>:<channel>:<target>:<lun>/timeout

      Replace the items in <> with the following (you can match them with
      the values in /proc/scsi/scsi):

      host    - host number
      channel - channel number
      target  - target number
      lun     - lun number

o Module loading at boot time

  When installing with supported fibre-channel host bus adapters (HBAs),
  SLES 9 will load the fibre-channel adapter driver earlier in the OS
  boot-up order than the internal SCSI adapter card driver. This may cause
  problems because the internal disks are usually referenced using a static
  device entry (such as "/dev/sda3") in the /etc/fstab entry for the OS to
  load the root filesystem disk.
  The loading of the HBA driver could cause a disk on the SAN (such as from
  an ESS) to be loaded as "/dev/sda" and the real local root disk to be
  moved to something like "/dev/sdb". This static naming convention could
  cause your system to crash at boot time.

  One workaround is to reorder the entries in the configuration file
  /etc/sysconfig/kernel. This file indicates the order in which the drivers
  are loaded into the initial ramdisk image Linux uses in order to boot
  (called the initrd). The INITRD_MODULES parameter determines the driver
  load order at boot time. You might have something that looks like this:

      INITRD_MODULES="lpfcdd sym53c8xxi"

  where "lpfcdd" is the Emulex HBA driver and "sym53c8xxi" is the internal
  SCSI driver. You would want to place the lpfcdd entry after the internal
  SCSI driver entry, such as this:

      INITRD_MODULES="sym53c8xxi lpfcdd"

  After you change this entry, run the command "mkinitrd". This will create
  a new initial ramdisk image with the driver load order changed. The next
  time you reboot the system, the new order will go into effect.

  If you have already configured your SAN LUNs and the system is currently
  crashing at boot time (i.e. you are seeing the symptom above), you can
  unplug the cables from the HBA to get the system to boot up normally.
  Then, follow the above steps to change the driver load order and plug the
  cables back in during the next reboot.

o Timeouts with Qlogic qla2x00 8.00.00 host bus adapter driver

  The Qlogic 8.00.00 driver enforces an I/O queue depth limit per path, not
  per LUN. This value is controlled through Qlogic's ql2xmaxqdepth
  parameter. Because the Qlogic driver does not enforce a queue depth limit
  per LUN, multiplying the number of paths to a LUN will also multiply the
  maximum queue depth per LUN. For example, using SDD with 4 paths to a LUN
  and a queue depth of 32 will allow up to 32 x 4 = 128 I/O requests to be
  queued to the LUN at any instant in time.
  Thus, using SDD with the Qlogic driver can significantly multiply the I/O
  load to a LUN versus what is normally generated using a single-pathed
  solution. In some scenarios, heavy I/O load may cause many I/O requests to
  time out because the ESS storage is overloaded with I/O requests and is
  taking longer to respond. Typically, this is indicated through a series of
  SCSI errors with a 0x20000 return code. For example, you may see a series
  of error messages for different paths in /var/log/messages that resemble:

      kernel: SCSI error : <7 0 0 4> return code = 0x20000

  In addition, saturated queues are another clue that your storage is
  overloaded. The queues are saturated if the "Pending reqs" queue depth
  values listed in the "SCSI LUN Information" section of
  /proc/scsi/qla2xxx/[port_number] are close to or equal to the queue depth
  limit. For example, part of an entry for a full path queue may look like
  this, assuming a queue depth limit of 32 is being used:

      SCSI LUN Information:
      (Id:Lun)  * - indicates lun is not registered with the OS.
      ( 0: 4): Total reqs 1817, Pending reqs 32, flags 0x0, 0:0:84 00
      ( 0: 5): Total reqs 2555, Pending reqs 32, flags 0x0, 0:0:81 00
      ( 0: 6): Total reqs 3003, Pending reqs 32, flags 0x0, 0:0:81 00
      ( 0: 7): Total reqs 1971, Pending reqs 32, flags 0x0, 0:0:81 00

  Consequently, to avoid overloading a LUN, you can manually enforce a queue
  depth limit per LUN by adjusting the Qlogic driver's queue depth limit,
  i.e. the ql2xmaxqdepth parameter. This should be set to the desired
  per-LUN queue depth divided by the number of paths to the LUN. For
  example, using SDD with 4 paths to the LUN and the Qlogic ql2xmaxqdepth
  value of 32 as the desired per-LUN value, the new Qlogic queue depth limit
  would be set to 32 / 4 = 8.

  Note that the appropriate queue depth is not determined only by the number
  of paths to each LUN; the number of LUNs per host and the number of hosts
  connected to the storage should also be considered.
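  As an illustrative sketch only, the per-path calculation above can be
  expressed in shell. The variable names below are our own; neither SDD nor
  the Qlogic driver defines them:

```shell
# Sketch of the per-path queue depth calculation described above.
# Variable names are illustrative, not SDD or Qlogic identifiers.
desired_lun_qdepth=32   # queue depth you want enforced per LUN
paths_per_lun=4         # number of SDD paths configured to each LUN

# Each path receives an equal share of the per-LUN budget (integer division).
ql2xmaxqdepth=$(( desired_lun_qdepth / paths_per_lun ))

echo "Suggested ql2xmaxqdepth: $ql2xmaxqdepth"
# prints: Suggested ql2xmaxqdepth: 8
```

  The resulting value is the one you would supply to the driver's
  ql2xmaxqdepth parameter when reloading it.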
  You can adjust the queue depth limit by reloading the Qlogic qla2xxx
  driver with the ql2xmaxqdepth parameter specified. One method is to
  specify ql2xmaxqdepth on the command line when loading the qla2xxx driver:

      modprobe qla2xxx ql2xmaxqdepth=[new_queue_depth]

  The other method is to add the line:

      options qla2xxx ql2xmaxqdepth=[new_queue_depth]

  to /etc/modprobe.conf before reloading the driver with modprobe.

===============================================================================
3.5 Correction to User's Guide

o Supported Filesystem Statement

  In the current User's Guide, we make various statements regarding specific
  filesystem support. For Linux 2.6 kernels (the SLES 9 and RHEL 4
  distributions), SDD currently supports only the following filesystems:

  o ext2
  o ext3

  Please ensure that you do not run any other filesystems on your SDD vpath
  devices.

o Change in the path state transition

  During path open or sddsrv path probing, SDD will no longer skip CLOSE
  OFFLINE paths. Their state will be changed to INVALID if the open fails,
  or to DEAD if the open succeeds while the mode remains OFFLINE. A
  CLOSED_DEAD path will stay in CLOSED_DEAD during vpath open, instead of
  changing into the INVALID state.

------------------------------------------------------------------------------
4.0 User License Agreement for IBM Device Drivers

See the LICENSE file located in /opt/IBMsdd.

-------------------------------------------------------------------------------
5.0 Notices

This information was developed for products and services offered in the
U.S.A. IBM may not offer the products, services, or features discussed in
this document in other countries. Consult your local IBM representative for
information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to
state or imply that only that IBM product, program, or service may be used.
Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However,
it is the user's responsibility to evaluate and verify the operation of any
non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not give
you any license to these patents. You can send license inquiries, in
writing, to:

    IBM Director of Licensing
    IBM Corporation
    North Castle Drive
    Armonk, NY 10504-1785
    U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the
IBM Intellectual Property Department in your country or send inquiries, in
writing, to:

    IBM World Trade Asia Corporation
    Licensing
    2-31 Roppongi 3-chome, Minato-ku
    Tokyo 106, Japan

The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION
"AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING,
BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not
allow disclaimer of express or implied warranties in certain transactions;
therefore, this statement may not apply to you.

Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those
Web sites. The materials at those Web sites are not part of the materials
for this IBM product, and use of those Web sites is at your own risk.

This information could include technical inaccuracies or typographical
errors. Changes are periodically made to the information herein; these
changes will be incorporated in new editions of the publication.
IBM may make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.

IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the
purpose of enabling: (i) the exchange of information between independently
created programs and other programs (including this one) and (ii) the mutual
use of the information which has been exchanged, should contact:

    IBM Corporation
    Information Enabling Requests
    Dept. DZWA
    5600 Cottle Road
    San Jose, CA 95193
    U.S.A.

Such information may be available, subject to appropriate terms and
conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM License
Agreement for Non-Warranted Programs.

Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environments
may vary significantly. Some measurements may have been made on
development-level systems, and there is no guarantee that these measurements
will be the same on generally available systems. Furthermore, some
measurements may have been estimated through extrapolation. Actual results
may vary. Users of this document should verify the applicable data for
their specific environment.

Information concerning non-IBM products was obtained from the suppliers of
those products, their published announcements, or other publicly available
sources. IBM has not tested those products and cannot confirm the accuracy
of performance, compatibility, or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be
addressed to the suppliers of those products.

This information contains examples of data and reports used in daily
business operations.
To illustrate them as completely as possible, the examples include the names
of individuals, companies, brands, and products. All of these names are
fictitious, and any similarity to the names and addresses used by an actual
business enterprise is entirely coincidental.

===============================================================================
IBM agreement for licensed internal code

+---- Read Before Using -----------------------------------------------+
|IMPORTANT                                                             |
|                                                                      |
|YOU ACCEPT THE TERMS OF THIS IBM LICENSE AGREEMENT FOR MACHINE CODE BY|
|YOUR USE OF THE HARDWARE PRODUCT OR MACHINE CODE. PLEASE READ THE     |
|AGREEMENT CONTAINED IN THIS BOOK BEFORE USING THE HARDWARE PRODUCT.   |
|SEE IBM agreement for licensed internal code.                         |
|                                                                      |
+----------------------------------------------------------------------+

You accept the terms of this Agreement(3) by your initial use of a machine
that contains IBM Licensed Internal Code (called "Code"). These terms apply
to Code used by certain machines IBM or your reseller specifies (called
"Specific Machines"). International Business Machines Corporation or one of
its subsidiaries ("IBM") owns copyrights in Code or has the right to license
Code. IBM or a third party owns all copies of Code, including all copies
made from them.

If you are the rightful possessor of a Specific Machine, IBM grants you a
license to use the Code (or any replacement IBM provides) on, or in
conjunction with, only the Specific Machine for which the Code is provided.
IBM licenses the Code to only one rightful possessor at a time.

Under each license, IBM authorizes you to do only the following:

1. execute the Code to enable the Specific Machine to function according to
   its Official Published Specifications (called "Specifications");

2. make a backup or archival copy of the Code (unless IBM makes one
   available for your use), provided you reproduce the copyright notice and
   any other legend of ownership on the copy.
   You may use the copy only to replace the original, when necessary; and

3. execute and display the Code as necessary to maintain the Specific
   Machine.

You agree to acquire any replacement for, or additional copy of, Code
directly from IBM in accordance with IBM's standard policies and practices.
You also agree to use that Code under these terms.

You may transfer possession of the Code to another party only with the
transfer of the Specific Machine. If you do so, you must 1) destroy all your
copies of the Code that were not provided by IBM, 2) either give the other
party all your IBM-provided copies of the Code or destroy them, and
3) notify the other party of these terms. IBM licenses the other party when
it accepts these terms. These terms apply to all Code you acquire from any
source.

Your license terminates when you no longer rightfully possess the Specific
Machine.

Actions you must not take

You agree to use the Code only as authorized above. You must not do, for
example, any of the following:

1. Otherwise copy, display, transfer, adapt, modify, or distribute the Code
   (electronically or otherwise), except as IBM may authorize in the
   Specific Machine's Specifications or in writing to you;

2. Reverse assemble, reverse compile, or otherwise translate the Code unless
   expressly permitted by applicable law without the possibility of
   contractual waiver;

3. Sublicense or assign the license for the Code; or

4. Lease the Code or any copy of it.
-------------------------------------------------------------------------------
6.0 Trademarks and service marks

The following terms are trademarks of the International Business Machines
Corporation in the United States, other countries, or both:

    AIX
    AS/400
    Enterprise Storage Server
    HACMP/6000
    IBM
    IBM logo
    iSeries
    Netfinity
    NetVista
    Operating System/400
    pSeries
    RS/6000
    Seascape
    SP
    System/360
    System/370
    System/390
    The eServer logo
    TotalStorage
    Versatile Storage Server
    xSeries
    zSeries
    z/Architecture
    z/OS

Microsoft, Windows, Windows NT, and the Windows logo are registered
trademarks of Microsoft Corporation.

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc.
in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and
other countries.

Other company, product, and service names may be trademarks or service
marks of others.

-------------------------------------------------------------------------------
(C) Copyright IBM Corporation 2000, 2002, 2003, 2004, 2005. All rights
reserved.