Troubleshooting client access to data

Use the information in this topic to troubleshoot problems that you are having with client access to data.

Problem

This topic describes how to determine why a client cannot access or update user data.

Investigation

If a client cannot create a new file
Perform the following steps until you resolve the problem:
  1. Verify that the master metadata server is online.
  2. Verify that the client is connected to the master metadata server:
    • From the client, attempt to ping or establish a Secure Shell (SSH) session with the metadata server.
    • From the Administrative command-line interface (CLI), run the lsclient command to see a list of all clients currently being served by the metadata servers in the cluster.
  3. Use the ls -l command to list the directory in which the file is to be created to verify that the path is correct and accessible.
  4. Verify that you are logged into the client with a user name that has permission to write files to directories:
    • From a UNIX® client, use the id command.
    • From a Windows® client, right-click the directory. Then click Properties > Security.
  5. Verify that your user name has permission to write files to the specific directory.
    Important: If you are logged into a UNIX client as root or a Windows client as Administrator, make sure that you are running on a privileged client.
  6. Make sure you have space in your system pool.
  7. If you are using hard quotas, check the quota of the fileset in which the parent directory resides to ensure that the fileset is attached and has sufficient space to accommodate the new file. Also check the quota if allocating blocks fails: when usage reaches the hard quota, the client cannot allocate blocks for a file until some existing blocks are freed.
    1. Access the master metadata server through either the SAN File System console or the CLI.
    2. List the filesets to view the server to which the fileset is attached, the quota percentage and type, the attach point, and the directory:
      • From the CLI, run the sfscli lsfileset -l command.
      • From the SAN File System console, click Manage Filing > Filesets.
  8. Make sure that the storage pool in which the file is to be stored has sufficient space. To identify that pool, check the active policy set for the placement rule that applies to your file name.
  9. Verify that the client can access the storage device:
    • If the client is running Data Path Optimizer (DPO), use the datapath query device command to ensure that the operating system can access the storage device.
    • On an AIX® client, use the stfsdisk command to determine which LUNs can be accessed. Then, use the lsvol -l command to correlate the LUN with the device. You can also use the lsdev -C command to see the physical LUNs.
    • On a Linux client, enter cat /proc/fs/sanfs/client/<client name>/disk/access to determine which LUNs you can access. Then use the lsvol -l command to correlate the LUN with the device. Also, check the System Log for any messages about inaccessible LUNs and volumes.
  10. Suspect a problem with corrupt SAN File System metadata.
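
On a UNIX client, steps 3 through 5 can be combined into one client-side check. The following is a minimal sketch and is not part of SAN File System. The function name check_dir_writable and the example path in the usage comment are hypothetical. The check reflects only what the local operating system reports for the current user (root usually passes regardless), so it does not replace the privileged-client, quota, or storage pool checks.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can create files
# in the given directory, based on local permission checks only.
check_dir_writable() {
    dir=$1
    # Step 3: confirm that the path exists and is a directory, then show
    # the permissions of the directory itself (-d) rather than its contents.
    if [ ! -d "$dir" ]; then
        echo "not a directory or not accessible: $dir" >&2
        return 1
    fi
    ls -ld -- "$dir"
    # Step 4: show the user and group identity currently in effect.
    id
    # Step 5: creating a file needs write and search permission on the directory.
    if [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable: $dir"
        return 0
    fi
    echo "not writable by $(id -un): $dir" >&2
    return 1
}

# Usage (hypothetical path):
#   check_dir_writable /mnt/sanfs/projects
```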

If a client cannot access an existing file
Perform the following steps until you resolve the problem:
  1. Verify that your user name has permission to read that specific file.
  2. Verify that the master MDS is online.
  3. Verify that the client has access to the cluster using SSH and the lsclient command.
  4. Run the lsclient command on the server fileset to ensure that the client lease is valid with the server.
  5. If the client is trying to access root-privileged files, ensure that the client has administrative privileges (AdmPriv) on the server.
  6. Suspect a problem with corrupt SAN File System metadata.
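
Steps 3 and 4 both depend on finding the client in the lsclient output. As a rough sketch, that output can be filtered for the client name. The column layout of lsclient is not described in this topic, so the hypothetical helper client_listed below only looks for a whitespace-separated field that equals the client name exactly. The sfscli lsclient invocation in the usage comment is an assumption made by analogy with sfscli lsfileset -l.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any whitespace-separated field read from
# standard input is exactly equal to the given client name.
client_listed() {
    awk -v name="$1" '
        {
            for (i = 1; i <= NF; i++) {
                # Append "" to both sides to force a string comparison.
                if (($i "") == (name "")) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage (assumed invocation, run where the administrative CLI is available):
#   sfscli lsclient | client_listed myclient && echo "client is listed"
```

A match only shows that the name appears in the listing. Read the full lsclient output to confirm the state of the client lease.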

If a client cannot update the attributes of an existing file
Perform the following steps until you resolve the problem:
  1. Verify that your user name has permission to write to the file.
  2. On a Windows client, you can change attributes on Windows domain files only. On a UNIX client, you can change attributes on UNIX domain files only.
  3. Verify that the master MDS is online.
  4. Verify that the client has access to the cluster using SSH and the lsclient command.
  5. Run the lsclient command on the server fileset to ensure that the client lease is valid with the server.
  6. If the client is trying to access root-privileged files, ensure that the client has administrative privileges (AdmPriv) on the server.
  7. Make sure that you are logged in as a local root superuser or a Windows administrator. Even if you are, you might not be able to update the attributes of an existing file when the client does not have administrative privileges.
  8. Suspect a problem with corrupt SAN File System metadata.
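
On a UNIX client, the local side of steps 1 and 7 can be summarized with standard commands. This is a minimal sketch under stated assumptions: the function name attr_update_check is hypothetical, and the check covers only local ownership and the root user. It cannot tell whether the client has administrative privileges on the server, which step 6 still requires you to confirm.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user is root, the owner
# of the file, or neither. Returns 0 for root or owner, 1 for neither,
# and 2 if the file does not exist.
attr_update_check() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "no such file: $file" >&2
        return 2
    fi
    # In a long numeric listing, the third field is the owner's numeric ID.
    owner_uid=$(ls -lnd -- "$file" | awk '{ print $3 }')
    my_uid=$(id -u)
    if [ "$my_uid" -eq 0 ]; then
        echo "root: also confirm that this client has administrative privileges"
        return 0
    elif [ "$my_uid" -eq "$owner_uid" ]; then
        echo "owner: local permissions allow attribute changes"
        return 0
    fi
    echo "uid $my_uid does not own $file (owner uid $owner_uid)" >&2
    return 1
}
```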

If a client cannot access any data
Perform the following steps until the problem is resolved:
  1. Verify that the client can access the cluster:
    From the client, sign on to the SAN File System console. If you cannot sign on, suspect one of the following conditions:
    • One or more metadata servers in the cluster are down. See Troubleshooting the cluster.
      Tip: The administrative agent automatically attempts to restart a metadata server if it goes down for any reason. Therefore, if you cannot access a metadata server, wait a few minutes and try again to ensure that it is not in the process of restarting.
    • An IP network problem occurred between the client and the cluster. See Isolating problems with the SAN File System.
    View the system logs and look for entries that might indicate I/O errors. Attempt to resolve these errors.
    • From a client running Windows, use the Event Viewer to view the Event Log or use the sanfstrace utility to determine errors.
    • From a client running on any supported UNIX platform, you can view logging with the syslog facility and tracing output using the sanfstrace utility.
  2. From the CLI on the master metadata server, run the lslun command to verify that the LUNs are available. To see the LUNs that are visible to each client, use the command lslun -client <client_name> for each client.
    If the LUNs are not available, you can rediscover all LUNs by running the following commands from the master metadata server:
    1. Run the stopserver command from the CLI to stop SAN File System.
    2. Run the rmmod qla2300 command.
    3. Run the modprobe qla2300 command.
    4. Run the /etc/rc.d/init.d/sanfs start command to start SAN File System.
    5. Run the startserver command from the CLI to start the metadata server.
  3. Run the lsclient command on the server to see if this client is recognized and has an active session with the server.
  4. Verify that the client can access the storage device:
    • If the client is running DPO, use the datapath query device command to ensure that the operating system can access the storage device.
    • On a client running AIX, use the stfsdisk command to determine which disks can be accessed. Then, from the CLI, use the lsvol -l command to correlate the disk with the device. You can also use the lsdev -C command to see the physical disks.
    • On a Linux client, enter cat /proc/fs/sanfs/client/<client name>/disk/access to determine which LUNs you can access. Then, run the lsvol -l command from the CLI to correlate the LUN with the device.

    To rediscover all LUNs on a client running AIX, run the client command stfsdisk -discover.

    To rediscover SAN File System LUNs on a Linux client, run the following command: $ echo 1 > /proc/fs/sanfs/client/<client name>/disk/discover

    The LUNs should automatically be rediscovered on a client running Windows. If they are not, use the following steps:
    1. Right-click My Computer.
    2. Click Manage.
    3. Click Storage > Disk Management.
    4. Click Action > Rescan Disks.
    5. Restart clients.
    • For clients running on UNIX, use the following steps:
      1. Run the rmstclient command to unmount the global namespace, remove the virtual client, and unload the file-system driver.
      2. Run the setupclient command to load the file-system driver, create the virtual client, and mount the global namespace.
    • On clients running Windows, restart the system.
  5. If the client cannot access user data, suspect the SAN route from the client. See Isolating problems with the SAN File System.
  6. Suspect a problem with corrupt SAN File System metadata.
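
The order of the LUN rediscovery commands in step 2 matters, so it can help to script them with a dry-run mode first. The following is a sketch only. The helper names run_step and rediscover_server_luns are hypothetical, and the sfscli prefix on stopserver and startserver is assumed by analogy with sfscli lsfileset -l. This topic does not say whether those commands need a server name or other arguments, so check the command reference before setting DRY_RUN=0.

```shell
#!/bin/sh
# Dry run by default: print each step instead of executing it.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Follows the order given in step 2 and stops at the first step that fails.
# The "sfscli" prefix on the first and last steps is an assumption.
rediscover_server_luns() {
    run_step sfscli stopserver &&
    run_step rmmod qla2300 &&
    run_step modprobe qla2300 &&
    run_step /etc/rc.d/init.d/sanfs start &&
    run_step sfscli startserver
}

# Usage, on the master metadata server:
#   rediscover_server_luns              # prints the five steps
#   DRY_RUN=0; rediscover_server_luns   # executes them in order
```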

If a client running Windows receives delayed write failure errors
A delayed write failure error might appear as a message box on the client desktop or in the Event Log. This message indicates that an error occurred when writing data from the local file system cache to a storage device. Perform the following steps until you resolve the problem:
  1. The message includes the name of the file where the error occurred. Note the name of this file because the data it contains might have been corrupted, and the application using this file might encounter problems using this file's data.
  2. If the file is not part of SAN File System, refer to your system documentation for resolving file system errors. If the file is part of SAN File System, view the Event Log and resolve errors that might relate to this problem.
  3. Suspect a communication problem with either the SAN or between the client and the metadata server.

If a client running UNIX receives delayed write failure errors
A delayed write failure error appears in the System Log. These errors are usually preceded in the log by SCSI or other input/output errors.
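
Because the preceding SCSI or I/O errors are the useful clue, it can help to print the relevant log lines together with their line numbers. The following is a minimal sketch. The function name scan_write_errors is hypothetical, the search patterns are guesses at typical message wording rather than exact SAN File System message text, and the log file location in the usage comment varies by platform.

```shell
#!/bin/sh
# Hypothetical helper: print log lines, prefixed with their line numbers,
# that mention SCSI, I/O errors, or a delayed write, so that their order
# in the log can be compared. Returns nonzero if nothing matches.
scan_write_errors() {
    grep -n -i -E 'scsi|i/o error|input/output error|delayed write' -- "$1"
}

# Usage (log location is an assumption and differs between platforms):
#   scan_write_errors /var/log/messages
```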

Parent topic: Troubleshooting a SAN File System client

Related reference
lsclient
Isolating problems with the SAN File System
AIX client logging and tracing
Windows client logging and tracing

Related information
Troubleshooting the local network

(C) Copyright IBM Corporation 2003, 2004. All Rights Reserved.
IBM TotalStorage SAN File System v2.2