This topic provides an overview of how to resolve problems with the SAN File System cluster.
The master metadata server manages system metadata for the entire cluster. It controls all operations that involve system metadata, such as allocation of storage space, coordination of most administrative operations, and access to the global namespace. In addition, the master metadata server can perform the same tasks as subordinate metadata servers: managing file metadata and workload for one or more filesets.
Only one metadata server at a time can act as the master in a cluster.
Metadata servers in the cluster rely on a heartbeat mechanism to verify availability. When a metadata server becomes unresponsive or fails, there are several possible reasons:
If a local network connection fails or a network partition occurs, all metadata servers can continue to function until they need to communicate with the master. A metadata server that continues to function but is not reachable from the rest of the cluster can cause metadata corruption if it is not stopped. Such a metadata server is referred to as a rogue metadata server.
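The heartbeat check described above can be illustrated with a minimal sketch. This is not the SAN File System implementation; the class name `HeartbeatMonitor`, the server names, and the 5-second timeout are all hypothetical and chosen only for illustration. The sketch shows the general idea: each server's most recent heartbeat time is recorded, and a server is flagged once no heartbeat has arrived within the timeout.

```python
import time
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Illustrative heartbeat tracker (hypothetical, not product code)."""

    timeout_s: float = 5.0  # assumed timeout; the real value is not given in this topic
    last_seen: dict = field(default_factory=dict)

    def record(self, server, now=None):
        # Store the time of the most recent heartbeat from this server.
        self.last_seen[server] = time.monotonic() if now is None else now

    def unresponsive(self, now=None):
        # Return servers whose last heartbeat is older than the timeout.
        now = time.monotonic() if now is None else now
        return sorted(
            server
            for server, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=5.0)
    monitor.record("mds1", now=0.0)
    monitor.record("mds2", now=0.0)
    monitor.record("mds1", now=4.0)  # mds1 keeps sending heartbeats
    print(monitor.unresponsive(now=6.0))
```

A missed heartbeat alone cannot distinguish a failed server from one that is still running behind a network partition. That ambiguity is why an unreachable server has to be treated as potentially rogue and not simply assumed to be down.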
The cluster must guarantee that no metadata server becomes rogue in any failure scenario. To do this, SAN File System uses one of two possible containment scenarios:
Troubleshooting a metadata server
Use the information in this topic to troubleshoot problems that you are having with a metadata server.
Troubleshooting the local network
Use the information in this topic to troubleshoot problems that you are having with the local network.
Resolution Procedures
This topic describes resolution procedures that can assist you in resolving problems with the metadata server.