Understanding the workload

This topic helps you determine the workload generated by various application types.

Context

The first step in sizing is to estimate how many objects the application works with, either at steady-state workload or at peak workload. Some sizing exercises are better done with peak application workloads than with steady-state workloads; the choice is up to you. Size the application in terms of namespace, or file system, objects.

You should also obtain the hotset, because this number has a direct impact on metadata cache performance. The hotset is the number of active objects that the application accesses during its peak or steady-state operation. How you determine the hotset depends on the type of application. For example, a mail server installation might have millions of user accounts, but if only ten percent of the users are active at any time, the hotset is ten percent of the total number of objects.
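The mail server example can be worked through as a short calculation. The following is a minimal sketch only: the function name `hotset_size` and the figure of five million objects are illustrative assumptions, not values taken from the product.

```python
def hotset_size(total_objects, active_fraction):
    """Estimate the hotset: the number of objects active at one time.

    total_objects   -- total number of file system objects (assumed known)
    active_fraction -- share of objects active at peak or steady state, 0.0 to 1.0
    """
    if total_objects < 0:
        raise ValueError("total_objects must not be negative")
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0.0 and 1.0")
    return round(total_objects * active_fraction)


# Hypothetical mail server: 5 million objects, ten percent active at any time.
mail_hotset = hotset_size(5_000_000, 0.10)
print(mail_hotset)
```

Run the same calculation once with the steady-state fraction and once with the peak fraction to see how far apart the two estimates are before you choose which one to size for.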

To classify an application, break down its interaction with a generic file system. Detailed information about the application makes this easier. From a sizing perspective, what matters is the mix of file operations that the application generates. For your reference, the following table provides a sample of application types and the mix of file operations that each typically generates. Try to place your application in one of the listed categories. If it does not fit any of them, gather similar data for your application type.

Op Type       Spec      Web       Web       Database  Peer-Peer Mail      News      D bench   Warehouse (DB2®)   Warehouse (DB2)
              1997      server    proxy     (OLTP)              server    server              using Direct I/O   using Direct I/O
                                                                                              and DMS            and DMS
lookup        27%       14%       14%       0%        1%        27%       1%        61%       1%                 13%
read          18%       28%       6%        61%       54%       14%       22%       3%        15%                0%
read direct   0%        0%        0%        0%        0%        0%        0%        0%        32%                78%
write         9%        0%        23%       31%       35%       24%       64%       16%       32%                4%
getattr       11%       55%       18%       3%        1%        3%        0%        7%        0%                 0%
readlink      7%        0%        0%        0%        0%        0%        0%        8%        0%                 0%
readdir       2%        1%        1%        0%        0%        0%        1%        0%        0%                 0%
create        1%        0%        11%       0%        1%        0%        1%        1%        0%                 0%
remove        1%        0%        11%       0%        1%        0%        1%        1%        0%                 0%
mkdir         0%        0%        0%        0%        0%        0%        0%        0%        0%                 0%
fsstat        1%        1%        1%        0%        1%        1%        1%        0%        0%                 0%
setattr       1%        0%        0%        0%        0%        4%        0%        0%        0%                 0%
readdir plus  9%        0%        0%        0%        0%        0%        0%        0%        0%                 0%
access        7%        1%        4%        1%        1%        3%        1%        1%        0%                 1%
commit        5%        0%        1%        4%        5%        24%       8%        0%        0%                 0%
map/unmap     0%        0%        0%        0%        0%        0%        0%        0%        0%                 3%
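One way to place an application in a category is to compare its measured operation mix with the columns of the table. The following is a minimal sketch under stated assumptions: the operation counts in `observed` are invented for illustration, only four of the table's columns are copied into `PROFILES`, and the sum of absolute percentage differences is just one plausible distance measure, not a method prescribed by the product.

```python
# Operation types, in the same order as the rows of the table.
OPS = ["lookup", "read", "read direct", "write", "getattr", "readlink",
       "readdir", "create", "remove", "mkdir", "fsstat", "setattr",
       "readdir plus", "access", "commit", "map/unmap"]

# Percentages copied from four columns of the table (subset for brevity).
PROFILES = {
    "Spec 1997":       [27, 18, 0, 9, 11, 7, 2, 1, 1, 0, 1, 1, 9, 7, 5, 0],
    "Web server":      [14, 28, 0, 0, 55, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0],
    "Database (OLTP)": [0, 61, 0, 31, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 4, 0],
    "Mail server":     [27, 14, 0, 24, 3, 0, 0, 0, 0, 0, 1, 4, 0, 3, 24, 0],
}


def to_percent_mix(counts):
    """Convert raw per-operation counts into percentages, ordered as OPS."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no operations were counted")
    return [100.0 * counts.get(op, 0) / total for op in OPS]


def closest_profile(counts):
    """Return the profile name whose mix differs least from the observed mix."""
    mix = to_percent_mix(counts)

    def distance(name):
        # Sum of absolute differences between observed and reference percentages.
        return sum(abs(a - b) for a, b in zip(mix, PROFILES[name]))

    return min(PROFILES, key=distance)


# Hypothetical counts gathered from an application trace.
observed = {"getattr": 5200, "read": 3000, "lookup": 1500,
            "access": 150, "readdir": 150}
print(closest_profile(observed))
```

A closest match is only a starting point. If the smallest distance is still large, treat the application as falling outside the listed categories and gather its operation mix directly.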

Parent topic: Sizing the metadata servers based on application workload

(C) Copyright IBM Corporation 2003, 2004. All Rights Reserved.
IBM TotalStorage SAN File System v2.2