Achieving Performance and Scalability
Contents
- Optimizing Performance in Large Sites
- Tuning UNIX for Large Clusters
- Tuning LSF for Large Clusters
- Monitoring Performance Metrics in Real Time
Optimizing Performance in Large Sites
As your site grows, you must tune your LSF cluster to support a large number of hosts and an increased workload.
This chapter discusses how to efficiently tune querying, scheduling, and event logging in a large cluster that scales to 5000 hosts and 100,000 jobs at any one time.
To target performance optimization to a cluster with 5000 hosts and 100,000 jobs, you must:
- Configure your operating system. See Tuning UNIX for Large Clusters
- Fine-tune LSF. See Tuning LSF for Large Clusters
What's new in LSF performance?
LSF provides parameters for tuning your cluster, which you will learn about in this chapter. However, before you calculate the values to use for tuning your cluster, consider the following enhancements to the general performance of LSF daemons, job dispatching, and event replaying:
- Both scheduling and querying are much faster
- Switching and replaying the events log file, lsb.events, is much faster. The length of the events file no longer impacts performance
- Restarting and reconfiguring your cluster is much faster
- Job submission time is constant, regardless of how many jobs are in the system
- The scalability of load updates from the slaves to the master has increased
- Load update intervals are scaled automatically
LIM startup time is also significantly improved by the LSF performance enhancements.
Tuning UNIX for Large Clusters
The following hardware and software specifications are requirements for a large cluster that supports 5,000 hosts and 100,000 jobs at any one time.
In this section
Hardware recommendation
LSF master host:
- 4 processors, one each for:
  - mbatchd
  - mbschd
  - lim
  - the operating system
- 10 GB RAM
Software requirement
To meet the performance requirements of a large cluster, increase the file descriptor limit of the operating system.
The file descriptor limit of most operating systems used to be fixed, with a limit of 1024 open files. Some operating systems, such as Linux and AIX, have removed this limit, allowing you to increase the number of file descriptors.
Increase the file descriptor limit
- To achieve efficient performance in LSF, follow the instructions in your operating system documentation to increase the number of file descriptors on the LSF master host.
tip:
To optimize your configuration, set your file descriptor limit to a value at least as high as the number of hosts in your cluster.
The following is an example configuration. Instructions vary for different operating systems, kernels, and shells. You may have already configured the host to use the maximum number of file descriptors allowed by the operating system. On some operating systems, the limit is configured dynamically.
Your cluster size is 5000 hosts. Your master host is on Linux, kernel version 2.4:
- Log in to the LSF master host as the root user.
- Add the following line to your /etc/rc.d/rc.local startup script:
echo -n "5120" > /proc/sys/fs/file-max
- Restart the operating system to apply the changes.
- In the bash shell, instruct the operating system to use the new file limits:
# ulimit -n unlimited
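To confirm that the new kernel-wide limit is in effect after the restart, you can check it directly (a quick verification sketch; the value shown simply reflects the example above):
# cat /proc/sys/fs/file-max
5120
You can also run ulimit -n in the same shell to display the per-process limit currently in effect.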
Tuning LSF for Large Clusters
To enable and sustain large clusters, you need to tune LSF for efficient querying, dispatching, and event log management.
In this section
- Managing scheduling performance
- Limiting the number of batch queries
- Improving the speed of host status updates
- Managing your user's ability to move jobs in a queue
- Managing the number of pending reasons
- Achieving efficient event switching
- Automatic load updating
- Managing the I/O performance of the info directory
- Processor binding for LSF job processes
- Increasing the job ID limit
Managing scheduling performance
For fast job dispatching in a large cluster, configure the following parameters:
LSB_MAX_JOB_DISPATCH_PER_SESSION in lsf.conf
The maximum number of jobs the scheduler can dispatch in one scheduling session.
Some operating systems, such as Linux and AIX, let you increase the number of file descriptors that can be allocated on the master host. You do not need to limit the number of file descriptors to 1024 if you want fast job dispatching. To take advantage of the greater number of file descriptors, you must set LSB_MAX_JOB_DISPATCH_PER_SESSION to a value greater than 300.
Set LSB_MAX_JOB_DISPATCH_PER_SESSION to one-half the value of MAX_SBD_CONNS. This setting configures mbatchd to dispatch jobs at a high rate while maintaining the processing speed of other mbatchd tasks.
MAX_SBD_CONNS in lsb.params
The maximum number of open file connections between mbatchd and sbatchd.
Specify a value equal to the number of hosts in your cluster plus a buffer. For example, if your cluster includes 4000 hosts, set:
MAX_SBD_CONNS=4100
LSF_SERVER_HOSTS in lsf.conf
Highly recommended for large clusters to decrease the load on the master LIM. This parameter forces the client sbatchd to contact the local LIM for host status and load information. The client sbatchd only contacts the master LIM or a LIM on one of the LSF_SERVER_HOSTS if it cannot find the information locally.
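A minimal lsf.conf sketch for this parameter (the host names are hypothetical; list the server hosts in your own cluster):
LSF_SERVER_HOSTS="hostA hostB hostC"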
Enable fast job dispatch
- Log in to the LSF master host as the root user.
- Increase the system-wide file descriptor limit of your operating system if you have not already done so.
- In lsb.params, set MAX_SBD_CONNS equal to the number of hosts in the cluster plus a buffer.
- In lsf.conf, set the parameter LSB_MAX_JOB_DISPATCH_PER_SESSION to a value greater than 300 and less than or equal to one-half the value of MAX_SBD_CONNS.
For example, for a cluster with 4000 hosts:
LSB_MAX_JOB_DISPATCH_PER_SESSION=2050
MAX_SBD_CONNS=4100
- In lsf.conf, define the parameter LSF_SERVER_HOSTS to decrease the load on the master LIM.
- In the shell you used to increase the file descriptor limit, shut down the LSF batch daemons on the master host:
badmin hshutdown
- Run badmin mbdrestart to restart the LSF batch daemons on the master host.
- Run badmin hrestart all to restart every sbatchd in the cluster.
note:
When you shut down the batch daemons on the master host, all LSF services are temporarily unavailable, but existing jobs are not affected. When mbatchd is later started by sbatchd, its previous status is restored and job scheduling continues.
Enable continuous scheduling
- To enable the scheduler to run continuously, define the parameter JOB_SCHEDULING_INTERVAL=0 in lsb.params.
Limiting the number of batch queries
In large clusters, job querying can grow very quickly. If your site sees a lot of high-traffic job querying, you can tune LSF to limit the number of job queries that mbatchd can handle. This helps decrease the load on the master host.
If a job information query is sent after the limit has been reached, an error message is displayed and mbatchd keeps retrying at one-second intervals. If the number of job queries later drops below the limit, mbatchd handles the query.
You define the maximum number of concurrent job queries to be handled by mbatchd in the parameter MAX_CONCURRENT_JOB_QUERY in lsb.params:
- If mbatchd is using multithreading, a dedicated query port is defined by the parameter LSB_QUERY_PORT in lsf.conf. When mbatchd has a dedicated query port, the value of MAX_CONCURRENT_JOB_QUERY sets the maximum number of queries that can be handled by each child mbatchd that is forked by mbatchd. This means that the total number of job queries handled can be more than the number specified by MAX_CONCURRENT_JOB_QUERY (MAX_CONCURRENT_JOB_QUERY multiplied by the number of child daemons forked by mbatchd); see the sketch after this list.
- If mbatchd is not using multithreading, the value of MAX_CONCURRENT_JOB_QUERY sets the maximum total number of job queries that can be handled by mbatchd.
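A sketch of the multithreaded case (the port number is hypothetical; use any unused port on the master host):
LSB_QUERY_PORT=6891 (in lsf.conf)
MAX_CONCURRENT_JOB_QUERY=20 (in lsb.params)
With these settings, each child mbatchd forked to serve the query port handles at most 20 concurrent job queries, so the total handled across all children can exceed 20.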
Syntax
MAX_CONCURRENT_JOB_QUERY=max_query
Where:
max_query
Specifies the maximum number of job queries that can be handled by mbatchd. Valid values are positive integers between 1 and 100. The default value is unlimited.
Examples
MAX_CONCURRENT_JOB_QUERY=20
Specifies that no more than 20 queries can be handled by mbatchd.
MAX_CONCURRENT_JOB_QUERY=101
Incorrect value. The default value will be used. An unlimited number of job queries will be handled by mbatchd.
Improving the speed of host status updates
To improve the speed with which mbatchd obtains and reports host status, configure the parameter LSB_SYNC_HOST_STAT_LIM in the file lsb.params. This also improves the speed with which LSF reschedules jobs: the sooner LSF knows that a host has become unavailable, the sooner LSF reschedules any rerunnable jobs executing on that host.
. This also improves the speed with which LSF reschedules jobs: the sooner LSF knows that a host has become unavailable, the sooner LSF reschedules any rerunnable jobs executing on that host.For example, during maintenance operations, the cluster administrator might need to shut down half of the hosts at once. LSF can quickly update the host status and reschedule any rerunnable jobs that were running on the unavailable hosts.
When you define this parameter, mbatchd periodically obtains the host status from the master LIM, and then verifies the status by polling each sbatchd at an interval defined by the parameters MBD_SLEEP_TIME and LSB_MAX_PROBE_SBD.
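A minimal lsb.params sketch, assuming the parameter is enabled with a simple Y value (check the lsb.params reference for your LSF version to confirm the exact syntax):
LSB_SYNC_HOST_STAT_LIM=Y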
Managing your user's ability to move jobs in a queue
JOB_POSITION_CONTROL_BY_ADMIN=Y allows an LSF administrator to control whether users can use btop and bbot to move jobs to the top and bottom of queues. When set, only the LSF administrator (including any queue administrators) can use bbot and btop to move jobs within a queue. A user attempting to use bbot or btop receives the error "User permission denied."
remember:
You must be an LSF administrator to set this parameter.
Managing the number of pending reasons
For efficient, scalable management of pending reasons, use CONDENSE_PENDING_REASONS=Y in lsb.params to condense all the host-based pending reasons into one generic pending reason.
If a job has no other main pending reason, bjobs -p or bjobs -l will display the following:
Individual host based reasons
If you condense host-based pending reasons, but require a full pending reason list, you can run the following command:
badmin diagnose <job_ID>
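For example, to display the full host-based pending reason list for a job (the job ID is illustrative):
badmin diagnose 1234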
remember:
You must be an LSF administrator or a queue administrator to run this command.
Achieving efficient event switching
Periodic switching of the event file can weaken the performance of mbatchd, which automatically backs up and rewrites the events file after every 1000 batch job completions. The old lsb.events file is moved to lsb.events.1, and each old lsb.events.n file is moved to lsb.events.n+1.
Change the frequency of event switching with the following two parameters in lsb.params:
- MAX_JOB_NUM specifies the number of batch jobs to complete before lsb.events is backed up and moved to lsb.events.1. The default value is 1000.
- MIN_SWITCH_PERIOD controls how frequently mbatchd checks the number of completed batch jobs.
The two parameters work together. Specify the MIN_SWITCH_PERIOD value in seconds.
For example:
MAX_JOB_NUM=1000
MIN_SWITCH_PERIOD=7200
This instructs mbatchd to check every two hours whether the events file has logged 1000 batch job completions. The two parameters can control the frequency of the events file switching as follows:
- After two hours, mbatchd checks the number of completed batch jobs. If 1000 completed jobs have been logged, it switches the events file.
- If 1000 jobs complete after five minutes, mbatchd does not switch the events file until the end of the two-hour period.
tip:
For large clusters, set MIN_SWITCH_PERIOD to a value equal to or greater than 600. This causes mbatchd to fork a child process that handles event switching, thereby reducing the load on mbatchd. mbatchd terminates the child process and appends delta events to the new events file after the MIN_SWITCH_PERIOD has elapsed. If you define a value less than 600 seconds, mbatchd does not fork a child process for event switching.
Automatic load updating
Periodically, the LIM daemons exchange load information. In large clusters, let LSF manage these updates automatically by dynamically adjusting the exchange period based on the load.
important:
For automatic tuning of the load exchange interval, make sure the parameter EXINTERVAL in the lsf.cluster.cluster_name file is not defined. Do not configure your cluster to exchange load information at specific intervals.
Managing the I/O performance of the info directory
In large clusters, users submit large numbers of jobs. Since each job generally has a job file, this results in a large number of job files stored in the LSB_SHAREDIR/cluster_name/logdir/info directory at any time. When the total size of the job files reaches a certain point, you will notice a significant delay when performing I/O operations in the info directory.
This delay is caused by a limit on the total size of files that can reside in a file server directory. The limit depends on the file system implementation. A high load on the file server delays the master batch daemon operations, and therefore slows down the overall cluster throughput.
You can prevent this delay by creating and using subdirectories under the parent directory. Each new subdirectory is subject to the file size limit, but the parent directory is not subject to the total file size of its subdirectories. Since the total file size of the info directory is divided among its subdirectories, your cluster can process more job operations before reaching the total size limit of the job files.
If your cluster has a lot of jobs resulting in a large info directory, you can tune your cluster by enabling LSF to create subdirectories in the info directory. Use MAX_INFO_DIRS in lsb.params to create the subdirectories and enable mbatchd to distribute the job files evenly throughout the subdirectories.
Syntax
MAX_INFO_DIRS=num_subdirs
Where num_subdirs specifies the number of subdirectories that you want to create under the LSB_SHAREDIR/cluster_name/logdir/info directory. Valid values are positive integers between 1 and 1024. By default, MAX_INFO_DIRS is not defined.
Run badmin reconfig to create and use the subdirectories.
Duplicate event logging
note:
If you enabled duplicate event logging, you must run badmin mbdrestart instead of badmin reconfig to restart mbatchd.
Run bparams -l to display the value of the MAX_INFO_DIRS parameter.
Example
MAX_INFO_DIRS=10
mbatchd creates ten subdirectories from LSB_SHAREDIR/cluster_name/logdir/info/0 to LSB_SHAREDIR/cluster_name/logdir/info/9.
Processor binding for LSF job processes
See also Processor Binding for Parallel Jobs.
Rapid progress in modern processor manufacturing technologies has enabled the low-cost deployment of LSF on hosts with multicore and multithread processors. The default soft affinity policy enforced by the operating system scheduler may not give optimal job performance. For example, the operating system scheduler may place all job processes on the same processor or core, leading to poor performance. Frequently switching processes as the operating system schedules and reschedules work between cores can cause cache invalidations and cache miss rates to grow large.
Processor binding for LSF job processes takes advantage of the power of multiple processors and multiple cores to provide hard processor binding functionality for sequential LSF jobs and parallel jobs that run on a single host.
restriction:
Processor binding is supported on hosts running Linux with kernel version 2.6 or higher.
For multi-host parallel jobs, LSF sets two environment variables ($LSB_BIND_JOB and $LSB_BIND_CPU_LIST) but does not attempt to bind the job to any host.
When processor binding for LSF job processes is enabled on supported hosts, job processes of an LSF job are bound to a processor according to the binding policy of the host. When an LSF job is completed (exited or done successfully) or suspended, the corresponding processes are unbound from the processor.
When a suspended LSF job is resumed, the corresponding processes are bound again to a processor. The process is not guaranteed to be bound to the same processor it was bound to before the job was suspended.
The processor binding affects the whole job process group. All job processes forked from the root job process (the job RES) are bound to the same processor.
Processor binding for LSF job processes does not bind daemon processes.
If processor binding is enabled, but the execution hosts do not support processor affinity, the configuration has no effect on the running processes. Processor binding has no effect on a single-processor host.
Processor, core, and thread-based CPU binding
By default, the number of CPUs on a host represents the number of physical processors a machine has. For LSF hosts with multiple cores, threads, and processors, ncpus can be defined by the cluster administrator to consider one of the following:
- Processors
- Processors and cores
- Processors, cores, and threads
Globally, this definition is controlled by the parameter EGO_DEFINE_NCPUS in lsf.conf or ego.conf. The default behavior for ncpus is to consider only the number of physical processors (EGO_DEFINE_NCPUS=procs).
tip:
When PARALLEL_SCHED_BY_SLOT=Y in lsb.params, the resource requirement string keyword ncpus refers to the number of slots instead of the number of processors; however, lshosts output will continue to show ncpus as defined by EGO_DEFINE_NCPUS in lsf.conf.
Binding job processes randomly to multiple processors, cores, or threads may affect job performance. Processor binding configured with LSF_BIND_JOB in lsf.conf or BIND_JOB in lsb.applications detects the EGO_DEFINE_NCPUS policy to bind the job processes by processor, core, or thread (PCT).
For example, if a host's PCT policy is set to processor (EGO_DEFINE_NCPUS=procs) and the binding option is set to BALANCE, the first job process is bound to the first physical processor, the second job process is bound to the second physical processor, and so on.
If the host's PCT policy is set to core level (EGO_DEFINE_NCPUS=cores) and the binding option is set to BALANCE, the first job process is bound to the first core on the first physical processor, the second job process is bound to the first core on the second physical processor, the third job process is bound to the second core on the first physical processor, and so on.
If the host's PCT policy is set to thread level (EGO_DEFINE_NCPUS=threads) and the binding option is set to BALANCE, the first job process is bound to the first thread on the first physical processor, the second job process is bound to the first thread on the second physical processor, the third job process is bound to the second thread on the first physical processor, and so on.
BIND_JOB=BALANCE
The BIND_JOB=BALANCE option instructs LSF to bind the job based on the load of the available processors/cores/threads. For each slot:
- If the PCT level is set to processor, the lowest loaded physical processor runs the job.
- If the PCT level is set to core, the lowest loaded core on the lowest loaded processor runs the job.
- If the PCT level is set to thread, the lowest loaded thread on the lowest loaded core on the lowest loaded processor runs the job.
If there is a single 2-processor quad core host and you submit a parallel job with -n 2 -R"span[hosts=1]" when the PCT level is core, the job is bound to the first core on the first processor and the first core on the second processor.
After submitting another three jobs with -n 2 -R"span[hosts=1]", each new job is likewise bound to one core on each of the two processors.
If PARALLEL_SCHED_BY_SLOT=Y is set in lsb.params, the job specifies a maximum and minimum number of job slots instead of processors. If the MXJ value is set to 16 for this host (there are 16 job slots on this host), LSF can dispatch more jobs to this host. Another job submitted to this host is bound to the first core on the first processor and the first core on the second processor.
BIND_JOB=PACK
The BIND_JOB=PACK option instructs LSF to try to pack all the processes onto a single processor. If this cannot be done, LSF tries to use as few processors as possible. Email is sent to you after job dispatch and when the job finishes. If no processors/cores/threads are free (when the PCT level is processor/core/thread level), LSF tries to use the BALANCE policy for the new job.
LSF depends on the order of processor IDs to pack jobs onto a single processor.
If the PCT level is processor (the default value after installation), there is no difference between BALANCE and PACK.
This option binds jobs to a single processor where it makes sense, but does not oversubscribe the processors/cores/threads. The other processors are used when they are needed. For instance, when the PCT level is core level, if there is a single 4-processor quad core host and 4 sequential jobs are already bound to the first processor, the 5th through 8th sequential jobs are bound to the second processor.
If you submit three single-host parallel jobs with -n 2 -R"span[hosts=1]" when the PCT level is core level, the first job is bound to the first and second cores of the first processor, and the second job is bound to the third and fourth cores of the first processor. Binding the third job to the first processor would oversubscribe its cores, so the third job is bound to the first and second cores of the second processor.
After JOB1 and JOB2 finish, if you submit one single-host parallel job with -n 2 -R"span[hosts=1]", the job is bound to the third and fourth cores of the second processor.
BIND_JOB=ANY
BIND_JOB=ANY binds the job to the first N available processors/cores/threads with no regard for locality. If the PCT level is core, LSF binds the first N available cores regardless of whether they are on the same processor or not. LSF arranges the order based on APIC ID.
If the PCT level is processor (the default value after installation), there is no difference between ANY and BALANCE.
For example, consider a single 2-processor quad core host and its mapping of APIC IDs to logical processor and core IDs.
If the PCT level is core level and you submit two jobs to this host with -n 3 -R "span[hosts=1]", the first job is bound to the first, second, and third cores of the first physical processor, and the second job is bound to the fourth core of the first physical processor and the first and second cores of the second physical processor.
BIND_JOB=USER
BIND_JOB=USER binds the job to the value of $LSB_USER_BIND_JOB as specified in the user submission environment. This allows the administrator to delegate binding decisions to the actual user. This value must be one of Y, N, NONE, BALANCE, PACK, or ANY. Any other value is treated as ANY.
BIND_JOB=USER_CPU_LIST
BIND_JOB=USER_CPU_LIST binds the job to the explicit logical CPUs specified in the environment variable $LSB_USER_BIND_CPU_LIST. LSF does not check that the value is valid for the execution host(s). It is the user's responsibility to correctly specify the CPU list for the hosts they select.
The correct format of $LSB_USER_BIND_CPU_LIST is a comma-separated list that may contain individual CPU IDs and ranges. For example, 0,5,7,9-11.
If the value's format is not correct or there is no such environment variable, jobs are not bound to any processor.
If the format is correct but it cannot be mapped to any logical CPU, the binding fails. If it can be mapped to some CPUs, the job is bound to the mapped CPUs. For example, with a two-processor quad core host whose logical CPU IDs are 0-7:
- If user1 specifies 9,10 in $LSB_USER_BIND_CPU_LIST, the job is not bound to any CPUs.
- If user2 specifies 1,2,9 in $LSB_USER_BIND_CPU_LIST, the job is bound to CPUs 1 and 2.
If the value's format is not correct or it does not apply to the execution host, the related information is added to the email sent to users after job dispatch and job finish.
If the user specifies a minimum and a maximum number of processors for a single-host parallel job, LSF may allocate any number of processors between these two numbers for the job. In this case, LSF binds the job according to the CPU list specified by the user.
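For example, assuming the cluster or application profile is configured with the USER_CPU_LIST option, a user might set the variable in a bash submission environment before submitting a job (the CPU list and job command are illustrative only):
export LSB_USER_BIND_CPU_LIST="0,2,4-6"
bsub ./myjob.sh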
BIND_JOB=NONE
BIND_JOB=NONE is functionally equivalent to the former BIND_JOB=N, where processor binding is disabled.
Feature interactions
- Existing CPU affinity features
- Processor binding of LSF job processes will not take effect on a master host with the following parameters configured.
- MBD_QUERY_CPUS
- LSF_DAEMONS_CPUS
- EGO_DAEMONS_CPUS
- IRIX cpusets
- Processor binding cannot be used with IRIX cpusets. If an execution host is configured as part of a cpuset, processor binding is disabled on that host.
- Job requeue, rerun, and migration
- When a job is requeued, rerun or migrated, a new job process is created. If processor binding is enabled when the job runs, the job processes will be bound to a processor.
badmin hrestart
badmin hrestart restarts a new sbatchd. If a job process has already been bound to a processor, processor binding for the job processes is restored after sbatchd is restarted.
badmin reconfig
- If the BIND_JOB parameter is modified in an application profile, badmin reconfig only affects pending jobs. The change does not affect running jobs.
- MultiCluster job forwarding model
- In a MultiCluster environment, the behavior is similar to the current application profile behavior. If the application profile name specified in the submission cluster is not defined in the execution cluster, the job is rejected. If the execution cluster has the same application profile name, but does not enable processor binding, the job processes are not bound at the execution cluster.
Enable processor binding for LSF job processes
LSF supports the following binding options for sequential jobs and parallel jobs that run on a single host:
- BALANCE
- PACK
- ANY
- USER
- USER_CPU_LIST
- NONE
- Enable processor binding cluster-wide or in an application profile.
- Cluster-wide configuration (lsf.conf)
  - Define LSF_BIND_JOB in lsf.conf to enable processor binding for all execution hosts in the cluster. On the execution hosts that support this feature, job processes will be hard bound to selected processors.
- Application profile configuration (lsb.applications)
  - Define BIND_JOB in an application profile configuration in lsb.applications to enable processor binding for all jobs submitted to the application profile. On the execution hosts that support this feature, job processes will be hard bound to selected processors.
If BIND_JOB is not set in an application profile in lsb.applications, the value of LSF_BIND_JOB in lsf.conf takes effect. The BIND_JOB parameter configured in an application profile overrides the lsf.conf setting.
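A configuration sketch for both levels; the application profile name hpc_app is hypothetical, and the binding options shown are only examples. In lsf.conf:
LSF_BIND_JOB=BALANCE
In lsb.applications:
Begin Application
NAME = hpc_app
BIND_JOB = PACK
End Application
Jobs submitted with bsub -app hpc_app would then use the PACK policy, while other jobs use the cluster-wide BALANCE setting.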
Increasing the job ID limit
By default, LSF assigns job IDs up to 6 digits. This means that no more than 999999 jobs can be in the system at once. The job ID limit is the highest job ID that LSF will ever assign, and also the maximum number of jobs in the system.
LSF assigns job IDs in sequence. When the job ID limit is reached, the count rolls over, so the next job submitted gets job ID "1". If the original job 1 remains in the system, LSF skips that number and assigns job ID "2", or the next available job ID. If you have so many jobs in the system that the low job IDs are still in use when the maximum job ID is assigned, jobs with sequential numbers could have different submission times.
Increase the maximum job ID
You cannot lower the job ID limit, but you can raise it to 10 digits. This allows longer term job accounting and analysis, and means you can have more jobs in the system, and the job ID numbers will roll over less often.
Use MAX_JOBID in lsb.params to specify any integer from 999999 to 2147483646 (for practical purposes, you can use any 10-digit integer less than this value).
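For example, a minimal lsb.params entry that raises the limit to 10 digits (the specific value is illustrative; any integer in the documented range works):
MAX_JOBID=1999999999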
Increase the job ID display length
By default, bjobs and bhist display job IDs with a maximum length of 7 characters. Job IDs greater than 9999999 are truncated on the left.
Use LSB_JOBID_DISP_LENGTH in lsf.conf to increase the width of the JOBID column in the bjobs and bhist display. When LSB_JOBID_DISP_LENGTH=10, the width of the JOBID column in bjobs and bhist increases to 10 characters.
Monitoring Performance Metrics in Real Time
Enable metric collection
Set SCHED_METRIC_ENABLE=Y in lsb.params to enable performance metric collection.
Start performance metric collection dynamically:
badmin perfmon start sample_period
Optionally, you can set a sampling period, in seconds. If no sample period is specified, the default sample period set in SCHED_METRIC_SAMPLE_PERIOD in lsb.params is used.
Stop sampling:
badmin perfmon stop
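For example, to enable collection in lsb.params and then control it dynamically (the 60-second sample period is illustrative):
SCHED_METRIC_ENABLE=Y
badmin perfmon start 60
badmin perfmon stop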
SCHED_METRIC_ENABLE and SCHED_METRIC_SAMPLE_PERIOD can be specified independently. That is, you can specify SCHED_METRIC_SAMPLE_PERIOD without specifying SCHED_METRIC_ENABLE. In this case, when you turn on the feature dynamically (using badmin perfmon start), the sampling period value defined in SCHED_METRIC_SAMPLE_PERIOD is used.
badmin perfmon start and badmin perfmon stop override the configuration setting in lsb.params. Even if SCHED_METRIC_ENABLE is set, if you run badmin perfmon start, performance metric collection is started, and if you run badmin perfmon stop, performance metric collection is stopped.
Tune the metric sampling period
Set SCHED_METRIC_SAMPLE_PERIOD in lsb.params to specify an initial cluster-wide performance metric sampling period.
Set a new sampling period in seconds:
badmin perfmon setperiod sample_period
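For example, to change the sampling period to 120 seconds while collection is running (the value is illustrative):
badmin perfmon setperiod 120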
Collecting and recording performance metric data may affect the performance of LSF. Smaller sampling periods will result in the lsb.streams file growing faster.
Display current performance
Run badmin perfmon view to view real-time performance metric information. The following metrics are collected and recorded in each sample period:
- The number of queries handled by mbatchd
- The number of queries for each of jobs, queues, and hosts (bjobs, bqueues, and bhosts, as well as other daemon requests)
- The number of jobs submitted (divided into job submission requests and jobs actually submitted)
- The number of jobs dispatched
- The number of jobs completed
- The number of jobs sent to remote clusters
- The number of jobs accepted from remote clusters
badmin perfmon view
Performance monitor start time: Fri Jan 19 15:07:54
End time of last sample period: Fri Jan 19 15:25:55
Sample period : 60 Seconds
------------------------------------------------------------------
Metrics                              Last     Max     Min     Avg   Total
------------------------------------------------------------------
Total queries                           0      25       0       8     159
Jobs information queries                0      13       0       2      46
Hosts information queries               0       0       0       0       0
Queue information queries               0       0       0       0       0
Job submission requests                 0      10       0       0      10
Jobs submitted                          0     100       0       5     100
Jobs dispatched                         0       0       0       0       0
Jobs completed                          0      13       0       5     100
Jobs sent to remote cluster             0      12       0       5     100
Jobs accepted from remote cluster       0       0       0       0       0
------------------------------------------------------------------
File Descriptor Metrics              Free    Used   Total
------------------------------------------------------------------
MBD file descriptor usage             800     424    1024
Performance metrics information is calculated at the end of each sampling period. Running badmin perfmon before the end of the sampling period displays metric data collected from the sampling start time to the end of the last sample period.
If no metrics have been collected because the first sampling period has not yet ended, badmin perfmon view displays:
badmin perfmon view
Performance monitor start time: Thu Jan 25 22:11:12
End time of last sample period: Thu Jan 25 22:11:12
Sample period : 120 Seconds
------------------------------------------------------------------
No performance metric data available. Please wait until first sample period ends.
badmin perfmon output
Sample Period
Current sample period
Performance monitor start time
The start time of sampling
End time of last sample period
The end time of last sampling period
Metric
The name of the metric
Total
The accumulated metric counter value for each metric. It is counted from the performance monitor start time to the end time of the last sample period.
Last Period
Last sampling value of metric. It is calculated per sampling period. It is represented as the metric value per period, and normalized by the following formula.
Max
Maximum sampling value of metric. It is re-evaluated in each sampling period by comparing Max and Last Period. It is represented as the metric value per period.
Min
Minimum sampling value of metric. It is re-evaluated in each sampling period by comparing Min and Last Period. It is represented as the metric value per period.
Avg
Average sampling value of metric. It is recalculated in each sampling period. It is represented as the metric value per period, and normalized by the following formula.
Reconfiguring your cluster with performance metric sampling enabled
badmin mbdrestart
If performance metric sampling is enabled dynamically with badmin perfmon start, you must enable it again after running badmin mbdrestart. If performance metric sampling is enabled by default, StartTime will be reset to the point at which mbatchd is restarted.
badmin reconfig
If the SCHED_METRIC_ENABLE and SCHED_METRIC_SAMPLE_PERIOD parameters are changed, badmin reconfig is the same as badmin mbdrestart.
Performance metric logging in lsb.streams
By default, collected metrics are written to lsb.streams. However, performance metric collection can still be turned on even if ENABLE_EVENT_STREAM=N is defined. In this case, no metric data is logged.
- If EVENT_STREAM_FILE is defined and is valid, collected metrics are written to EVENT_STREAM_FILE.
- If ENABLE_EVENT_STREAM=N is defined, metric data is not logged.
Job arrays
Only one submission request is counted. Element jobs are counted for jobs submitted, jobs dispatched, and jobs completed.
Job rerun
Job rerun occurs when an execution host becomes unavailable while a job is running. The job is put back into its original queue and is dispatched again when a suitable host is available. In this case, only one submission request, one job submitted, n jobs dispatched, and n jobs completed are counted (n represents the number of times the job reruns before it finishes successfully).
Job requeue
Requeued jobs may be dispatched, run, and exit repeatedly due to particular errors. The job data always remains in memory, so LSF counts only one job submission request and one job submitted, but counts more than one job dispatched.
For jobs completed, if a job is requeued with brequeue, LSF counts two jobs completed, since requeuing a job first kills the job and later puts the job back into the pending list. If the job is automatically requeued, LSF counts one job completed when the job finishes successfully.
Job replay
When job replay is finished, submitted jobs are not counted in job submission and job submitted, but are counted in job dispatched and job finished.