Managing Software Licenses with LSF
Software licenses are valuable resources that must be fully utilized. This section discusses how LSF can help manage licensed applications to maximize utilization and minimize job failure due to license problems.
Contents
- Using Licensed Software with LSF
- Host-locked Licenses
- Counted Host-Locked Licenses
- Network Floating Licenses
Using Licensed Software with LSF
Many applications have restricted access based on the number of software licenses purchased. LSF can help manage licensed software by automatically forwarding jobs to licensed hosts, or by holding jobs in batch queues until licenses are available.
Host-locked Licenses
Host-locked software licenses allow users to run an unlimited number of copies of the product on each of the hosts that has a license.
Configuring host-locked licenses
You can configure a Boolean resource to represent the software license, and configure your application to require the license resource. When users run the application, LSF chooses the best host from the set of licensed hosts.
See Boolean resources for information about configuring Boolean resources.
See the Platform LSF Configuration Reference for information about the lsf.task file and instructions on configuring resource requirements for an application.
Counted Host-Locked Licenses
Counted host-locked licenses are available only on specific licensed hosts, and they also limit the maximum number of copies that can run on each host.
Configuring counted host-locked licenses
You configure counted host-locked licenses by having LSF determine the number of licenses currently available. Use either of the following to count the host-locked licenses:
- External LIM (ELIM)
- A check_license shell script
Using an External LIM (ELIM)
To use an external LIM (ELIM) to get the number of licenses currently available, configure an external load index licenses giving the number of free licenses on each host. To restrict the application to run only on hosts with available licenses, specify licenses>=1 in the resource requirements for the application.
See External Load Indices for instructions on writing and using an ELIM and configuring resource requirements for an application.
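As a rough illustration of the ELIM approach, the sketch below reports a single external load index named licenses. It is a minimal sketch under stated assumptions, not a reference implementation: the helper count_free_licenses, the file it reads, and the LICENSE_COUNT_FILE and ELIM_INTERVAL variables are all hypothetical placeholders for whatever your site uses to obtain the free license count.

```shell
#!/bin/sh
# Sketch of an ELIM that reports one external load index named "licenses".

# HYPOTHETICAL helper: replace the body with the command that reports the
# number of free licenses on this host. Here the count is assumed to be kept
# in a plain file; an unreadable or empty file is treated as 0.
count_free_licenses() {
    n=$(cat "${LICENSE_COUNT_FILE:-/var/tmp/free_licenses}" 2>/dev/null)
    echo "${n:-0}"
}

# An ELIM writes lines to stdout in the form:
#   <number_of_indices> <name1> <value1> [<name2> <value2> ...]
report_licenses() {
    echo "1 licenses $(count_free_licenses)"
}

# Report repeatedly. The 60-second default interval is an assumption.
elim_main() {
    while :; do
        report_licenses
        sleep "${ELIM_INTERVAL:-60}"
    done
}

# To run this file as an ELIM, call the loop as the last line:
# elim_main
```

With a report such as "1 licenses 3", a resource requirement of licenses>=1 would match this host.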
See the Platform LSF Configuration Reference for information about the lsf.task file.
Using a check_license script
There are two ways to use a check_license shell script to check license availability and acquire a license if one is available:
- Configure the check_license script as a job-level pre-execution command when submitting the licensed job:
    bsub -m licensed_hosts -E check_license licensed_job
- Configure the check_license script as a queue-level pre-execution command. See Configuring Pre- and Post-Execution Commands for information about configuring queue-level pre-execution commands.
It is possible that the license becomes unavailable between the time the check_license script is run and the time the job actually runs. To handle this case, configure a queue so that jobs in this queue are requeued if they exit with values indicating that the license was not successfully obtained.
See Automatic Job Requeue for more information.
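The document does not show the check_license script itself. The following is a minimal sketch of what such a pre-execution check might look like, assuming a hypothetical helper free_license_count that reads the free count from a file (the file path and the LICENSE_COUNT_FILE variable are illustrative only). It only checks availability; actually acquiring a license depends on your license manager and is not shown.

```shell
#!/bin/sh
# check_license sketch: succeed (status 0) only when a license looks free.

# HYPOTHETICAL helper: replace the body with the query your license manager
# provides. An unreadable or empty file is treated as 0 free licenses.
free_license_count() {
    n=$(cat "${LICENSE_COUNT_FILE:-/var/tmp/free_licenses}" 2>/dev/null)
    echo "${n:-0}"
}

check_license() {
    if [ "$(free_license_count)" -ge 1 ] 2>/dev/null; then
        return 0    # a license appears to be available, so the job may start
    fi
    return 1        # non-zero status: the pre-execution command has failed
}

# As a standalone pre-execution script, the last line would be:
# check_license; exit $?
```

Because a pre-execution command that exits with a non-zero status prevents the job from starting, the job stays pending until a later check succeeds.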
Network Floating Licenses
A network floating license allows a fixed number of machines or users to run the product at the same time, without restricting which host the software can run on. Floating licenses are cluster-wide resources; rather than belonging to a specific host, they belong to all hosts in the cluster.
LSF can be used to manage floating licenses using the following LSF features:
- Shared resources
- Resource reservation
- Job requeuing
Using LSF to run licensed software can improve the utilization of the licenses. The licenses can be kept in use 24 hours a day, 7 days a week. For expensive licenses, this increases their value to the users. Floating licenses also increase productivity, because users do not have to wait for a license to become available.
LSF jobs can make use of floating licenses when:
- All license jobs are run through LSF
- Licenses are managed outside of LSF control
All licenses used through LSF
If all jobs requiring licenses are submitted through LSF, then LSF can regulate the allocation of licenses to jobs and ensure that a job is not started if the required license is not available. A static resource holds the total number of licenses that are available. LSF uses the static resource as a counter, which the resource reservation mechanism decrements each time a job requiring that resource is started.
Example
For example, suppose that there are 10 licenses for the Verilog package shared by all hosts in the cluster. The LSF configuration files should be specified as shown below. The resource is a static value, so an ELIM is not necessary.
lsf.shared
    Begin Resource
    RESOURCENAME  TYPE     INTERVAL  INCREASING  DESCRIPTION
    verilog       Numeric  ()        N           (Floating licenses for Verilog)
    End Resource
lsf.cluster.cluster_name
    Begin ResourceMap
    RESOURCENAME  LOCATION
    verilog       (10@[all])
    End ResourceMap
Submitting jobs
The users would submit jobs requiring verilog licenses as follows:
    bsub -R "rusage[verilog=1]" myprog
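The counter behavior described in this section can be pictured with a little arithmetic. The sketch below is a simplified model for illustration only, not LSF's scheduler logic; the function names licenses_left and can_dispatch are hypothetical.

```shell
#!/bin/sh
# Simplified model of the static license counter (NOT LSF's scheduler code).

# licenses_left TOTAL RUNNING_JOBS PER_JOB
# Prints how many licenses remain unreserved.
licenses_left() {
    echo $(( $1 - $2 * $3 ))
}

# can_dispatch TOTAL RUNNING_JOBS PER_JOB
# Succeeds if one more job requesting PER_JOB licenses still fits.
can_dispatch() {
    [ "$(licenses_left "$1" "$2" "$3")" -ge "$3" ]
}

licenses_left 10 7 1                               # prints 3
can_dispatch 10 10 1 || echo "job stays pending"   # prints the message
```

With 10 licenses and rusage[verilog=1], the eleventh job in this model waits until a running job finishes and its reservation is released.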
Licenses used outside of LSF control
To handle the situation where application licenses are used by jobs outside of LSF, use an ELIM to dynamically collect the actual number of licenses available instead of relying on a statically configured value. The ELIM periodically informs LSF of the number of available licenses, and LSF takes this into consideration when scheduling jobs.
Example
Assuming there are a number of licenses for the Verilog package that can be used by all the hosts in the cluster, the LSF configuration files could be set up to monitor this resource as follows:
lsf.shared
    Begin Resource
    RESOURCENAME  TYPE     INTERVAL  INCREASING  DESCRIPTION
    verilog       Numeric  60        N           (Floating licenses for Verilog)
    End Resource
lsf.cluster.cluster_name
    Begin ResourceMap
    RESOURCENAME  LOCATION
    verilog       ([all])
    End ResourceMap
The INTERVAL in the lsf.shared file indicates how often the ELIM is expected to update the value of the Verilog resource, in this case every 60 seconds. Since this resource is shared by all hosts in the cluster, the ELIM only needs to be started on the master host. If the Verilog licenses can only be accessed by some hosts in the cluster, specify the LOCATION field of the ResourceMap section as ([hostA hostB hostC ...]). In this case an ELIM is only started on hostA.
Submitting jobs
The users would submit jobs requiring verilog licenses as follows:
    bsub -R "rusage[verilog=1:duration=1]" myprog
Configuring a dedicated queue for floating licenses
Whether you run all license jobs through LSF or run jobs that use licenses that are outside of LSF control, you can configure a dedicated queue to run jobs requiring a floating software license.
For each job in the queue, LSF reserves a software license before dispatching a job, and releases the license when the job finishes.
Use the bhosts -s command to display the number of licenses being reserved by the dedicated queue.
Example
The following example defines a queue named q_verilog in lsb.queues dedicated to jobs that require Verilog licenses:
    Begin Queue
    QUEUE_NAME = q_verilog
    RES_REQ=rusage[verilog=1:duration=1]
    End Queue
Each job in the q_verilog queue reserves one Verilog license when it is started.
If the Verilog licenses are not cluster-wide, but can only be used by some hosts in the cluster, the resource requirement string should include the defined() tag in the select section:
    select[defined(verilog)] rusage[verilog=1]
Preventing underutilization of licenses
One limitation to using a dedicated queue for licensed jobs is that if a job does not actually use the license, then the licenses will be under-utilized. This could happen if the user mistakenly specifies that their application needs a license, or submits a non-licensed job to a dedicated queue.
LSF assumes that each job indicating that it requires a Verilog license will actually use it, and simply subtracts the total number of jobs requesting Verilog licenses from the total number available to decide whether an additional job can be dispatched.
Use the duration keyword in the queue resource requirement specification to release the shared resource after the specified number of minutes expires. This prevents multiple jobs started in a short interval from over-using the available licenses. By limiting the duration of the reservation and using the actual license usage as reported by the ELIM, underutilization is also avoided and licenses used outside of LSF can be accounted for.
When interactive jobs compete for licenses
In situations where an interactive job outside the control of LSF competes with batch jobs for a software license, it is possible that a batch job, having reserved the software license, may fail to start as its license is intercepted by an interactive job. To handle this situation, configure job requeue by using the REQUEUE_EXIT_VALUES parameter in a queue definition in lsb.queues. If a job exits with one of the values in REQUEUE_EXIT_VALUES, LSF will requeue the job.
Example
Jobs submitted to the following queue will use Verilog licenses:
    Begin Queue
    QUEUE_NAME = q_verilog
    RES_REQ=rusage[verilog=1:duration=1]
    # application exits with value 99 if it fails to get license
    REQUEUE_EXIT_VALUES = 99
    JOB_STARTER = lic_starter
    End Queue
All jobs in the queue are started by the job starter lic_starter, which checks whether the application failed to get a license and, if so, exits with an exit code of 99. This causes the job to be requeued, and LSF will attempt to reschedule it at a later time.
lic_starter job starter script
The lic_starter job starter can be coded as follows:
    #!/bin/sh
    # lic_starter: If application fails with no license, exit 99,
    # otherwise, exit 0. The application displays
    # "no license" when it fails without license available.

    # Run the job command, merging stderr into stdout so grep sees both.
    "$@" 2>&1 | grep "no license"
    # $? is the exit status of grep: 0 if the string was found.
    if [ $? -ne 0 ]
    then
        exit 0     # string not found, application got the license
    else
        exit 99    # string found, request a requeue
    fi
For more information
- See Automatic Job Requeue for more information about configuring job requeue
- See Chapter 39, "Job Starters" for more information about LSF job starters
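The lic_starter script above passes only lines containing "no license" through grep and exits 0 whenever that string is absent, whatever the application's own exit status. Where that matters, a variation along the following lines might be used. This is a sketch under assumptions, not part of LSF: the function name run_with_license_check is hypothetical, and it relies on mktemp being available on the execution host.

```shell
#!/bin/sh
# Variation on lic_starter (a sketch): keep the application's output visible
# and pass through its exit status unless the "no license" message appears.
run_with_license_check() {
    log=$(mktemp) || return 1
    "$@" > "$log" 2>&1               # capture stdout and stderr together
    status=$?
    cat "$log"                       # replay the captured output
    if grep -q "no license" "$log"; then
        status=99                    # matches REQUEUE_EXIT_VALUES = 99
    fi
    rm -f "$log"
    return "$status"
}

# As a job starter script, the last line would be:
# run_with_license_check "$@"; exit $?
```

One trade-off of this design is that output is buffered in a temporary file and appears only after the application finishes, which may not suit long-running jobs.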
Platform Computing Inc.
www.platform.com