Participating clusters must use the same port numbers for the daemons LIM, RES, and MBD.
By default, all clusters use identical port settings.
The default for LSF_LIM_PORT changed in LSF Version 7.0 to accommodate Platform EGO default port configuration. On EGO, default ports start with lim at 7869, and are numbered consecutively for the EGO pem, vemkd, and egosc daemons.
This is different from previous LSF releases where the default LSF_LIM_PORT was 6879. LSF res, sbatchd, and mbatchd continue to use the default pre-Version 7.0 ports 6878, 6881, and 6882.
Upgrade installation preserves existing port settings for lim, res, sbatchd, and mbatchd. EGO pem, vemkd, and egosc use default EGO ports starting at 7870, if they do not conflict with existing lim, res, sbatchd, and mbatchd ports.
To check your port numbers, look in the LSF_TOP/conf/lsf.conf file in each cluster. (LSF_TOP is the LSF installation directory. On UNIX, it is defined in the install.config file.) Make sure the port parameters for LIM, RES, and MBD, such as LSF_LIM_PORT, have identical settings in each cluster.
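The port check described above can be automated. The following Python sketch compares port parameters across the lsf.conf contents of several clusters. It assumes lsf.conf uses simple NAME=value lines with # comments. LSF_LIM_PORT comes from the text, while LSF_RES_PORT and LSB_MBD_PORT are assumed names for the RES and MBD port parameters. The sample contents are illustrative only.

```python
# Port parameters to compare. LSF_LIM_PORT is named in the text; LSF_RES_PORT
# and LSB_MBD_PORT are assumed names for the RES and MBD port settings.
PORT_PARAMS = ("LSF_LIM_PORT", "LSF_RES_PORT", "LSB_MBD_PORT")


def parse_lsf_conf(text):
    """Return a dict of NAME=value settings, skipping blank and comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip('"')
    return settings


def find_port_mismatches(conf_by_cluster, params=PORT_PARAMS):
    """Map each parameter whose value differs between clusters to {cluster: value}.

    A parameter that is missing in one cluster but set in another also counts
    as a mismatch (its value is reported as None).
    """
    mismatches = {}
    for param in params:
        values = {
            cluster: parse_lsf_conf(text).get(param)
            for cluster, text in conf_by_cluster.items()
        }
        if len(set(values.values())) > 1:
            mismatches[param] = values
    return mismatches


# Illustrative contents only; in practice, read each cluster's
# LSF_TOP/conf/lsf.conf from disk.
SAMPLE_CONFS = {
    "cluster1": "LSF_LIM_PORT=7869\nLSF_RES_PORT=6878\n",
    "cluster2": "# kept the pre-7.0 LIM port after upgrade\n"
                "LSF_LIM_PORT=6879\nLSF_RES_PORT=6878\n",
}

for param, values in find_port_mismatches(SAMPLE_CONFS).items():
    print(f"{param} differs between clusters: {values}")
```

In the sample, cluster2 stands for an upgraded cluster that preserved the pre-Version 7.0 LIM port, so only LSF_LIM_PORT is reported.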
For resource sharing to work between clusters, the clusters should have common definitions of host types, host models, and resources. Each cluster finds this information in lsf.shared, so the best way to configure MultiCluster is to make sure lsf.shared is identical for each cluster. If you do not have a shared file system, replicate lsf.shared across all clusters.
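One way to confirm that replicated copies of lsf.shared have not drifted apart is to compare content digests. The following Python sketch groups clusters by the SHA-256 digest of their lsf.shared contents. The function names and placeholder contents are hypothetical. A byte-for-byte comparison is stricter than the requirement for common definitions, so treat a difference as a prompt to inspect the files, not as proof of a misconfiguration.

```python
import hashlib


def group_by_content(files):
    """Group cluster names by the SHA-256 digest of their lsf.shared contents.

    `files` maps a cluster name to the raw bytes of that cluster's lsf.shared.
    Returns a list of groups; clusters in the same group have identical files.
    """
    groups = {}
    for cluster, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        groups.setdefault(digest, []).append(cluster)
    return list(groups.values())


def is_replicated(files):
    """Return True when every cluster has a byte-identical lsf.shared."""
    return len(group_by_content(files)) <= 1


# Placeholder contents; in practice, read each file with
# pathlib.Path(path).read_bytes().
sample = {
    "cluster1": b"shared definitions, revision 1\n",
    "cluster2": b"shared definitions, revision 1\n",
    "cluster3": b"shared definitions, revision 2\n",
}
print(group_by_content(sample))
print(is_replicated(sample))
```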
To enable MultiCluster, define all participating clusters in the Cluster section of the LSF_TOP/conf/lsf.shared file.
For ClusterName, specify the name of each participating cluster. On UNIX, each cluster name is defined by LSF_CLUSTER_NAME in the install.config file.
For Servers, specify one or more candidate master hosts for the cluster (these are the first hosts listed in the Host section of lsf.cluster.cluster_name). A cluster will not participate in MultiCluster resource sharing unless its current master host is listed here.
For example, suppose the Cluster section lists hostA and hostB as the servers for Cluster1, and only hostD for Cluster2. Here hostA should be the master host of Cluster1 (the first host listed in the Host section of lsf.cluster.cluster1), with hostB as the backup, and hostD should be the master host of Cluster2. If the master host fails in Cluster1, MultiCluster still works because the backup master is also listed. However, if the master host fails in Cluster2, MultiCluster does not recognize any other host as the master, so Cluster2 no longer participates in MultiCluster resource sharing.
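The failover rule above can be sketched in Python. The sample text assumes that the Cluster section of lsf.shared is a Begin Cluster / End Cluster block with ClusterName and Servers columns and parenthesized host lists. The function names and the host hostE are hypothetical and used only for illustration.

```python
# Assumed layout of the Cluster section in LSF_TOP/conf/lsf.shared.
SAMPLE_LSF_SHARED = """
Begin Cluster
ClusterName   Servers
Cluster1      (hostA hostB)
Cluster2      (hostD)
End Cluster
"""


def parse_cluster_section(lsf_shared_text):
    """Return {cluster_name: [server hosts]} from the Cluster section."""
    clusters = {}
    in_section = False
    for raw in lsf_shared_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        words = line.split()
        lowered = [w.lower() for w in words]
        if lowered == ["begin", "cluster"]:
            in_section = True
        elif lowered == ["end", "cluster"]:
            in_section = False
        elif in_section and lowered[0] != "clustername":  # skip header row
            hosts = line[len(words[0]):].replace("(", " ").replace(")", " ")
            clusters[words[0]] = hosts.split()
    return clusters


def still_participates(clusters, cluster_name, current_master):
    """A cluster participates only if its current master is listed in Servers."""
    return current_master in clusters.get(cluster_name, [])


clusters = parse_cluster_section(SAMPLE_LSF_SHARED)
# Cluster1 fails over to hostB, which is listed, so it keeps participating.
print(still_participates(clusters, "Cluster1", "hostB"))
# Cluster2 fails over to hostE (hypothetical), which is not listed.
print(still_participates(clusters, "Cluster2", "hostE"))
```

Listing every candidate master host under Servers, as Cluster1 does, avoids the situation shown for Cluster2.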
When Platform EGO is enabled in the LSF cluster (LSF_ENABLE_EGO=Y), you can also set several EGO parameters related to LIM, PIM, and ELIM in either lsf.conf or ego.conf.
All clusters must have the same value of EGO_PREDEFINED_RESOURCES in lsf.conf so that the nprocs, ncores, and nthreads host resources are usable in remote clusters.
See Administering Platform LSF for more information about configuring Platform LSF on EGO.