By Ian Lumb | October 16, 2013
Because the impact of unmanaged interactive sessions can be significant,[1] the concept of login nodes in Bright Cluster Manager was introduced in Part 1 of this series.[2] Although login nodes address many considerations relating to interactive use, they are designed to do so in a limited way. For example, in Part 1, the following consideration was outlined (emphasis added here):
"Usage constraints - As its name suggests, a login node presents as a user-facing access service. It does not offer other services, and this includes the ability to act as a compute node.[3] Because users will submit computational workloads from this login node, however, the login node will need to be incorporated as a 'submit-only node' within the context of the workload manager."
In other words, in designing and implementing a solution for interactive use based on login nodes, interactive execution of computational workloads was out of scope in Part 1 of this series.
Some users, however, will have a legitimate need for executing computational workloads interactively:
Motivated by the need to support the interactive execution of computational workloads, or interactive workloads, the second part in this series focuses on the use of WorkLoad Managers (WLMs) with Bright Cluster Manager. As the means for managing the interactive impact on the compute nodes of a cluster, Part 2 complements the introduction of login nodes that was the emphasis of Part 1.
Although the implementation specifics differ somewhat, all WLMs that interoperate with Bright Cluster Manager provide support for interactive workloads. In the simplest of cases, the need to execute computational workloads interactively is indicated through use of:
Appropriate use of job submission commands (specialized or not, with or without tailored options) is necessary for executing computational workloads interactively. On its own, however, it does not guarantee that these workloads will actually be scheduled for interactive execution: scheduling attributes that permit interactive execution must also be present. Together with an appropriate submission, these attributes form the sufficient condition for executing interactive workloads, and they are implemented in most of the WLMs that interoperate with Bright Cluster Manager. To summarize, permission to execute computational workloads interactively is handled through:
Users can have legitimate needs for executing computational workloads interactively. Because login nodes are not designed to handle such workloads, there is a need to support interactivity using compute nodes managed by workload managers. Through use of appropriate submission parameters, and configuration of the workload manager, the execution of interactive workloads can be managed efficiently and effectively through Bright Cluster Manager.
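To make the two conditions above concrete, the following Python sketch pairs each workload manager with a commonly documented way of requesting an interactive shell, and models the rule that a request alone is not enough. It is an illustration only, not a description of how Bright Cluster Manager configures any WLM. The dictionary keys and the function names `interactive_command` and `will_run_interactively` are hypothetical, and the exact commands and options available at a given site may differ from those shown.

```python
import shlex

# Commonly documented ways to request an interactive shell from each WLM.
# Assumption: sites often wrap, restrict, or rename these, so this table is
# illustrative rather than authoritative.
INTERACTIVE_SUBMIT = {
    "slurm": ["srun", "--pty", "bash", "-i"],
    "pbspro": ["qsub", "-I"],
    "lsf": ["bsub", "-Is", "bash"],
    "openlava": ["bsub", "-Is", "bash"],  # assumed to mirror LSF usage
    "gridengine": ["qrsh"],
}


def interactive_command(wlm: str) -> str:
    """Return a shell-ready command line that requests an interactive session."""
    try:
        argv = INTERACTIVE_SUBMIT[wlm.lower()]
    except KeyError:
        raise ValueError(f"no interactive submission known for {wlm!r}") from None
    return shlex.join(argv)


def will_run_interactively(requested: bool, queue_allows: bool) -> bool:
    """Model the two conditions: the user must ask for interactive execution
    (necessary), and the scheduling attributes must permit it."""
    return requested and queue_allows


if __name__ == "__main__":
    for name in INTERACTIVE_SUBMIT:
        print(f"{name:10s} {interactive_command(name)}")
```

In this sketch, `will_run_interactively(True, False)` returns `False`, which mirrors the case in which a user submits correctly but the scheduling attributes do not permit interactive execution.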
Notes:
[1] Interactive sessions that are not controlled or regulated are termed unmanaged.
[2] In Part 1, a broader context for login nodes is provided. Interactive use is one of the considerations addressed by this user-facing access service.
[3] In some cases, however, a login node may need to serve as a proxy compute node. Using this proxy mechanism, login nodes accept and forward computational workloads to compute nodes for execution. This is the default configuration for Cray supercomputers, for example. In this case, workload is forwarded through use of API calls to the Cray Application Level Placement Scheduler (ALPS) application launch and schedule utility.
Acknowledgements: In addition to numerous discussions with his colleagues at Bright Computing, the author gratefully acknowledges the assistance of Scott Suchyta (Altair), Cameron Brunner (Univa Corp.), Bill Bryce (Univa Corp.), Bill McMillan (IBM Platform) and David Bigagli (SchedMD).