Whether from force of habit or from actual need, users of HPC environments crave interactivity. Because unmanaged interactive sessions can have a significant impact on a cluster, that appetite is a legitimate cause for concern. Is it possible to meet users’ need for interactivity while managing its impact? In this first part of a two-part series on managing interactive impact, attention focuses on the introduction of login nodes to a cluster managed by Bright Cluster Manager. The second part in this series will focus on the use of workload managers as a complementary means for managing the interactive impact on the compute nodes of a cluster.
In the simplest case, a cluster consists of a head node plus multiple compute nodes. As the names suggest, the head node manages all the resources within its control, whereas compute nodes are those resources focused on executing computational workloads. (In addition to compute nodes, and even in the simplest of cases, head nodes typically manage network, power-distribution and potentially other devices.) As the only dual-homed node within a cluster, the head node provides services both internally (i.e., to the resources it manages, such as compute nodes, network switches, PDUs, etc.) and externally to the cluster. In this simplest of configurations, access is one of the key services provided by the head node for the entire cluster.
The purpose of this post is to introduce a solution for managing interactive impact. More specifically, the burden of delivering a user-facing access service is shifted from the head node to another managed resource in the cluster, typically identified as a login node. Configuration considerations for this user-facing access service might include:
- Access control - By definition, a user-facing access service is externally available. Because this same service allows for access to resources within the cluster, measures of access control need to be factored into the design of login nodes. (Physical and/or virtual appliances like firewalls, plus ssh for secure, encrypted communications between two untrusted hosts over an insecure network, might factor into the implementation of such a login-node design.) With the appearance of login nodes, measures of access control already in place for the head node can be enhanced - e.g., only administrators will require ssh-mediated access to the head node.
- Interactive-use services - It may be, for example, a site’s preference to allow interactive use of GUI-based tools for code development or generic use (e.g., Web browsing). To support such use, from compilers, debuggers and profilers through to X11, the default software image for nodes within the cluster will require modification.
- Responsiveness - To ensure the responsiveness of the interactive-use service that is being provided, it may make sense to offer multiple login nodes. Using load-balancing schemes (e.g., a round-robin placement based on responses to DNS queries), users seeking access to the cluster can be distributed amongst all available login nodes.
- Usage constraints - As its name suggests, a login node presents as a user-facing access service. It does not offer other services, and this includes the ability to act as a compute node. Because users will submit computational workloads from this login node, however, the login node will need to be incorporated as a ‘submit-only node’ within the context of the workload manager.
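To make the responsiveness consideration concrete, the following is a minimal sketch of how a round-robin scheme based on DNS responses spreads users across login nodes. It is a toy model in Python, not Bright's or any DNS server's implementation: the class and function names, the user names, and the addresses (taken from a documentation-only range) are all assumptions introduced for illustration. It also assumes that each client simply connects to the first address in the answer it receives.

```python
from collections import deque
from typing import Deque, Dict, List


class RoundRobinResolver:
    """Toy model of round-robin DNS: every query returns the full address
    set, rotated by one position relative to the previous answer."""

    def __init__(self, addresses: List[str]) -> None:
        if not addresses:
            raise ValueError("at least one login-node address is required")
        self._ring: Deque[str] = deque(addresses)

    def resolve(self) -> List[str]:
        answer = list(self._ring)
        self._ring.rotate(-1)  # the next query starts with the next address
        return answer


def assign_logins(resolver: RoundRobinResolver, users: List[str]) -> Dict[str, str]:
    """Assume each client connects to the first address in its answer."""
    return {user: resolver.resolve()[0] for user in users}


# Hypothetical login-node addresses (documentation range), not from the article.
LOGIN_NODES = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]

if __name__ == "__main__":
    resolver = RoundRobinResolver(LOGIN_NODES)
    users = ["alice", "bob", "carol", "dave"]
    for user, node in assign_logins(resolver, users).items():
        print(f"{user} -> {node}")
```

In this model the fourth user wraps around to the first login node. Note that the scheme balances the number of logins rather than the actual load on each node, which is why more sophisticated approaches may be needed later.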
Bright Cluster Manager allows login nodes to be created, subject to the configuration considerations outlined above. The process is as follows:
- Cloning the default image - In this step, a clone (i.e., an exact copy) is made of the default software image - i.e., the image used by default by the compute nodes in the cluster.
- Creating the login node category - In Bright, a node category is a group of regular nodes that share the same configuration. In this step, a new node category is created expressly for the login nodes.
- Modifying the cloned image - From user-facing applications (e.g., development and/or visualization tools) to enabling libraries and/or APIs (e.g., X11), modifications to the cloned image are made during this step.
- Assigning a submit-only role - In keeping with the considerations outlined previously, login nodes are no longer available to execute computational workloads. Thus, it is the purpose of this step to ensure that the workload manager accepts jobs submitted from login nodes, but that it does not attempt to execute jobs on these same nodes.
- Creating the login nodes themselves - Whereas the preceding steps focused on the login nodes as a whole, in this step each of the login nodes is created within Bright. Even if multiple login nodes need to be created, Bright makes this step extremely efficient and effective.
- Making the login nodes externally accessible - Unlike the compute nodes from which the software image was cloned, login nodes need to be externally accessible, in addition to being able to make use of the internal network. In this step, a network interface is configured to ensure external access.
- Enabling DNS services - With external access now properly configured, this step ensures that name-resolution services are provided by the enterprise or corporate DNS server, as opposed to the head node.
- Load balancing login nodes - To assist in delivering a responsive interactive use environment, a simple algorithm for load balancing users upon login can be implemented. In this step, a round-robin scheme based on DNS lookups can be employed to achieve this objective. Of course, more-sophisticated approaches can be introduced (at a later time) as necessary.
- Enabling access control for the login nodes - Physical and/or virtual appliances like firewalls, plus ssh for secure, encrypted communications between two untrusted hosts over an insecure network, might factor into the implementation of this step in the process.
- Hardening access control for the head node - Since users no longer require access to the head node, in this step, measures of increased security can be applied.
- Provisioning the login nodes - So that all of the previous steps in this process take effect, in this penultimate step, the login nodes are rebooted to ensure that they are imaged and configured as designed.
- Testing the login nodes - Once the login nodes have been provisioned, they should be tested to ensure that they are functioning in practice as designed. Briefly, and as a user, tests should cover the implementation of each of the design considerations.
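As one small illustration of the testing step, the sketch below checks whether the ssh port of each login node accepts TCP connections. It is only an assumption about what a first external-access test might look like, not a procedure taken from the Bright documentation: the function names are hypothetical, port 22 is assumed to be the ssh port, and the check confirms reachability only (it does not authenticate, and it says nothing about the workload manager or the software image).

```python
import socket
from typing import Dict, Iterable


def ssh_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_nodes(hosts: Iterable[str], port: int = 22,
                timeout: float = 3.0) -> Dict[str, bool]:
    """Map each host name to whether its port accepted a connection."""
    return {host: ssh_port_open(host, port, timeout) for host in hosts}
```

Run from a machine outside the cluster, a call such as `check_nodes(["login001.example.com", "login002.example.com"])` (hypothetical host names) would be expected to report `True` for every login node. The same check against the head node can help confirm that hardening has taken effect, since ordinary users' networks should no longer reach it.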
Bright Cluster Manager includes a Command-Line Interface (CLI) as well as a GUI. It is noteworthy that each of the above steps can be applied through use of either the CMSH CLI or CMGUI.
The intent of the above was only to provide an overview of the steps involved in the creation of login nodes for a cluster managed by Bright Cluster Manager. The details of these steps can be found in the Bright Knowledge Base article “How do I add a login node?”. This article on login nodes is the second most popular FAQ in the Bright Cluster Manager Knowledge Base. (To view the other nine FAQs in this top 10 list, go to the FAQ Home area of The Bright Knowledge Base and look for the “Most popular FAQs” column on the right.) The Bright Knowledge Base is a substantial and authoritative resource for assistance in managing clusters via the Bright Cluster Manager. With numerous contributions from Bright users as well as Bright Computing, this Knowledge Base complements resources like the extensive and comprehensive Bright Administrator Manual.
In this first part of a two-part series on managing the impact of interactive use, attention focused on the introduction of login nodes to a cluster managed by Bright Cluster Manager. By introducing login nodes using Bright, the head node of a cluster is completely relieved of the burden of delivering a user-facing service for interactive use.