In case you missed it, we’d like to draw your attention to a webinar that we hosted last week, giving an insight into Bright Cluster Manager 8.2. Catch the replay here.
Robert Stober, Bright’s Director of Product Management, delivered the presentation, and it was certainly an action-packed session. From the first moment, questions from the delegates were flooding in; Robert was clearly sharing content that was very interesting to the audience.
In the webinar, Robert gave an overview of the latest features and functionality in Bright Cluster Manager 8.2. He highlighted specific platform enhancements, including how to provision and manage compute for the intelligent edge utilizing Bright Edge, as well as integration for Spark and Kubernetes, the latest additions to Bright OpenStack, and more.
Our new Bright Edge feature was of particular interest to the audience, sparking a flurry of questions from the delegates. Bright Edge allows you to deploy and manage compute resources in distributed locations as a single clustered infrastructure, from a single interface. It is an attractive proposition for Bright customers because it reduces the cost of managing distributed edge nodes and clusters, promotes standardization across locations, and works in low-bandwidth locations.
We were delighted that so many of our customers and partners joined us for the webinar. If you missed it, or want to share it with your colleagues, you can catch the replay here.
We also thought it would be useful to share some of the Q&A discussion, to give you a flavor of the conversations that were initiated as a result of Robert’s slides.
If you have any questions about the latest features and functionality in Bright Cluster Manager 8.2, please do not hesitate to get in touch!
Q - Would the Edge Director just be the local head node on the remote cluster?
A - The Edge Director provides some of the functionality for the edge nodes that the head node provides in a traditional Bright cluster. It is the provisioning server for the edge nodes in the location, and it exports the /cm/shared directory to the edge nodes in the location.
Q - Could the entire cluster be managed from the Edge Directors as well?
A - Edge Directors can provision and manage nodes locally, but they cannot manage the broader cluster like the head node of a cluster does.
Q - Can the Edge Director run jobs at remote locations?
A - Yes, the Edge Director can, and we expect often will, perform tasks such as running workloads.
Q - Will /cm/shared be replicated? What happens when the connection to the management node breaks?
A - /cm/shared will not be updated if the connection to the Edge Director is down. When the connection is restored, replication will continue.
Q - Is it possible to install Director and Edge on a virtual machine?
A - Yes
Q - Let’s say you have a standard central cluster and, when setting up the head node, you selected a Type 1 network. How do you move to Type 3 to support edge nodes at a remote location?
A - The cm-edge-setup command performs all the necessary setup.
Q - What is the difference between designing using edge nodes over VPN and remote provisioning nodes over VPN? Why would I choose one over the other?
A - They are very similar. As you pointed out, you could "create" a similar environment just using a VPN. However, Bright Edge makes it easier. PXE booting the edge nodes could be an issue over VPN. We cache metrics data in an Edge configuration, so the data gets automatically sent to the head node when the connection is re-established. Provisioning is done from the Edge Director, so images are not sent across the VPN for every node. We also have tools to create the Edge software image. This image can be installed remotely. It is designed to automatically connect to the head node once up and running.
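The metric caching described in this answer follows a general store-and-forward pattern: samples are queued locally while the link is down and sent in order once it returns. The Python sketch below illustrates that pattern only. It is not Bright's implementation, and the class name EdgeMetricBuffer, the send callback, and the sample metric are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Sample = Tuple[float, str, float]  # (timestamp, metric name, value)


class EdgeMetricBuffer:
    """Hypothetical store-and-forward buffer; not Bright's actual code."""

    def __init__(self, send: Callable[[Sample], None], max_samples: int = 10000):
        self._send = send
        # Bounded queue: once full, the oldest samples are dropped first.
        self._pending: Deque[Sample] = deque(maxlen=max_samples)
        self.connected = True

    def record(self, sample: Sample) -> None:
        """Queue the sample, then forward the backlog if the link is up."""
        self._pending.append(sample)
        if self.connected:
            self.flush()

    def flush(self) -> int:
        """Forward pending samples in order; return how many were sent."""
        sent = 0
        while self._pending:
            self._send(self._pending[0])
            self._pending.popleft()  # dequeue only after send() returned
            sent += 1
        return sent

    def reconnect(self) -> int:
        """Mark the link as restored and forward the backlog."""
        self.connected = True
        return self.flush()


# Usage: simulate a dropped link, record two samples, then reconnect.
received = []
buf = EdgeMetricBuffer(send=received.append)
buf.connected = False
buf.record((1.0, "load", 0.5))
buf.record((2.0, "load", 0.7))
queued_while_down = list(received)   # still empty at this point
flushed = buf.reconnect()            # both samples are forwarded now
```

Because a sample is only dequeued after the send callback returns, a send that raises leaves the sample in the queue for the next attempt.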
Q - Let’s say you have a standard central cluster and, when setting up the head node (installing it), you selected a Type 1 network. How do you move to Type 3 to support edge nodes at a remote location? Or, put another way, which of the three network types (1, 2, or 3) do you need to have selected to support network communication from the main head node to the edge nodes?
A - It doesn't matter which network type you initially selected. The cm-edge-setup command will make the necessary changes.
Q - Can you also select jobs that did not complete? If a user kills a job before it finishes, does that user avoid being billed?
A - If the job starts and therefore generates cost, that cost will be attributed to the user who submitted the job.
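As a rough illustration of this rule, the Python sketch below totals cost per submitting user for every job that actually started, whether it completed or was cancelled partway through. The record fields, the job states, and the flat per-node-hour rate are invented for the example and do not reflect Bright's accounting schema or cost model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Job:
    user: str
    state: str               # e.g. "COMPLETED" or "CANCELLED" (hypothetical)
    start: Optional[float]   # start time in hours; None if the job never ran
    end: Optional[float]     # end time in hours; None if the job never ran
    nodes: int


def cost_per_user(jobs: Iterable[Job], rate_per_node_hour: float) -> Dict[str, float]:
    """Attribute the cost of every job that ran to the user who submitted it."""
    totals: Dict[str, float] = defaultdict(float)
    for job in jobs:
        if job.start is None or job.end is None:
            continue  # the job never started, so it generated no cost
        totals[job.user] += (job.end - job.start) * job.nodes * rate_per_node_hour
    return dict(totals)


jobs = [
    Job("alice", "COMPLETED", 0.0, 2.0, 4),
    Job("bob", "CANCELLED", 1.0, 1.5, 2),       # killed after 30 minutes
    Job("carol", "CANCELLED", None, None, 8),   # cancelled while still queued
]
totals = cost_per_user(jobs, rate_per_node_hour=1.0)
```

In this sketch the job's final state does not affect the calculation; only whether it ran, and for how long, matters.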
Q - A question on licensing: there seems to be a requirement to have selected Workload Accounting when ordering Bright.
A - Workload Accounting and Reporting is currently included in (the cost of) Bright Cluster Manager. All current licenses should have it selected.
Q - Can you easily export job metrics to servers that have similar reporting and querying software?
A - The Bright CMDaemon is a Prometheus data producer, so you can connect Grafana to it and view the metrics, or you can export the results of PromQL queries to CSV files.
Q - How would you identify them? Is there alerting, or do I have to parse the logs? (regarding the GPU example)
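As one possible approach to the CSV export, the Python sketch below flattens a Prometheus instant-query response (the standard JSON returned by the Prometheus HTTP API /api/v1/query endpoint) into CSV text. The sample payload, the metric name job_gpu_utilization, and its labels are made up for illustration. The URL and credentials needed to query CMDaemon are not covered here and would need to come from the Bright documentation.

```python
import csv
import io
import json


def instant_vector_to_csv(response_text: str) -> str:
    """Convert a Prometheus instant-vector JSON response to CSV text."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector, got %s" % data["resultType"])

    series = data["result"]
    # Union of all label names, sorted so the column order is stable.
    labels = sorted({name for s in series for name in s["metric"]})

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(labels + ["timestamp", "value"])
    for s in series:
        timestamp, value = s["value"]
        row = [s["metric"].get(name, "") for name in labels]
        writer.writerow(row + [timestamp, value])
    return out.getvalue()


# Hypothetical response body; a real one would come from an HTTP query.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "job_gpu_utilization", "user": "alice"},
             "value": [1546300800, "0.82"]},
            {"metric": {"__name__": "job_gpu_utilization", "user": "bob"},
             "value": [1546300800, "0.05"]},
        ],
    },
})
csv_text = instant_vector_to_csv(sample)
```

Range queries return a "matrix" result type with a list of values per series, so they would need one extra loop; the sketch rejects them rather than guessing at a layout.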
A - Bright Workload Accounting and Reporting provides a built-in report that can identify wasted resources. You do not have to parse the workload manager log files.
Q - Are there minimum or maximum cluster size limitations?
A - The minimum license is 4 nodes. We are comfortable supporting clusters of 30,000 nodes.
Q - I have two separate HPC clusters that are located in the same data center. Currently, one cluster is managed by Bright, but the other one is not. I would like them both managed by Bright. Would it make more sense to set up the non-Bright-managed cluster as a new Edge cluster of the Bright-managed cluster, or to fold this cluster into the Bright-managed cluster by adding the additional nodes, creating new queues, etc. (and NOT use an Edge cluster)? What are the pros and cons of both approaches? Again, both clusters are currently located in the same data center.
A - You could use either, but if the clusters are in the same data center, you would not need or use the advantages of Bright Edge, such as the remote install ISO for the Edge Director.
Q - When did the node installer change from RHEL?
A - Yes, those are two different activities. The node installer is a lightweight Linux OS that is not RHEL. Its purpose is to identify the node, partition its hard disk, copy the image (which can be RHEL) to the hard disk then pivot to the copied image (i.e., run init).
Q - Is it possible to upgrade a Bright cluster from 8.1 to 8.2 while users are online using the cluster, or would user downtime be required? If it is possible, how would one accomplish this? How easy would that be?
A - The upgrade procedure requires that you shut down the compute nodes. However, it is possible to perform the upgrade by shutting down the CMDaemons on all the compute nodes and using the upgrade script option to not upgrade the workload management system. If the workload management system master is running on a node external to the Bright cluster, then you would only need to shut down all the CMDaemons.
Q - Does BCM support NVIDIA DGX servers with Ubuntu? Can BCM manage Red Hat-based CPU systems and Ubuntu-based GPU systems at the same time?
A - Yes to both.
Q - Could you explain the "integration/support" of NVIDIA DGX systems?
A - Bright and NVIDIA support our mutual customers running Bright Cluster Manager on the DGX. The DGX nodes can be head nodes, compute nodes or both, and they can be combined with other non-DGX servers in the same cluster.