Improving a Heterogeneous Cluster Environment: A Case Study


The University of Maryland, Baltimore County (UMBC) was able to simplify and shorten the deployment process for its High Performance Computing Facility (HPCF), which includes three different generations of hardware and a mix of traditional CPU nodes, nodes with GPUs, and nodes with Intel Phi Series co-processors.

More than 80 projects are using or have used the HPCF for research involving numerical simulations, statistical comparison, computational models, and atmospheric remote sensing. Research includes investigations of HPC itself, with one investigator running sample scripts and custom code to determine the most effective ways to use cluster computers. Other examples include numerical simulation of calcium waves in human heart cells and even a visualization of early Washington, DC.

The multiple generations of hardware led to different InfiniBand cards, two different InfiniBand speeds, and two different InfiniBand switches within the cluster. UMBC cluster administrators were concerned that the heterogeneous hardware environment was complicating use of the cluster. They wanted to be able to work with researchers on their projects on the cluster rather than spend all of their time performing baseline maintenance.

After implementing Bright Cluster Manager in their latest HPC clusters, administrators report that cluster management is easier and that installing and updating nodes is greatly simplified.

From this study, you can learn:

  • How Bright Cluster Manager (BCM) reduces time-consuming baseline maintenance and time spent recompiling libraries.
  • How cluster troubleshooting support means admin staff can concentrate on integrating hardware, working with researchers to set scheduling, and tailoring jobs to research needs.
  • How Bright’s test cluster helped UMBC avoid errors and fix problems more quickly.

