Bright Cluster Manager leverages the latest NVIDIA Tesla™ V100 GPUs based on the new "Volta" architecture to offer administrators and owners of GPU clusters maximum insight and control.
Bright Cluster Manager can sample and monitor metrics from supported GPU cards and GPU Computing Systems, such as the NVIDIA Tesla V100 and Tesla P100. Examples of supported metrics include GPU temperatures, GPU exclusivity modes, GPU fan speeds, system fan speeds, PSU voltages and currents, system LED states, and GPU ECC memory statistics.
The frequency of metric sampling is fully configurable, as is the consolidation of metrics data over time. Metrics data is stored in Bright Cluster Manager's central SQL database and can be visualized in value/time graphs, as well as in Bright Cluster Manager's unique Rackview. Beginning with version 8.0, Bright Cluster Manager leverages NVIDIA’s Data Center GPU Manager (DCGM) for GPU health monitoring, diagnostics, and validation.
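To illustrate what consolidating metrics data over time can mean in practice, here is a minimal Python sketch. It is not Bright Cluster Manager's implementation; the function name `consolidate`, the sample values, and the 60-second interval are all hypothetical and chosen only for illustration. The idea is that raw samples are grouped into fixed-width time buckets, and each bucket is reduced to a minimum, mean, and maximum.

```python
from collections import defaultdict
from statistics import mean


def consolidate(samples, interval_s):
    """Group (timestamp, value) samples into fixed-width time buckets.

    Hypothetical helper, not part of any Bright Cluster Manager API.
    Returns a list of (bucket_start, min, mean, max) tuples sorted by time.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align each sample to the start of its bucket.
        bucket_start = (timestamp // interval_s) * interval_s
        buckets[bucket_start].append(value)
    return [
        (start, min(values), mean(values), max(values))
        for start, values in sorted(buckets.items())
    ]


# Made-up GPU temperature samples: (seconds since start, degrees Celsius),
# taken every 30 seconds and consolidated into 60-second buckets.
raw_samples = [(0, 60), (30, 62), (60, 70), (90, 74), (120, 66)]
consolidated = consolidate(raw_samples, 60)
```

A longer consolidation interval reduces storage at the cost of detail, which is why keeping the minimum and maximum alongside the mean can help preserve short spikes.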
In 2019, NVIDIA announced its acquisition of Mellanox, which was completed in April 2020, uniting two of the world’s leading companies in high performance computing (HPC). Together, NVIDIA’s computing platform and Mellanox’s interconnects power over 250 of the world’s TOP500 supercomputers, and the two companies count every major cloud service provider and computer maker among their customers.
Mellanox Technologies InfiniBand switches are fully integrated with Bright OpenStack, giving HPC customers the ability to create virtual machines (VMs) inside an OpenStack deployment with InfiniBand (IB) devices attached to them. The Mellanox InfiniBand integration is ideal for remote direct memory access (RDMA) use cases for VMs. The native IB connectivity provides a single IB networking fabric that spans both the physical and virtual environments. With this integration, VMs get direct access to the IB fabric, offering key advantages such as low latency and high overall performance.
"There are now more than a 1000 NVIDIA GPU-based clusters around the world. Bright Computing's cluster management software fills a critical need for datacenter managers to reliably monitor and manage the status of their GPU-enabled clusters."
-Andy Keane, General Manager of the Tesla Business at NVIDIA
NVIDIA is the pioneer of GPU-accelerated computing. The company specializes in products and platforms for the large, growing markets of gaming, professional visualization, data center, deep learning, and automotive.
For more information, visit nvidia.com