0-Day Support for the NVIDIA® Tesla® K80 Dual-GPU Accelerator


By Ian Lumb | November 17, 2014 | GPU, HPC, HPC Cluster, Bright Cluster Manager, GPGPU management, K80, CUDA 6.5




The NVIDIA® Tesla® K80 Dual-GPU Accelerator was announced at 6 am PST today at SC14 in New Orleans. Bright Computing announced support for the K80 an hour later. How did we deliver 0-day support for the K80? Easy: People. Product. Process. We’ll briefly cover The 3Ps here, so you’ll understand why your K80s will be ready for production use upon arrival. 


First, it’s about people. Bright Computing and NVIDIA maintain business and technical relationships. Take the K80 for example. Before NVIDIA’s announcement hit the wires today, Bright Computing had a K80 in the hands of its engineers. Our engineers put the K80 through a slew of integration tests, and declared support in time for SC14.

Of course, new accelerators need supporting software before they can be used for GPU-based HPC. And that’s where NVIDIA’s early access program for the CUDA toolkit comes in. Because Bright Computing gets involved with release candidates for the toolkit, we’re able to deliver support within days of the production release. Current software is critical from a hardware-support perspective, as accelerators like the K80 typically require the latest CUDA tooling.


Second, 0-day support for the K80 is about product. In Bright Computing’s case, product equates to software. Bright Cluster Manager supports CUDA 6.5, ergo we support the K80. Using version 340.32 of the CUDA driver, Bright exposes detailed status information for the K80 through our cluster-management GUI, as shown in the screenshot below. Bright’s ability to monitor the K80, and set its clock speed (via direct access using the NVIDIA GPU Boost™ technology), is illustrated in screenshots that accompany our news release.
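For readers who want a feel for the kind of status information involved, here is a minimal sketch that reads a few per-GPU values directly from the `nvidia-smi` command-line tool that ships with the NVIDIA driver. It is not how Bright Cluster Manager collects its metrics; the function names `parse_gpu_status` and `query_gpu_status` and the choice of fields are our own assumptions for illustration.

```python
import csv
import io
import subprocess

# Properties requested via nvidia-smi's --query-gpu option.
# The selection here is illustrative, not what Bright itself monitors.
FIELDS = ["index", "name", "temperature.gpu", "clocks.sm", "clocks.mem"]


def parse_gpu_status(csv_text):
    """Turn nvidia-smi CSV output (noheader, nounits) into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(record) != len(FIELDS):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(FIELDS, (value.strip() for value in record))))
    return rows


def query_gpu_status():
    """Run nvidia-smi on the local node (hypothetical helper).

    Requires the NVIDIA driver to be installed; raises an exception if
    nvidia-smi is missing or exits with an error.
    """
    command = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    output = subprocess.check_output(command, universal_newlines=True)
    return parse_gpu_status(output)
```

On a node with the driver installed, calling `query_gpu_status()` returns one dictionary per GPU. Because the K80 is a dual-GPU board, each card should appear as two entries.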



Finally, 0-day support for the K80 is about process. Bright Cluster Manager has a well-established process for propagating software updates. We make use of YUM, and provide you with updates on a regular basis. At the end of September, we released update 7.0-20, our 20th update to Bright 7.0. In this update, we included support for CUDA 6.5.

Because Bright Cluster Manager stays current with CUDA, our engineers were able to commence integration testing upon receipt of the K80 from NVIDIA. By keeping current, our engineering effort is optimized. More importantly, this means you can place the K80 into your production HPC environment, and obtain results immediately.

Day 0 and Counting

Our news release ends with the following statement:

“The Tesla K80 dual-GPU is the new flagship offering of the Tesla Accelerated Computing Platform, the leading platform for discovery and insight at scale, providing hardware, software and an extensive supported ecosystem for GPU-accelerated applications in the data center. It delivers nearly two times higher performance and double the memory bandwidth of its predecessor, and 10 times higher performance than today’s fastest CPU on hundreds of applications.”

Enabled by Bright Cluster Manager and CUDA 6.5, the K80 delivers a state-of-the-art platform for GPU-based HPC. As of Day 0, we’re ready to demo and discuss Bright-managed K80s, starting this week in Booth 2615 at SC14 in New Orleans. We’ll follow up in two weeks with a Bright Topics webinar that reviews our SC14 product releases and previews. Then, early in the New Year, we’ll be letting developers know why managed GPUs matter via NVIDIA’s GTC Express.

We look forward to hearing of your experiences with the K80, and to the results you will obtain.
