Bright Cluster Manager for Apache Hadoop: The First 6 Months of a Radiant Relationship


By Ian Lumb | September 26, 2014 | Linux Cluster Management, Bright Cluster Manager, Big Data, Hadoop, Hadoop Cluster Management, High Performance Data Analysis, Bare Metal Provisioning, Apache Hadoop, Monitoring, Apache Spark




Last week we released support for Apache Hadoop 2.5.0. I thought it would be a good time to revisit all the updates we’ve made over the last 6 months - in case you missed them.

Latest Updates

Our latest updates include:

  • Apache Hadoop 2.5.1  Typo? Not exactly. We introduced support for Apache Hadoop 2.5.0 last week. Since Version 2.5.1 (released on September 13, 2014) is a minor release that builds upon the stable 2.4.1 release, you can now expect support via YUM updates to Bright. Note that vanilla Apache Hadoop includes HDFS (the Hadoop Distributed File System), ZooKeeper (the coordination service), and YARN (the workload manager).
  • Cloudera CDH 5.1.0  We introduced support for CDH 5.1.0 in mid-July.
  • Hortonworks HDP 2.1  We introduced support for Hortonworks Data Platform (HDP) 2.1 in mid-May.

Bottom line: Bright Cluster Manager maintains support for all major distributions of Apache Hadoop. Because we’re obsessed with keeping you current, you don’t have to be. We free you up to focus on the analysis of Big Data - and wasn’t that why you got interested in Hadoop in the first place?

Two more points on distros before we move on to other updates:

  • Bright Cluster Manager doesn’t require you to choose between distros. You can even stand up multiple distros in parallel. Our customers use this cool capability to run their proof-of-concept projects and to perform major migrations between distros.
  • Apache Hadoop evolves rapidly, but enterprise customers like to manage change cautiously. The challenge is to derive the benefits of innovation and mitigate the risks of change. We have a lot to say about rolling updates, but we’ll save that for another post …


Distros supply the essentials. To wrangle problems in Big Data Analysis, however, we know you need more than just HDFS, YARN and ZooKeeper. So, we’ve made a number of updates over the past 6 months:

  • Apache HBase  “Apache HBase is the Hadoop database,” asserts the project’s Web site. It's “... a distributed, scalable, big data store ...” that has been in our offering since the outset, and has only been updated once in a minor way.
  • Apache Pig  We introduced a deployment capability for Apache Pig in June. Pig is a high-level scripting language akin to SQL that allows you to craft processing instructions for Big Data Analytics workloads (and workflows) via MapReduce and YARN.
  • Apache Hive  We introduced a deployment capability for Apache Hive via YUM updates to Bright. Hive is data-warehouse software. Hive makes use of HDFS to store its data, and supports queries via the SQL-like HiveQL language.
  • Apache Accumulo  We introduced a deployment capability for Apache Accumulo via YUM updates to Bright. Accumulo is a highly scalable structured store based on Google's BigTable - “a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.” Accumulo stores data in HDFS using a model that is advanced beyond key-value stores. It is an alternative to Apache HBase.
  • Apache Spark  We introduced a deployment capability for Apache Spark via YUM updates to Bright. Apache Spark aspires to be a fast, general-purpose engine for Big Data Analysis. With the recent release of Version 1.1, Spark’s progress is evident. Spark has ignited our interest and yours; we have more to say, but that’ll have to wait for another time …
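
If you're new to the MapReduce model that Pig scripts run on, here's a minimal sketch of its map, shuffle, and reduce steps in plain Python. It's an illustration only: nothing here touches Hadoop or Bright, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are our own inventions for this example, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped counts for each word, as a reducer would."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three hypothetical phases into a single word count."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data big clusters", "big elephants"]
    print(word_count(sample))
```

On a real cluster the framework distributes these phases across many nodes, with the input and output typically living in HDFS; tools like Pig let you express the same kind of job without writing the phases by hand.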

Choose a distro. Add one or more extras. Voilà! Your platform for Big Data Analytics awaits you! Because Bright Cluster Manager for Apache Hadoop bundles distros as well as extras, your complete platform is available in minutes. You don’t need to visit individual Web sites for each of the extras that interest you. And, of course, we’ll keep you current. For example, this means you’ll have access to the latest enhancements in Apache Spark soon after they become available.

Bright Cluster Manager is change-friendly. You can always add an extra, or even try an extra with a different distro. Although it’s pretty complete, our list of extras will continue to grow. In the interim, you can manually add any extra you feel is missing. And if you do, please tell us about it!

There’s More!

Managed software is about more than just pre-built packages. Over the past 6 months we’ve also made these updates:

  • Fixes and Improvements  About seven improvements and four fixes have been applied over the past 6 months. Detailed in our release notes, these enhancements are specific to Bright Cluster Manager for Apache Hadoop.
  • Deployment Manual  We update the deployment manual with new information continually. This manual introduces Hadoop and various extras, instructs you on installation, and shows you how to directly manage HDFS and other Hadoop services via the Bright Cluster Manager CLI or GUI. We even show you how to run Big Data workloads. Our deployment manual complements existing manuals that focus on installation, administration and use of Bright, as well as another new guide that targets developers directly.
  • Technical Support  We take considerable pride in delivering exceptional technical support. And according to our customers, we deliver. Our technical support team works closely with our engineering team, so support has kept pace with the past 6 months of updates.

The Elephant is in the Room

Given all the updates it has received, Bright Cluster Manager for Apache Hadoop is anything but a Version 1.0 product release.

It is built on a proven, enterprise-grade management solution that provides provisioning, monitoring and management capabilities that are unique in their ability to transform bare-metal hardware into a platform for Big Data Analytics.

Bright Cluster Manager provides a platform for Big Data Analytics as well as HPC. Owing to Bright Computing’s industry-leading presence in HPC, established over more than a decade, its solutions are uniquely positioned for the ongoing convergence of HPC and Big Data Analytics. Once again, this means that you do not need to choose between HPC and Big Data Analytics. In fact, with Bright, you can already extract value from the convergence of these domains.

From the distros to the extras, to value-added services, Bright Cluster Manager for Apache Hadoop has progressed with agility uncommon to members of the family Elephantidae. For your next Big Data project, why not take Bright for a ride?