How to Upgrade Your Hadoop Stack in 1 Step—With Zero Downtime


By Ian Lumb | October 30, 2014 | Bright Cluster Manager, Hadoop, Hadoop Cluster Management, Apache Hadoop, Apache Spark, Hadoop Analytics Stack, Software Maintenance




Pop quiz: How many steps does it take to upgrade your Hadoop distribution?

Choose one answer:

  1. 1 step
  2. 5 steps
  3. 10 steps
  4. More than 10 steps
  5. None of the above - Hadoop distributions cannot be upgraded!

If you chose 1 step, you must be using Bright Cluster Manager for Apache Hadoop, as all other approaches require multiple steps. Need the details? (You will at some point!) Your options are summarized below. 

Multi-Step Upgrades

The Apache Project details a 4-step rolling upgrade process:

  1. Prepare the rolling upgrade
  2. Upgrade active and standby NameNode services
  3. Upgrade DataNodes
  4. Finalize the rolling upgrade

In preparing for the upgrade (Step 1), a snapshot of the Hadoop Distributed File System (HDFS) metadata is made so that you can downgrade or roll back if required. Keep in mind that this snapshot covers HDFS metadata only (i.e., data about the data you’ve stored in HDFS, not the data itself). Because replication is built into Hadoop, redundant copies of your data are your failsafe against data loss during the upgrade.

Steps 2 and 3 of this process make it clear that upgrading Hadoop corresponds to upgrading HDFS services. In the case of Highly Available (HA) configurations, the standby NameNode (NN2) is upgraded to the latest software release first in Step 2; when re-instantiated, the standby assumes the active role and ingests HDFS metadata. The same upgrade process is then applied to the node that was active prior to the start of the upgrade (NN1). The upgrade process for DataNodes (Step 3) is less choreographed: subsets of nodes are upgraded simultaneously, and the process is repeated until all of the nodes have been upgraded to the latest release of the software.
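The batch-by-batch DataNode pass described above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of any Hadoop or Bright tooling: the host names and batch size are made up, and the IPC port 50020 is assumed to be the Hadoop 2.x default. The code only builds `hdfs dfsadmin -shutdownDatanode` command lines; it does not run them.

```python
from typing import Iterator, List


def batches(hosts: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive subsets of at most `size` DataNode hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def shutdown_commands(batch: List[str], ipc_port: int = 50020) -> List[List[str]]:
    """Build one 'shut down for upgrade' command per DataNode in the batch."""
    return [
        ["hdfs", "dfsadmin", "-shutdownDatanode", f"{host}:{ipc_port}", "upgrade"]
        for host in batch
    ]


# Hypothetical host names, used for illustration only.
datanodes = ["dn01", "dn02", "dn03", "dn04", "dn05"]

# One list of commands per batch; each batch would be upgraded and
# restarted before the next one is touched.
plan = [shutdown_commands(batch) for batch in batches(datanodes, size=2)]
```

Keeping each batch small is one way to leave enough replicas online while a subset of DataNodes is down; the right batch size depends on your replication settings and rack layout.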

Once complete, committing to the upgraded release of the software is achieved through the final step (Step 4).
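To make the choreography concrete, the four steps can be laid out as an ordered list of actions. The sketch below is an assumption-laden outline rather than a tested runbook: the NameNode IDs `nn1` and `nn2` are placeholders for whatever your HA configuration defines, entries starting with `#` stand for manual work, and nothing is executed.

```python
from typing import List


def rolling_upgrade_plan(active: str = "nn1", standby: str = "nn2") -> List[str]:
    """Return the ordered actions for an HA rolling upgrade as plain strings.

    Nothing is executed; entries starting with '#' are manual actions.
    """
    return [
        # Step 1: create a rollback image, then check that it is ready.
        "hdfs dfsadmin -rollingUpgrade prepare",
        "hdfs dfsadmin -rollingUpgrade query",
        # Step 2: upgrade the standby, fail over to it, upgrade the old active.
        f"# stop {standby} and install the new release on it",
        f"hdfs namenode -rollingUpgrade started   # run on {standby}",
        f"hdfs haadmin -failover {active} {standby}",
        f"# stop {active} and install the new release on it",
        f"hdfs namenode -rollingUpgrade started   # run on {active}",
        # Step 3: repeat for each subset of DataNodes.
        "# upgrade and restart the DataNodes, one subset at a time",
        # Step 4: commit to the new release.
        "hdfs dfsadmin -rollingUpgrade finalize",
    ]


for action in rolling_upgrade_plan():
    print(action)
```

Even in this compressed form, the four documented steps expand into nine distinct actions, several of which are manual.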

If your cluster has not been configured for HA, the upgrade process is more involved, and downtime is inevitable. If federated clusters have been configured, the process needs to be repeated for each namespace.

The upgrade process ignores the JournalNodes and ZooKeeperNodes in your Hadoop cluster. The Apache Project argues that JournalNodes are “relatively stable and [do] not require upgrade when upgrading HDFS in most of the cases.” If, however, JournalNodes and ZooKeeperNodes do require upgrade, downtime may be involved.

Multi-Step Upgrades and the Hadoop Application Stack

Other than mentioning the ZooKeeper coordination service in passing, the Apache Project places emphasis squarely on HDFS-related services in the Hadoop upgrade process. Of course, your platform for Big Data Analytics involves more than just HDFS. Your deployment likely relies upon YARN for managing workloads, as well as a stack of analytics applications, in addition to HDFS and ZooKeeper. When you factor in the rest of the stack, the number of steps more than doubles in the cases of CDH and HDP upgrades. In fact, when you factor in the CDH 5 components, you’re looking at about 20 additional upgrades.

Cloudera Manager automates aspects of the CDH upgrade, but still requires a number of steps - some of which need to be taken manually.

HDP relies on Apache Ambari for deploying, managing and monitoring clusters. Today, upgrading equates to redeploying HDP in its entirety. Note: The consequences of a multistep redeployment process need to be carefully considered before any action is taken. Automated stack upgrades are a planned enhancement for a future release of Ambari.

1-Step Upgrades for the Entire Hadoop Stack

A Bright Cluster Manager role is a task that can be performed by a node in your cluster. Take for example a Bright-managed Hadoop cluster configured with NameNode HA (and Automatic Failover): Nodes are assigned DataNode, NameNode, JournalNode and ZooKeeperNode roles - see the screenshot below. Bright roles make relationships explicit, so dependencies between:

  • Different instances of the same service are captured. This is crucial in defining services as being highly available through redundancy. For example, multiple instances of the NameNode, JournalNode and ZooKeeperNode roles, on different nodes, ensure that these HDFS services are made redundant.
  • Different HDFS services are captured. Even as the application stack fills in, object-based Bright roles maintain clarity on the relationship hierarchy for the multiplicity of Hadoop services. For example, YARN’s prerequisite for functioning HDFS and HDFS services is made known.

[Screenshot: Hadoop upgrade]
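One way to picture how explicit roles capture both kinds of relationship is a small dependency graph. The Python sketch below is illustrative only and is not Bright Cluster Manager's data model or API: the role names follow the text above, while the node names and the exact dependency edges are assumptions. It uses the standard library's `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical mapping of each role to the roles it depends on.
depends_on = {
    "ZooKeeperNode": set(),
    "JournalNode": set(),
    "NameNode": {"JournalNode", "ZooKeeperNode"},
    "DataNode": {"NameNode"},
    "YARN": {"NameNode", "DataNode"},
}

# Hypothetical assignment of roles to nodes.
assignments = {
    "NameNode": ["node001", "node002"],
    "JournalNode": ["node001", "node002", "node003"],
    "ZooKeeperNode": ["node001", "node002", "node003"],
    "DataNode": ["node004", "node005", "node006"],
}

# Dependencies between different services: prerequisites come first.
order = list(TopologicalSorter(depends_on).static_order())

# Dependencies between instances of the same service: a role is
# redundant when it is assigned to more than one distinct node.
redundant = {role for role, nodes in assignments.items() if len(set(nodes)) > 1}

print(order)
print(sorted(redundant))
```

With the relationships written down like this, an ordering for any operation that must respect prerequisites falls out of the graph instead of being maintained by hand.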

Because Bright Cluster Manager roles allow Hadoop services to be defined, assigned and composed, the Apache Project’s upgrade procedure can be collapsed into a single script. Bright’s 1-step upgrade script also incorporates the following enhancements:

  • Automated deployment of updated software that ensures configured instances of Hadoop are updated. (Bright allows you to manage multiple instances of Hadoop from a single Bright GUI. You can choose to manage different versions of the same distribution, or completely different distributions of Apache Hadoop.)
  • DataNodes are upgraded simultaneously. Because provisioning in parallel has been a Bright core competence for many versions now, this is a highly mature capability. In fact, in very large deployments (i.e., thousands of DataNodes), you may want to take advantage of Bright’s distributed provisioning capability to significantly reduce the time required for the upgrade.
  • JournalNodes are upgraded without downtime.
  • Automated testing of the upgrade prior to commitment. Bright enables the upgraded NameNode(s) and DataNodes so that various tests can be conducted before a commitment to the upgraded deployment is made. Upon successful completion of the tests, the upgrade of HDFS services can be finalized.
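The test-before-commit idea in the last bullet can be sketched as a simple gate. This is a hypothetical outline, not Bright's upgrade script: the check names and the stand-in lambdas are invented for illustration, and a real deployment would plug in actual health checks.

```python
from typing import Callable, Dict, List, Tuple


def decide(checks: Dict[str, Callable[[], bool]]) -> Tuple[str, List[str]]:
    """Run every check and return ('finalize' or 'rollback', failed check names).

    All checks are run, even after a failure, so that every problem is reported.
    """
    failed = [name for name, check in checks.items() if not check()]
    action = "finalize" if not failed else "rollback"
    return action, failed


# Hypothetical checks; each lambda stands in for a real test.
checks = {
    "namenodes_report_new_version": lambda: True,
    "datanodes_all_registered": lambda: True,
    "sample_read_write_succeeds": lambda: True,
}

action, failed = decide(checks)
print(action, failed)
```

The upgrade is finalized only when the list of failed checks is empty; any failure points toward reverting to the pre-upgrade state.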

Bright Cluster Manager for Apache Hadoop has been validated for various rolling-upgrade scenarios - see the table below for the details. Particularly noteworthy is the cascading upgrade of CDH: In two steps with Bright you can upgrade from CDH 5.0.4 to CDH 5.1.3, and then immediately from CDH 5.1.3 to CDH 5.2.0. Our upgrade process will even allow you to revert to the pre-upgrade state, should you need to.

Distribution                           Pre-Upgrade Version   Intermediate Version   Post-Upgrade Version
Apache Hadoop                          2.4.1                 -                      2.5.1
CDH (based upon Apache Hadoop 2.3.0)   5.0.4                 5.1.3                  5.2.0
HDP (based upon Apache Hadoop 2.4.0)
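The cascading CDH upgrade amounts to chaining validated single-version hops. A minimal Python sketch follows, assuming only the two CDH hops listed in the table; the function name and the dictionary are illustrative and not part of any product.

```python
from typing import Dict, List

# Validated single-hop CDH upgrades, taken from the table above.
validated_hops: Dict[str, str] = {"5.0.4": "5.1.3", "5.1.3": "5.2.0"}


def upgrade_path(start: str, target: str, hops: Dict[str, str]) -> List[str]:
    """Follow validated hops from `start` until `target` is reached."""
    path = [start]
    while path[-1] != target:
        nxt = hops.get(path[-1])
        if nxt is None or nxt in path:
            raise ValueError(f"no validated path from {start} to {target}")
        path.append(nxt)
    return path


path = upgrade_path("5.0.4", "5.2.0", validated_hops)
print(path, "steps:", len(path) - 1)
```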

So much for HDFS and its services. What about the rest of the stack, i.e., components like YARN and the analytics applications? Bright Cluster Manager for Apache Hadoop maintains YARN, Apache Spark and other components. Maintains is the operative word here. Bright Computing regularly makes updates to HDFS, YARN, Spark and other components available via YUM updates to our product. Perhaps more importantly, Bright Computing ensures compatibility of components on an ongoing basis. Because maintaining the stack of Hadoop software is what we do, your distro-upgrade process is greatly simplified. Translation: No extra steps are required to maintain your Hadoop stack.

Bright Computing customers won’t find any of this surprising, as we’ve earned a solid reputation for easing the burden of management outside the Big Data Analytics arena - most notably, in High Performance Computing (HPC) and Cloud computing. Because we have over a decade’s worth of experience in managing complex IT infrastructures, Bright Cluster Manager is a mature and robust solution that saves you time and effort.

Interested in learning more? Please join us next week for a webinar on this topic. 

Do you have an upgrade scenario in mind? If so, please get in touch with us. We have the product, people and process to rapidly execute pain-free Hadoop upgrades without downtime.