Does your Hadoop cluster need a cluster manager? It just might. Let’s see why.
Every day we hear about more organizations implementing Hadoop as part of their IT infrastructure. Most people think of a Hadoop installation as something that starts on top of an existing, working cluster. But how did that cluster get there? And once everything is up and running, somebody has to keep it that way in order to keep getting value out of it.
With a small cluster, that’s not such a big deal. As clusters get bigger, though, the work involved in keeping them healthy grows…a lot.
Will those scripts you crafted so carefully scale as the cluster grows to hundreds, or even thousands of nodes? Will the next sysadmin that comes in — after you get promoted for your great work — be able to do things just the way you planned? Every time?
Maybe. Maybe not. So what’s a responsible sysadmin to do? Use a cluster manager!
An enterprise-grade cluster manager will make every aspect of your interaction with your new Hadoop cluster better.
While it’s not too hard to build a cluster and install Hadoop on it in the lab, things can get tricky when you take your lab project into production. You may not always be able to count on the sysadmin responsible for it knowing every detail intimately. In trying to solve the various issues that inevitably come up during a deployment, the sysadmin on site may make changes to the scripts that, well, to put it bluntly, break things.
And even if that doesn’t happen, there’s still the matter of time. Manually installing and configuring all the parameters necessary to build a functioning cluster — and then installing the Hadoop software on top of that — can take time. A lot of time. The bigger the cluster, the more time it will take. Possibly more time than the data scientists waiting for their new toy want to wait.
Cluster managers automate the installation and configuration process, not only of the Hadoop software, but also of the operating system software, networking software, hardware parameters, disk formatting, and dozens of other little things that need to be attended to.
Keeping an eye on things
But maybe you’ve got a crack installation and deployment team, and have every confidence they can set up and configure clusters of any size anywhere you need them, and do it in a hurry. Fair enough. There still may be some real benefit to using an enterprise-grade cluster manager. After all, just because the cluster is up and running doesn’t mean it’s going to stay that way. How shall I put this? Stuff happens.
Drives fail. Memory acts flakey. Servers overheat. Configurations get altered — somehow — even though you have strict protocols in place to prevent that sort of thing.
When you were testing and evaluating your Hadoop cluster in the lab, you had full control over it. And there wasn’t a team of data scientists counting on it being there whenever they needed it. Now that you’re in production mode, you no longer have the luxury of waiting for things to fail, and bringing nodes back up when you have time. No, now you have to keep things humming 24x7x365.
To make sure that happens, you’ll want to treat your Hadoop cluster like any other critical service in the data center and manage it with professional-grade tools.
What can a real cluster manager tell you that other tools can’t? How about telling you the temperature range of all the servers in the cluster over time? Or warnings of instability cropping up in one of the cluster’s hard drives? That’s the kind of information a cluster manager can provide because of its intimate knowledge of the hardware and software at the heart of things. It’s the kind of data you can use to keep systems running longer with fewer service interruptions to your users.
“That’s all very nice,” I hear you saying, “but what about my data? How can a cluster manager help me keep an eye on the file system?”
While a properly configured Hadoop cluster does a great job of protecting data through replication across multiple nodes, the system can suffer a performance hit when that protection gets called into play. So it’s a good idea to keep an eye on the file system itself to make sure it’s doing what it should be doing. Is it spreading the load evenly amongst available resources? Is the replication factor high enough? Are some of the drives reaching capacity? Sure would be nice to have an easy way to see that at a glance. Not surprisingly, your enterprise-grade cluster manager can help you out here too.
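To make those three questions a little more concrete, here is a minimal Python sketch of the kind of check a monitoring tool might run behind the scenes. Everything in it is an assumption made for illustration: the NodeUsage record, the sample numbers, and the thresholds are all hypothetical, and a real cluster manager would collect the figures from the cluster itself rather than from a hard-coded list.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeUsage:
    """Hypothetical per-node disk figures, in gigabytes."""
    name: str
    capacity_gb: float
    used_gb: float

    @property
    def used_fraction(self) -> float:
        return self.used_gb / self.capacity_gb


def check_file_system(nodes, replication_factor,
                      min_replication=3,
                      capacity_limit=0.85,
                      imbalance_limit=0.15):
    """Return a list of warning strings for the three questions above.

    All thresholds are illustrative defaults, not recommendations.
    """
    warnings = []

    # Is the replication factor high enough?
    if replication_factor < min_replication:
        warnings.append(
            f"replication factor {replication_factor} is below {min_replication}")

    average = mean(node.used_fraction for node in nodes)
    for node in nodes:
        # Are some of the drives reaching capacity?
        if node.used_fraction >= capacity_limit:
            warnings.append(f"{node.name} is {node.used_fraction:.0%} full")
        # Is the load spread evenly? Compare each node with the average.
        if abs(node.used_fraction - average) > imbalance_limit:
            warnings.append(
                f"{node.name} is out of balance with the cluster average ({average:.0%})")

    return warnings


# Made-up sample data: node03 holds far more than its neighbours.
sample_nodes = [
    NodeUsage("node01", capacity_gb=4000, used_gb=2400),
    NodeUsage("node02", capacity_gb=4000, used_gb=2500),
    NodeUsage("node03", capacity_gb=4000, used_gb=3600),
]

for warning in check_file_system(sample_nodes, replication_factor=2):
    print(warning)
```

With the sample data, the sketch flags the low replication factor and calls out node03 both for being nearly full and for sitting well above the cluster average. That is the sort of at-a-glance answer you want, only without having to write and maintain the script yourself.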
Modern enterprise cluster managers provide dozens of other monitoring capabilities and health checks that can be invaluable to anyone responsible for maintaining a Hadoop cluster, and I’ll cover some more of them in future posts.
For now, I want to cover one more area that cluster managers can help Hadoop cluster administrators with — changes.
It’s inevitable that your cluster will need changes once you put it into production. For example, you may need to add, replace, or remove nodes. You may need to upgrade or replace operating systems, or other critical software across a large number of nodes. Cluster managers exist to make such things easy to do. They provide administrators with a user-friendly interface where they can set parameters, select images, and provide other necessary input. The cluster manager then reaches into the cluster and makes the necessary changes automatically. Nothing could be easier.
Well, there’s more to say on the topic, but I hope what I’ve talked about here has you thinking about ways an enterprise-grade cluster manager could help you deploy and manage your own Hadoop clusters.
Drop me a line if you have any questions, or tell me about your experience managing Hadoop clusters, with or without a cluster manager. I’d love to hear your stories.