The Biggest Takeaways from the Strata+Hadoop Summit

When it comes to public speaking, the president of the United States is a tough act to follow.

D.J. Patil found himself in that daunting position at the recent Strata+Hadoop Summit in San Jose, Calif. Patil is the freshly appointed (and first ever) chief data scientist for the White House. He had to give a keynote at the event right after his new boss, President Barack Obama, introduced him via a recorded video message. What speaker wouldn’t feel pressure to hold an audience’s attention after that?

At Bright Computing, we thought Patil held his own at the podium. We’ve curated some of the highlights from his speech, and from other conference sessions of interest, just for our readers.

Hail to the chief (data scientist)

Patil took the stage to laud Obama as “the most data-savvy president ever.” Patil also called the U.S. government “more data-driven than most companies are right now.” He laid out a pretty convincing argument by listing the ways the U.S. feds are approaching their data strategy. Under a new executive order from Obama, all government-created documents must be “openable and machine readable.”

Patil noted that his office’s mission is to “responsibly unleash the power of data for the benefit of the American public and maximize the nation’s return on its investment in data.”

To achieve its data goals, Patil said the government must stay focused on one question: “In a useful way, how do we build an ecosystem of things that really are data products and that add value?” He said the government must “start building those products that really showcase the value proposition (of data) and not just about opening the data.” ROI. Product-focused. Value-added. Not a bad data model for enterprise, perhaps.

Open source (of tension)

A divergence (or was it a schism?) within the Hadoop community played out during separate presentations at the summit. Roman Shaposhnik, senior manager of open source Hadoop platform at Pivotal, made a case for the new Open Data Platform Alliance (ODPA) recently formed by players including GE, IBM, Splunk, and his own firm, Pivotal Inc.

Although he praised the openness of the Apache Software Foundation (ASF), Shaposhnik said, “There’s one thing the ASF doesn’t really solve: how do you actually make a product out of these ecosystem projects?”

He touted the ODPA as a potential solution. “What we’re proposing is that the members of this Open Data Platform Alliance actually would be shipping the bits themselves. It’s not about testing, it’s not about compatibility. It’s about grabbing the bits and making them part of the distribution of Hadoop that you’re producing. And I think that’s a very fundamental shift in perspective.”

CTO Amr Awadallah of Cloudera (a Bright partner) begged to disagree. He argued during his session that there’s no need for another open source Hadoop organization.

“The ethos of Apache is that you join the consortium. You join by contributing code, by contributing to the platform, and by creating new innovations, which is the right way,” Awadallah said. It’s a debate that will no doubt continue as ODPA gets off the ground.

The culture club

Pivotal’s cloud evangelist, Stacey Schneider, was at Strata+Hadoop to support her colleague Shaposhnik. But talking to other attendees, she kept hearing one common complaint: the frustration of trying to change the culture within their own enterprise organizations.

“It’s so much bigger than Hadoop,” Schneider said in a phone interview. “The technology is there but the secret is getting someone else to walk through the culture change with them.”

As Schneider explained in her own blog, today’s IT managers have a tough time convincing their organizations that they must take a more agile, innovative, and open approach in order to get the most out of today’s technology. She told me it requires changing the entire culture, from how teams do requirements to how they push out code. It’s ground-up innovation, and IT managers have to chaperone it through their organizations. But it takes people a while to see the benefits.

Schneider believes that although it costs money (and maybe even downtime) at the start, it’s a culture shift worth making in the long run. That may wind up being one of the most important takeaways from the conference for anyone looking to get business value out of Hadoop.

###

Christine Wong has written for ITBusiness.ca, CanadianCIO, and a number of other technology publications. She lives in Toronto. 
