Hadoop gains traction in the enterprise


Enterprises have begun embracing Hadoop, the open source storage and analysis system, even though technology issues and a host of questions about data security and management remain. Because it lets businesses manage vast amounts of structured and unstructured data more affordably than relational database systems can, Hadoop is gaining traction beyond Web 2.0 companies like Google (NASDAQ: GOOG), Yahoo and eBay, reports Jaikumar Vijayan at ComputerWorld.

JPMorgan Chase (NYSE: JPM) has been using Hadoop for nearly three years in a growing number of functions, including fraud detection, IT risk management and self service, the company's managing director Larry Feinsmith said this week at the Hadoop World conference in New York. With Hadoop, Chase has been able to gather and store huge amounts of unstructured data from social media, blogs and transactions. By bringing all of the disparate data into one platform, the company can apply data mining and analytics tools to the information. The question right now for IT pros at Chase is whether Hadoop-based technology can one day be used for processing transactions as well.

Enterprises should weigh a number of challenges when implementing Hadoop, however, Feinsmith cautions. Products, standards and vendors seem to be in flux, making for a "very confusing marketplace," he said. There are also integration challenges and relatively few engineers with Hadoop expertise. What's more, the ability to aggregate such massive quantities of data in itself raises concerns around security, access, availability and business continuity.

In a separate article, Vijayan examines the security issues raised by deploying Hadoop. Data access concerns have led some federal agencies to keep sensitive data out of Hadoop databases. Other users have turned to encryption to protect data stored in a Hadoop environment.

Aggregating a jumble of disparate data in Hadoop can mean mixing together information of varying security sensitivity, cautions Richard Clayton, a software engineer with Berico Technologies. Enterprises have to put the right controls in place to ensure role-based access.

Having so much data in one place also raises the risk of inadvertent disclosure and theft. If analytical tools create new datasets with sensitive information, those sets have to be protected as well. One way that government agencies are protecting Hadoop-stored information is by walling it off in "enclaves" that only cleared personnel can access.

Despite the technology's challenges, enterprises are vying for IT pros with Hadoop skills, reports Doug Henschen at InformationWeek. JPMorgan Chase, eBay and Cloudera were all in recruiting mode at the Hadoop World conference. Henschen notes that Hadoop World has tripled in size since the 2009 conference, and the technology has been embraced by IBM (NYSE: IBM), Microsoft (NASDAQ: MSFT) and Oracle (NASDAQ: ORCL) this year.

For more:
- see Jaikumar Vijayan's article at ComputerWorld
- see Jaikumar Vijayan's article on Hadoop security issues at ComputerWorld
- see Doug Henschen's article at InformationWeek
