Evolving Architectures in Big Data: The Intel(r) Distribution for Apache Hadoop*

Raghu Sakleshpur is an engineering manager at Intel who works on Hadoop deployments and Big Data technologies with partners, ISVs, and customers. He is a technologist to the core and loves to share his experiences with Big Data and Hadoop technologies whenever the opportunity presents itself. In his spare time, he pursues his other passions, including running, hiking, biking, and watching sports.

The rapid rise and universal nature of the Big Data problem present clear challenges to data centers: they must not only harness exabyte-scale data in real time but also analyze and profit from the information harvested from the persisted data at near-real-time speeds. The Hadoop ecosystem, and specifically Hadoop distributions such as the Intel(r) Distribution for Apache Hadoop* (IDH), are rapidly becoming the de facto Big Data platforms of choice, providing the best alternatives, in terms of scale and cost, to existing EDW, RDBMS, and MPP technologies in tackling these challenges.

As big data technology has become pervasive, two needs have become apparent: effective tools for managing Hadoop clusters, and integration of the Hadoop platform with the existing tool infrastructures that are specific to industry verticals and problem domains.

For example, the challenges and requirements of handling scientific big data are quite different from those of financial or social big data. A viable Hadoop distribution such as IDH needs to support a wide range of architectures out of the box and to coexist transparently with the existing technologies and platforms specific to each problem domain.

At Intel, we are working with customers, partners, and the big data open source community not only to design and implement suitable architectures and tools for conventional bare-metal distributed systems for Hadoop, but also to extend the Hadoop platform to distributed system architectures on shared and private virtualized infrastructures in the cloud.

Stay tuned to my blog for more information on the architectures and tools that Intel is developing to address the challenges of big data in specific problem domains and to help take Hadoop technology to the next level.

Please comment below on which architecture patterns you see evolving to support Big Data solutions, and check out hadoop.intel.com for more information on how IDH is adopting these patterns.