Project Rhino: Building a Layered Defense for Apache Hadoop

The ecosystem that’s assembling around Apache Hadoop* is starting to sound like a zoo, with Giraph graph analysis, Pig data manipulation, and Hive data warehousing. Now, along comes Project Rhino. This member of the menagerie promises to be the toughest of all, and that’s a good thing – because Project Rhino has a big job ahead of it: building a multi-level, coherent security envelope around Apache Hadoop.

The good news about Hadoop is that it enables organizations and enterprises to collect, store, process, and analyze a new range of vast, unstructured data sets.

However, the bad news is that Hadoop is not one monolithic technology but a collection of individual components and technologies that operate together. How much enterprise-grade data privacy and security does Apache Hadoop offer as it comes from open source? Not very much. Getting Hadoop ready for the enterprise will require reinventing an enterprise-strength security infrastructure that protects all the facets of Hadoop. And that demands multiple layers of security (including firewalls, API gateway defenses, authentication, authorization, and data-protecting encryption), all working together, woven into the core texture of Hadoop itself.

This is Project Rhino: an ambitious open source initiative to interweave security into Hadoop from the inside out and the outside in. We are building the security capabilities that will allow Apache Hadoop to serve as an enterprise-grade operating environment for data processing and data analytics.

We won’t achieve all the goals of Project Rhino overnight, but we are making great progress. For instance, protecting data with traditional software encryption is computationally expensive, and can really drag down performance when encryption software confronts the vast amounts of data stored in Hadoop. Intel has developed Intel® Data Protection Technology with Advanced Encryption Standard New Instructions (AES-NI), a hardware-based encryption technology that accelerates cryptographic functions in hardware on Intel® Xeon® processors to deliver near real-time encryption for massive data sets. Hardware-powered encryption allows Hadoop to protect sensitive information without incurring a significant performance penalty.
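Before counting on hardware-assisted encryption, it helps to confirm that each node in a cluster actually exposes the AES instructions. Here is a minimal sketch, assuming Linux nodes where /proc/cpuinfo lists an "aes" entry on its "flags" line when the instructions are available. The function names are hypothetical and are not part of Project Rhino or any Intel tool.

```python
from pathlib import Path


def has_aes_flag(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only inspect the CPU feature list, not fields such as 'model name'.
        if sep and key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def node_supports_aes_ni(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Check the local node; returns False if the file cannot be read."""
    try:
        return has_aes_flag(Path(cpuinfo_path).read_text())
    except OSError:
        return False


if __name__ == "__main__":
    print("AES-NI available:", node_supports_aes_ni())
```

A check like this only reports whether the processor advertises the instructions. Whether a given Hadoop deployment uses them depends on the encryption libraries and runtime it is configured with.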

To learn more about how Intel AES-NI helps deliver low-overhead, enterprise-class data protection to Apache Hadoop, read this white paper. Also consider attending the Project Rhino sessions at Intel Developer Forum (IDF) 2013 in San Francisco next month. Intel’s Bing Wang, security technology product manager, will demo and present new test results that showcase significant performance improvements for big data encryption using Intel AES-NI. For more information on Bing’s sessions, check out these links: Hands-on Lab: Security and Intel® Distribution for Apache Hadoop* Software, and Protect Your Big Data with Intel Xeon Processors and Intel Distribution for Apache Hadoop Software.

To learn more about the Intel Distribution for Apache Hadoop Software, give the Intel Distribution a test drive: download a free, 90-day trial and experience the power of hardware-assisted security and enterprise-grade performance for Apache Hadoop big data processing.

Follow Tim and the Hadoop community at Intel at @TimIntel and @IntelHadoop.

Tim Allen

About Tim Allen

Tim is a strategic relationship manager for Intel, driving enablement for enterprise software companies related to the cloud, big data, analytics, AEC, commercial VR, datacenter, and IoT. Tim has 20+ years of industry experience, including work as a systems analyst, developer, system administrator, enterprise systems trainer, product marketing engineer, and marketing program manager. Prior to Intel, Tim worked at IBM, Tektronix, Intersolv, Sequent, and Con-Way Logistics. Tim holds a BSEE in computer engineering from BYU and an MBA in finance from the University of Portland. Specialties include PMP, MCSE, CNA, HP-UX, AIX, Shell, Perl, and C++.