Oracle Big Data Connectors – Integrating with IDH: Part #1

Ritu Kama is the Director of Product Management for Big Data at Intel. She has over 15 years of experience building software solutions for enterprises. She led Engineering, QA, and Solution Delivery organizations within the Datacenter Software Division for Security and Identity products. Last year she took on Product and Program Management responsibilities for Intel's Distribution of Hadoop and Big Data solutions. Prior to joining Intel, she led technical and architecture teams at IBM and Ascom. She has an MBA from the University of Chicago and a Bachelor's degree in Computer Science.


One of the greatest challenges facing organizations working with Apache Hadoop is integrating data stored in existing infrastructure with data stored in HDFS. Solving this problem is critical to long-term success in data analytics. Today, we would like to announce that we are collaborating with the Oracle Big Data team to certify the compatibility between the Intel® Distribution for Apache Hadoop* (IDH) and Oracle Big Data Connectors (Oracle BDC).

Using Oracle BDC with IDH provides organizations with a complete solution to this data integration challenge.

Oracle Big Data Connectors connect data stored in HDFS with the rest of the Oracle ecosystem. The total solution includes the following four products:

• Oracle SQL Connector for HDFS

• Oracle Loader for Hadoop

• Oracle R Connector for Hadoop

• Oracle Data Integrator Application Adapter for Hadoop

Oracle SQL Connector for HDFS

Oracle SQL Connector for HDFS (OSCH) enables direct access from Oracle Database to data stored in HDFS. The approach is easy and intuitive. Using OSCH, an Oracle external table is created over data stored in HDFS or in Hive tables. Once this external table is created, users can query it just as they would any other table stored in Oracle Database.
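OSCH itself is configured through Oracle SQL and a command-line tool, but the core idea of an external table — data stays in its files and is parsed in place at query time, rather than being loaded into the database first — can be illustrated with a small, language-neutral sketch. The file paths, schema, and data below are made up for illustration:

```python
import csv
import io

# Hypothetical CSV content standing in for files on HDFS; with OSCH the
# data would remain in HDFS and Oracle would read it through an external table.
HDFS_FILES = {
    "/user/hive/warehouse/sales/part-00000": "1,widget,9.99\n2,gadget,19.95\n",
    "/user/hive/warehouse/sales/part-00001": "3,sprocket,4.50\n",
}

COLUMNS = ("id", "product", "price")

def external_table(paths):
    """Yield rows on demand from the underlying files -- the data is never
    copied into a database; it is parsed in place each time it is queried."""
    for path in paths:
        for row in csv.reader(io.StringIO(HDFS_FILES[path])):
            yield dict(zip(COLUMNS, (int(row[0]), row[1], float(row[2]))))

# "Querying" the external table looks like querying any other table.
rows = [r for r in external_table(sorted(HDFS_FILES)) if r["price"] > 5.0]
print(rows)
```

The point of the sketch is the access pattern: the query iterates over file contents directly, so nothing has to be staged into database storage before it can be filtered.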

Oracle Loader for Hadoop

Oracle Loader for Hadoop (OLH) is used for optimized data loading from HDFS into Oracle Database. Using OLH, MapReduce jobs can write their output directly into Oracle Database. OLH automatically takes care of sorting, partitioning, and conversion of data into Oracle formats. This preprocessing happens on the Hadoop cluster, which optimizes performance by reducing resource usage on the database.
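As a rough illustration of the kind of preprocessing OLH offloads to the cluster — not OLH's actual implementation or API — the sketch below partitions rows by a partition key and sorts within each partition, so each slice could be loaded sequentially. The record layout and key names are hypothetical:

```python
from collections import defaultdict

# Hypothetical rows emitted by a MapReduce job; OLH would prepare records
# like these for the database while still on the Hadoop cluster.
records = [
    {"region": "WEST", "amount": 120},
    {"region": "EAST", "amount": 75},
    {"region": "WEST", "amount": 60},
]

def partition_and_sort(rows, partition_key, sort_key):
    """Group rows by the target table's partition key and sort within each
    partition, so the database can ingest each slice in order."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    return {k: sorted(v, key=lambda r: r[sort_key]) for k, v in partitions.items()}

loadable = partition_and_sort(records, "region", "amount")
print(loadable)
```

Doing this grouping and ordering with the cluster's spare capacity, rather than inside the database, is what lets the database spend its cycles on ingestion alone.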

Oracle R Connector for Hadoop

Oracle R Connector for Hadoop (ORCH) gives users of the statistical package R easy access to data stored in HDFS. Using this seamless integration, R users can leverage MapReduce processing without needing to learn another language. In fact, using ORCH, R users can work with data stored in HDFS without needing to understand the inner workings of Hadoop, MapReduce, or the cluster infrastructure.
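To make concrete what the map-and-reduce pattern ORCH abstracts away actually looks like, here is a minimal word-count sketch (in Python rather than R, with made-up input). In a real cluster the mapper would run in parallel across HDFS blocks and the framework would shuffle pairs to reducers; with ORCH, an R user writes R functions instead and never touches this plumbing:

```python
from collections import Counter
from itertools import chain

# Hypothetical input lines standing in for a file in HDFS.
lines = ["big data on hadoop", "r meets hadoop"]

def mapper(line):
    # Emit a (key, value) pair for every word in the line.
    return [(word, 1) for word in line.split()]

def reducer(pairs):
    # Sum the values for each key to produce the final counts.
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

word_counts = reducer(chain.from_iterable(mapper(l) for l in lines))
print(word_counts)
```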

Oracle Data Integrator Application Adapter for Hadoop

Oracle Data Integrator (ODI) Application Adapter for Hadoop provides Hadoop integration with ODI. Using the ODI Application Adapter, users can create Hadoop-specific metadata that allows the ODI graphical user interface to work with data in Hive and HDFS. Users can work with data on Hadoop through this GUI, then load the data directly into Oracle Database using OLH or OSCH when processing is finished.

We here at Intel are extremely excited to announce this collaboration with Oracle at Oracle Open World in San Francisco.  I hope that you are just as excited to work with this technology.