Why Would Intel Invest in Open Source Analytics Projects?


Although some people in the data analytics community may think of Intel as a company focused solely on the silicon at the bottom of the stack, the reality is quite different. As I noted in a March blog post, data analytics investments and innovation from Intel are present in all layers of the analytics solution stack—from infrastructure to applications.

In this post, I want to zero in on our investments in open source projects that are driving data analytics forward. There’s a lot to talk about here, because Intel invests in a wide range of open source initiatives for big data analytics, including those driven by Intel and those driven by our partners and industry groups.

In all of these areas, Intel collaborates with leading software developers and open source communities to provide choice and flexibility in solutions that meet diverse customer needs. These efforts underscore our support for openness, along with the collaboration and rapid pace of innovation that go with it. Individual products alone won’t create the analytics solutions that enterprises need today, so we’re helping to make open data and analytics platforms a reality across the analytics solution stack.

Let’s consider a few examples. Intel is a leading contributor to open source communities such as Apache Hadoop* and Spark*. For this work, Intel partners with vendors like Cloudera, a leading distributor of Apache Hadoop, to solve some of the industry’s most challenging problems. These collaborations focus on improving performance, security, and manageability, as well as supporting and participating in the ecosystem of developers, partners, and enterprises who build analytics applications.

In another open source initiative, Intel launched the Trusted Analytics Platform (TAP). Optimized for performance and security, TAP is designed to accelerate the creation of applications driven by big data analytics by providing a collaborative, flexible environment for advanced analytics in both public and on-premises clouds. Our goal with this project is to enable organizations to quickly and easily ramp up their ability to deliver value in applications powered by analytics.

Intel is also deeply committed to working with the open source community to accelerate the progress of artificial intelligence and broaden access to powerful tools. Since most analytics workloads, including those in the realm of machine learning, are run on Intel architecture, we want to ensure that applications can take full advantage of the performance and security features of the hardware foundation.

To that end, Intel is enabling a diverse set of use cases by optimizing industry-standard frameworks (such as Caffe*, Theano*, and Spark*) for Intel® architecture. This allows organizations to use their existing Intel architecture-based infrastructure to deliver high-performance AI at a low total cost of ownership. For example, companies using the optimized version of Caffe are now able to realize up to a 30-times increase in performance compared to a mainstream version running on Intel architecture.[1] In addition to Caffe and Theano, Intel will also optimize all other major deep learning frameworks for Intel architecture by the end of 2016.

To help developers meet the optimization goals for their machine learning applications, Intel offers software libraries like the Intel® Math Kernel Library (Intel® MKL), which accelerates math processing routines, and the Intel® Data Analytics Acceleration Library (Intel® DAAL), which helps speed big data analytics with optimized algorithmic building blocks for all data analysis stages. These libraries enable software developers to build improved analytics applications more easily than ever before.

When the talk turns to these and other Intel open source investments, people often ask me, “Why do you do this? What’s in it for Intel?” The answer here is pretty simple. In our support for collaborative open source projects, we are helping to create the interoperable standards that accelerate market adoption of new solutions. That’s good for Intel, that’s good for our partners, and that’s good for the enterprises that count on all of us in the tech industry to bring them innovative solutions that they can plug into their heterogeneous data centers.

Looking ahead, you will see Intel continue to invest in open source projects for big data analytics, including those launched from within Intel and those initiated by outside parties. We know that individual products created exclusively by one company won’t create the analytics solutions that data-driven enterprises need to stay competitive. The analytics-driven world revolves around collaboration. Together, we can create solutions that move entire industries forward.

If you’re attending the 2016 Intel Developer Forum in San Francisco, you will have multiple opportunities to learn more about the work Intel is doing to enable open source analytics projects. In addition, to dive directly into some examples of the power of big data analytics based on open technologies, visit the Intel New Center of Possibility site.

Martin Hall is the director of big data solutions at Intel Corporation. Connect with Martin on LinkedIn and Twitter to find out more about how advanced analytics can give your business a competitive edge.


[1] Up to 30X gain based on internal testing with LeTV Cloud (www.lecloud.com) for video detection. Results based on a system using the Intel® Xeon® processor E5-2680 v3. The LeTV Cloud Caffe* optimization compared a baseline using BVLC Caffe + OpenBLAS to Caffe optimized for Intel® Architecture + Intel® MKL.