Accelerating the Computational Network Toolkit with Intel® MKL

In my previous blog post, Machine Learning: A Full Stack View, I discussed the importance of having a set of highly tuned libraries to extract maximum performance from our hardware. Our Intel® Math Kernel Library (Intel® MKL) and Intel® Data Analytics Acceleration Library (Intel® DAAL) are especially important to accomplishing this task, whether you need a fast GEMM or convolution primitive or a full-fledged, optimized machine learning algorithm such as a boosted decision tree.

Intel MKL, which is especially germane to the performance of deep learning frameworks, includes highly vectorized and threaded linear algebra, fast Fourier transform (FFT), vector math, and statistics functions. We are also continually adding new APIs and primitives and improving performance. Important operations being added or optimized for deep neural networks include a variety of convolution types, efficient dense linear algebra operations, activation functions, and many others. A free community license for Intel MKL is also available, making it far easier to use within your project or application.
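To make the GEMM point concrete, here is a minimal sketch of a single-precision matrix multiply through the standard CBLAS interface that Intel MKL provides. The sizes and values are purely illustrative; deep learning workloads exercise this same routine with far larger, denser matrices:

```c
#include <stdio.h>
#include <mkl.h>  /* Intel MKL: provides the CBLAS interface, including cblas_sgemm */

int main(void)
{
    /* Computes C = alpha * A * B + beta * C in single precision. */
    const int m = 2, k = 3, n = 2;
    float A[] = {1, 2, 3,
                 4, 5, 6};    /* m x k, row-major */
    float B[] = {7,  8,
                 9,  10,
                 11, 12};     /* k x n, row-major */
    float C[4] = {0};         /* m x n result */

    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k,
                1.0f, A, k,   /* alpha, A, leading dimension of A */
                B, n,         /* B, leading dimension of B */
                0.0f, C, n);  /* beta, C, leading dimension of C */

    for (int i = 0; i < m; ++i)
        printf("%6.1f %6.1f\n", C[i * n], C[i * n + 1]);
    return 0;
}
```

When the program is linked against Intel MKL, this one call dispatches to MKL's vectorized, threaded implementation with no source changes.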

Today I am very happy to say that Intel MKL is now the default math engine powering the Computational Network Toolkit (CNTK). Although it is a relatively new framework, CNTK has already garnered significant attention from the community due to its performance, its scalability, and strong backing from Microsoft.

CNTK’s supported applications include automatic speech recognition (ASR), machine translation, image recognition and captioning, text processing and relevance, and language understanding and modeling. And because it is a fully featured toolkit, the range of additional applications is limited only by what the community dreams up. Supported network types span the popular architectures, including feed-forward networks (FNNs), convolutional networks (CNNs), and recurrent networks (RNNs/LSTMs), along with a distinctive technique for compressing node-to-node gradient communication to a single bit (1-bit SGD), which dramatically improves multi-node scalability; the idea is sketched below. CNTK is also deployed at scale for many of Microsoft’s most important and highly visible production applications, such as Bing* Search, Skype* Translator, and Cortana,* among others.
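To make the 1-bit idea concrete, here is a minimal sketch in C of 1-bit gradient quantization with error feedback, the general technique behind 1-bit SGD. This is an illustration of the idea only, not CNTK's actual implementation, and all names here are hypothetical. Each gradient element is reduced to its sign, and the quantization error is carried forward into the next minibatch so that no information is permanently lost:

```c
#include <stddef.h>

/* Quantize a gradient vector to 1 bit per element with error feedback.
 * `residual` carries the accumulated quantization error between calls
 * (zero-initialized by the caller before the first minibatch), and
 * `bits` must hold at least (n + 7) / 8 bytes of packed sign bits. */
void quantize_1bit(const float *grad, float *residual, unsigned char *bits,
                   float *scale_pos, float *scale_neg, size_t n)
{
    /* Pass 1: add the carried-over error to this minibatch's gradient and
     * compute the mean positive/negative magnitudes to use as the two
     * reconstruction levels on the receiving side. */
    double sum_pos = 0.0, sum_neg = 0.0;
    size_t n_pos = 0, n_neg = 0;
    for (size_t i = 0; i < n; ++i) {
        float g = grad[i] + residual[i];
        if (g >= 0) { sum_pos += g; ++n_pos; }
        else        { sum_neg += g; ++n_neg; }
    }
    *scale_pos = n_pos ? (float)(sum_pos / n_pos) : 0.0f;
    *scale_neg = n_neg ? (float)(sum_neg / n_neg) : 0.0f;

    /* Pass 2: emit one sign bit per element and remember what the
     * 1-bit code failed to capture (the error feedback). */
    for (size_t i = 0; i < n; ++i) {
        float g = grad[i] + residual[i];
        int positive = (g >= 0);
        if (positive) bits[i / 8] |=  (unsigned char)(1u << (i % 8));
        else          bits[i / 8] &= (unsigned char)~(1u << (i % 8));
        residual[i] = g - (positive ? *scale_pos : *scale_neg);
    }
}
```

Shipping one bit per element instead of 32 cuts gradient-exchange bandwidth by roughly 32x, which is why the approach helps multi-node scaling so dramatically.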

As of this writing, CNTK has been forked on GitHub more than 1,100 times, has received more than 5,600 stars, and has nearly 6,500 commits from 68 contributors. The future certainly looks bright for the CNTK project, and I fully expect the community to continue to grow as the project matures and new uses emerge.

Instructions are available today on how to build CNTK with Intel MKL on Windows*- and Linux*-based platforms, and both CNTK and Intel MKL are fully redistributable.

This is the first step in Intel’s efforts to contribute to the CNTK project, and we are very excited to help the deep learning community leverage this amazing toolkit. Keep an eye out for more contributions from Intel to this very important project. As we announced at ISC and ICML this week, we are expanding our commitment to the machine learning community through continued optimization of Intel MKL for common machine learning primitives. Also check out our additional machine learning resources and developer tools available online.

I’d like to personally thank Alexey Kamenev and Wolfgang Manousek at Microsoft Research for the great support and continued collaboration.  Cheers!