Big data has already made fundamental changes to the way businesses operate. There are huge advantages for companies who can derive value from their data, but these opportunities come with challenges, too. For some, this is the challenge of acquiring data from new sources. For others, it is the task of building a scalable infrastructure that can manage the data in aggregate. For a brave few, it means extracting value from the data by implementing advanced analytic techniques and tools.
For cloud service providers (CSPs) whose business depends upon solving these challenges, the scale of online user-generated data inspired the development of radically different hardware for datacenter infrastructure and a new kind of software for orchestrating workloads intelligently and efficiently on that infrastructure. When these cloud computing technologies, designed to increase datacenter automation, were released to the open source community, they spawned projects such as Docker*, Kubernetes*, and Mesos*. At the same time, CSPs developed data storage and processing software that could handle the speed and scale of human-generated data. Apache Hadoop* and Spark* are children of these Big Data technologies. The concurrent rise of Data Science as a profession stems from the acute need to detect signal in the noise of this ever-increasing flood of data.
We now face a wave of data generation that is several orders of magnitude greater than the cumulative tracks of surfers, shoppers, and their social networks. We look with awe upon the data generated by smart phones, driverless cars, industrial drones, cube satellites, smart meters, surveillance cameras, and millions of other things that now populate the Internet. And after the shock subsides, we realize that the level of automation that allowed us to manage data in the cloud era must now scale several-fold in order to analyze data in the era of the Internet of Things (IoT).
Automation is the key to solving the challenge posed by IoT. We need things to get smarter when responding to their environment and users. We need systems to become more intelligent based on the history of their interactions. We need technologies and tools that can help these devices and systems learn from their experience. Where once we asked analysts to generate “insights” from data and make decisions that drove changes in system operation, now we must ask systems to learn from data automatically and respond appropriately. In short, we need Machine Learning to make IoT usable.
Machine Learning – the study, development, and application of algorithms that improve their performance of tasks based on prior interactions – is the key to making things that learn from experience and get “smarter” with use.
Consider the example of autonomous vehicles. They construct a model of the world based on data from millions of miles of driving by test cars equipped with sensors such as Radio Detection and Ranging (RADAR), Light Detection and Ranging (LIDAR), and cameras. They use data from maps to plan paths. But they are not programmed explicitly with rules for every scenario they might encounter in the real world. For cars to operate autonomously, they must be trained, much like human student drivers, to recognize objects in the visual field such as other vehicles, highway signs, lane markers, trees, and pedestrians. They must learn to navigate and control the movement of the vehicle in response to dynamic conditions. And much like a student driver, they learn by making mistakes and improving their accuracy with practice. At first, a trainer – the data scientist – annotates the training data to label correct responses and supervises the learning process of the algorithms that make up the model. But eventually, the model learns to recognize the objects, localize them in space, and track their movement well enough to operate in the real world.
Another application of machine learning is fraud detection in the payment card industry. Fraud detection systems use machine learning algorithms to track the purchasing patterns of each card user and to detect anomalies in those patterns. For example, a fraud detection system might flag a transaction as suspicious if it involves an atypical purchase amount for that user or originates from an unusual geographic location or an unfamiliar merchant. In more complex situations, a machine learning algorithm may build a model from transaction data across the entire population of users to improve the accuracy of fraud detection.
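The "atypical purchase amount" check can be illustrated with a very small sketch. This is not how any particular card network does it; it assumes a simple z-score rule, and the function name `is_suspicious`, the threshold of 3 standard deviations, and the sample history are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag a purchase whose amount deviates strongly from the user's history.

    history   -- past purchase amounts for one card user (hypothetical data)
    amount    -- the new purchase amount to score
    threshold -- assumed cutoff, in standard deviations
    """
    if len(history) < 2:
        return False  # too little history to estimate a spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past purchase was identical
    return abs(amount - mu) / sigma > threshold

# Hypothetical purchase history for a single user.
history = [42.0, 38.5, 51.0, 47.25, 40.0, 44.75]
typical = is_suspicious(history, 45.0)    # close to the user's usual spend
atypical = is_suspicious(history, 900.0)  # far outside the usual range
```

A production system would combine many such signals (location, merchant, time of day) and learn the decision boundary from labeled transactions rather than use a fixed threshold.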
Interestingly, the vast diversity of such machine learning problems rests on the foundations of relatively simple algebraic operations on matrices. The challenge lies in handling matrices that are often large but sparse, or sometimes dense but “tall and skinny”, and in the troublesome fact that these operations need to be performed at big data scale in a matter of milliseconds. Often performance is constrained by algorithmic complexity, the order of magnitude of time that it takes to complete the algorithm regardless of how well it is coded. But inevitably, we need a storage and compute infrastructure that is up to the challenge.
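To make the "large but sparse" case concrete, here is a minimal sketch of a matrix-vector product that stores and touches only the nonzero entries. It assumes the widely used compressed sparse row (CSR) layout; the helper names `to_csr` and `csr_matvec` are hypothetical, and optimized libraries implement the same idea with vectorized, parallel kernels.

```python
def to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)   # nonzero value
                col_idx.append(j)  # its column index
        row_ptr.append(len(values))  # where the next row starts
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, vec):
    """Multiply a CSR matrix by a dense vector, visiting only nonzeros."""
    result = []
    for i in range(len(row_ptr) - 1):
        total = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * vec[col_idx[k]]
        result.append(total)
    return result

# A small matrix with mostly zero entries, for illustration only.
dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 2],
]
values, col_idx, row_ptr = to_csr(dense)
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0, 4.0])
```

The work done is proportional to the number of nonzeros rather than to the full matrix size, which is why the storage format matters as much as the arithmetic.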
So how do you get there? To build a machine learning solution, you need sensors and systems that collect data from the edge, whether that means an autonomous vehicle on a highway or a point-of-sale device at a retail store. The sensors need to relay the data to a cloud-hosted platform designed to handle data at scale. You need models based on machine learning that can learn from the data to make inferences about new data. You need the machine learning algorithms to be implemented for speed at scale. You need the mathematical operations (the computational kernels of these algorithms) to take advantage of processor features to get the best performance out of the system hardware. You need systems equipped with processors with multiple integrated cores, faster memory subsystems, and architectures that can parallelize the processing.
And then, as your data grows, you need scalable clusters of these systems that allow you to train a complex model based on machine learning over a big data set distributed across a large number of systems.
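One common way to train over data spread across many systems is data parallelism: each node computes a gradient on its own shard, and the gradients are averaged before the shared model is updated. The sketch below simulates that pattern in a single process for a one-parameter linear model. The shard contents, learning rate, and function names are assumptions made for illustration, not a description of any specific cluster framework.

```python
def shard_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    n = len(shard)
    return sum(2.0 * (w * x - y) * x for x, y in shard) / n

def train(shards, w=0.0, lr=0.05, steps=200):
    """Synchronous data-parallel gradient descent, simulated sequentially."""
    for _ in range(steps):
        # On a real cluster, each gradient would be computed on a separate node.
        grads = [shard_gradient(w, shard) for shard in shards]
        # Average the per-shard gradients, then update the shared weight.
        w -= lr * sum(grads) / len(grads)
    return w

# Hypothetical (x, y) pairs generated from y = 3 * x, split into two shards.
shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (0.5, 1.5)],
]
w = train(shards)  # converges toward 3.0
```

In practice, the cost of exchanging gradients between nodes becomes a large part of the total training time, which is why interconnect and memory bandwidth matter alongside raw compute.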
These characteristics are at the heart of the systems based on the new, second-generation Intel® Xeon Phi™ processor. As a server-class product designed for high performance computing, the Intel Xeon Phi processor delivers the performance needed by some of the most demanding workloads, including machine learning. The Intel Xeon Phi processor is especially optimized for a subset of machine learning known as deep learning, where the algorithm takes the form of a multi-layered neural network composed of non-linear functions. A cluster of Intel Xeon Phi processor-based servers can reduce the training time by orders of magnitude compared to single-node client-grade processors. With 68 active cores, just one 4-socket system based on Xeon Phi processors can handle a complex training task easily, and a cluster of these systems scales powerfully with the data.
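As an illustration of what "a multi-layered neural network composed of non-linear functions" means computationally, here is a minimal sketch of a forward pass through a two-layer network. The weights, layer sizes, and choice of ReLU and sigmoid activations are hypothetical; real deep learning frameworks run the same matrix arithmetic through optimized kernels and at far larger sizes.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(weights, biases, inputs, activation):
    """One layer: a weighted sum per output unit, then a non-linear function."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z))
    return outputs

# Hypothetical fixed weights; training would normally learn these from data.
W1, B1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]  # hidden layer: 2 inputs, 2 units
W2, B2 = [[1.0, -1.0]], [0.0]                   # output layer: 2 inputs, 1 unit

def forward(x):
    """Compose the layers: input -> hidden (ReLU) -> output (sigmoid)."""
    hidden = dense_layer(W1, B1, x, relu)
    return dense_layer(W2, B2, hidden, sigmoid)

prediction = forward([2.0, 1.0])
```

Each layer is a matrix-vector product followed by an element-wise function, so the speed of the network largely comes down to the speed of the underlying matrix operations.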
Developers can extract maximum performance from Intel hardware by using two libraries of math kernels and optimized algorithms from Intel: the Intel® Data Analytics Acceleration Library (Intel® DAAL) and the Intel® Math Kernel Library (Intel® MKL). These libraries include implementations of fast Fourier transform (FFT), generalized matrix multiplications, statistical tests, and several classic machine learning algorithms that improve the performance of a wide range of higher-level ML algorithms and deep learning topologies.
Moreover, developers working with deep learning frameworks such as Caffe* and Theano* can benefit from the work of Intel’s software developers, who integrated Intel MKL into these frameworks. By using the Intel-optimized frameworks supported by Intel MKL, I’ve seen customers get performance on deep learning network topologies, including convolutional neural networks (CNN) and recurrent neural networks (RNN), that is an order of magnitude greater than running these frameworks unoptimized on commodity CPUs. Our code modifications to the DL frameworks are open source. You can get the Caffe and Theano code optimized for Intel® Architecture from GitHub.
Also, I’m part of a growing team at Intel that can help CSPs and enterprises in the transportation, financial services, healthcare, energy and other vertical industries build and deploy solutions based on machine learning models. For qualified organizations, we can provide test and development platforms based on the Intel Xeon Phi processor, software, tools and training, as well as reference architectures and blueprints to accelerate the deployment of enterprise-grade solutions.
Machine learning will soon become a survival skill in the IoT era. To build and sell products and services to customers and retain their loyalty, organizations need to make things smarter. And the best-known technology for making machines smart is machine learning. We at Intel are on a mission to make it easier for you to develop and deploy solutions based on machine learning and deep learning using all the capabilities of Intel compute, storage, and networking components as effectively as possible.