Impossible Until They’re Not—How AI is Evolving with HPC

 

We live in a world in which artificial intelligence (AI) is part of our daily lives. This is a huge change that has taken place in recent years. Growing up, we thought of many of the capabilities we have today as the stuff of fiction, part of Hollywood productions such as “Star Trek.” Today, many of these same things are either a reality or less than a decade away.

When we use digital assistants and navigation engines in our daily lives, we usually don’t realize we are using artificial intelligence. And the truth is, as consumers we don’t have to. We can focus on living our lives, while on the backend artificial intelligence is giving machines the ability to sense, reason, act, and then adapt based on experiences. Not that long ago, this kind of intelligence in computers was the stuff of science fiction. Today, it’s the stuff of everyday life.

The Connection between HPC and AI

While today’s AI solutions are making our lives better in countless ways, they are not a final destination. They are part of an ongoing journey. There is now tremendous potential for high performance computing (HPC) to accelerate machine learning and make it practical for any business or research project to embed advanced analytics, creating a smarter world for all of us. Here at Intel, we are laying the groundwork for this leap forward and a whole lot more with an entire portfolio of hardware and software offerings. I will say more about Intel’s contributions in Part 2 of this blog post. For now, in Part 1, let me try to bring this closer to home by diving into one area of artificial intelligence.

One of the key technologies behind many of today’s intelligent services is speech recognition with embedded natural language processing. Speech recognition has come a long way from splitting a word into its phonemes and leaving a powerful workstation to train overnight on a vocabulary of a few hundred words. Today, when we speak to our digital assistants, we expect them to understand us without our having to limit our vocabulary or match the accent the speech recognition model was built on.

While this is a great step forward, it comes with a challenge: the sophistication of the models needed to recognize what we say has increased many times over. Fortunately, and thanks in large part to Moore’s Law, we now have compute capability that can handle the intense demands of artificial intelligence. When you add the ability to execute machine learning through high performance computing on top of that, we are paving the highway to interacting with our digital agents as smoothly as we talk to our colleagues.

To make this more tangible, here are some rough estimates of how much data and compute power it takes to train a model. To build a decent speech model, you need thousands of hours of speech data. Experiments show that on a system that uses fewer than 100 hours of speech data (yes, an order of magnitude less than what is required to build a decent model), training takes about half a day on a state-of-the-art computer. If training time grows roughly in proportion to the amount of data, a decent model built on thousands of hours would take weeks to train. Imagine waiting several weeks to see whether your training actually succeeded, and then starting again with additional data. This is akin to waiting for wine to mature after you have pressed your grapes, except that digital data changes quite a bit more frequently than grape varieties.
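
To put numbers on that estimate, here is a minimal back-of-the-envelope sketch. It assumes training time scales linearly with the hours of speech data, which is a simplification, and it takes the post's "less than 100 hours in about half a day" figure as the baseline. The function name and the sample data sizes are illustrative only.

```python
def estimated_training_days(hours_of_speech, baseline_hours=100.0, baseline_days=0.5):
    """Rough estimate: scale the baseline training time linearly with data size.

    Assumption: training time is proportional to hours of speech data.
    Real training rarely scales this cleanly.
    """
    return baseline_days * (hours_of_speech / baseline_hours)


# Illustrative data sizes, not measurements.
for hours in (100, 1000, 5000):
    days = estimated_training_days(hours)
    print(f"{hours:>5} hours of speech -> about {days:.1f} days on one machine")
```

Under these assumptions, 1,000 hours of speech comes to about 5 days and 5,000 hours to about 25 days, which is where the "weeks" figure comes from.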

Let’s not give up yet though. If the computation behind an artificial intelligence technique can be divided into pieces that can be computed simultaneously, we have a way to speed things up: use multiple networked computers to solve a single problem, which is essentially how high performance computing works.
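
As a small illustration of "dividing the computation into pieces," the sketch below splits a toy training set into shards, computes each shard's share of the gradient separately, and then combines the results. Everything in it is an assumption made for illustration: the one-parameter model, the data, and the function names. Threads stand in for separate machines, so it shows the decomposition rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_gradient(shard, w):
    """Sum of the squared-error gradients d/dw (w*x - y)^2 over one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard)


def parallel_gradient(data, w, num_workers=4):
    """Split the data into shards, compute partial gradients, then combine."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(lambda s: shard_gradient(s, w), shards))
    # Combining step: on a real cluster this is where workers exchange results.
    return sum(partials) / len(data)


# Toy data generated from y = 3x, so training should recover w close to 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w -= 0.01 * parallel_gradient(data, w)
print(f"learned w = {w:.4f}")
```

Each shard's work is independent of the others, so it can run on a different machine. Only the combining step needs communication.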

This is why all scientists working on creating intelligence should not shy away when they hear the phrase “high performance computing.” Our brains do not process visual cues or sounds using one neuron at a time, so why should we impose that kind of a restriction when we are trying to simulate intelligence?

We can now leverage HPC techniques, using the well-established paradigms of the Message Passing Interface (MPI) and distributed programming models, to give model training a tremendous boost. If we also help scientists take advantage of hardware capabilities through techniques such as vectorization and cache blocking, which speed up the forward and backward loops used in training, that takes their abilities several notches up. The result is shorter training times and higher-quality, more sophisticated models.
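
For readers who have not met cache blocking before, here is a minimal sketch of the idea applied to matrix multiplication, the kind of loop nest that sits at the heart of a neural network's forward and backward passes. The function name and block size are assumptions for illustration. Pure Python will not show the speedup; the point is only the loop structure, which works on small tiles so that data already loaded into cache gets reused before it is evicted.

```python
def matmul_blocked(a, b, block=2):
    """Multiply matrices a (n x m) and b (m x p) one tile at a time."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer loops walk over tiles; inner loops stay inside one tile.
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                for i in range(ii, min(ii + block, n)):
                    row_c = c[i]
                    for k in range(kk, min(kk + block, m)):
                        a_ik = a[i][k]
                        row_b = b[k]
                        # Innermost loop runs over contiguous elements,
                        # which is also the shape that vectorizes well.
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += a_ik * row_b[j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(matmul_blocked(a, b))
```

In compiled code, the block size would be tuned to the cache sizes of the target processor, and the innermost loop is the one a compiler or optimized math library would vectorize.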

Using HPC in Machine Learning

At Intel, our vision is to bring the scalability and the benefits that the industry learned through traditional HPC—from workstations to supercomputers—to any artificial intelligence application. And I’m happy to report that lots of organizations are trying to get on board. Some companies are open about their usage of HPC in the intelligence that gives them a competitive edge.

For example, organizations like PayPal, Baidu, and the United States Postal Service (USPS) are already using HPC to run the data- and compute-intensive applications that form the backbone of their businesses. More specifically:

  • PayPal uses HPC processing in its online payment transactions to identify fraud patterns and predict fraud in real time. This drives real-time action and prevention tactics.
  • Baidu is using machine learning, coupled with HPC, in an effort to increase its speech recognition accuracy in noisy environments.
  • And USPS is using a supercomputer to perform advanced analytics for sorting and routing at more than 15,000 post offices and delivery facilities. The goal is to reduce errors and provide dynamic routing optimization that has a very real impact on the bottom line.

Key Takeaways

In a follow-up post, I will look at some of the specific contributions that Intel is making to enable data scientists and analysts to take advantage of HPC capabilities in AI solutions. For now, here are some key points to keep in mind:

  • Using HPC for artificial intelligence is a critical win on multiple fronts.
  • Scientists can train their models in less time with more data, which can help increase accuracy.
  • The models can tackle more complex inferences and solve harder problems with more and deeper layers in the neural network (think of this as analyzing a very complex image instead of a hand-drawn picture).

If you have thoughts you’d like to share on the use of HPC in AI, my team would love to hear from you. To get the conversation going, send me a note via my Twitter account: https://twitter.com/UlgenFigen