Driving the Convergence of AI, HPC, and Big Data

Those working in HPC are facing a convergence challenge. Diverse workloads from across AI, modeling and simulation, visualization, and big data analytics must increasingly be run concurrently, and on the same architecture.

While each of these workloads has its own unique needs, challenges, and potential, it is important to think about how the supporting infrastructure can converge, not only from a hardware perspective but also from a software one.

This journey to convergence is being driven from (at least) two angles at once.

The financial drive for convergence

IT decision makers and HPC leaders are under constant pressure to reduce the total cost of ownership (TCO) of their resources. The challenge they face is doing this while also maintaining the compute, fabric, memory/storage, and software capabilities their users need. On top of that, they have the added challenge of getting all of these elements to work together across different environments: on cloud, on premises, and especially on hybrid cloud.

The data drive for convergence

We all operate in an increasingly data-centric world, in which organizations must extract the maximum possible value from their available data. This means that more groups within the organization than ever before have an interest in where and how data is gathered, managed and used.

This has two results. First, HPC teams may find that their existing users want to do more with their data than before. For example, a team running genomics research may wish to introduce AI techniques like machine learning into its existing HPC workloads to enhance and accelerate its results. Second, the availability of cloud-based applications for AI, visualization, and so on means that entirely new user groups are emerging with demands for HPC capabilities to support their new initiatives.

Intel® Infrastructure for Convergence

To help all these stakeholders maximize the value of their data, it’s important to have an open, standards-based strategy that maintains the flexibility, scalability, and future-readiness users need today. At the same time, that strategy needs to be cost-effective. Time- and budget-constrained organizations don’t have the luxury of investing heavily in brand new infrastructure on a regular basis.

The good news is that it’s possible to get ready for the convergence of AI, HPC, and big data using products built on your existing Intel® architecture as the foundation. This has already been recognized in the industry, with the increasing adoption of Intel® Xeon® processor-based systems in the TOP500 list of supercomputers, demonstrating that the architecture is a strong choice for price, performance, and resiliency. And when it comes to AI workloads, the Intel® Xeon® Scalable processor family, combined with Intel-optimized AI frameworks, delivers powerful, highly parallel performance for machine learning (ML) and deep learning (DL) workloads.

Organizations interested in optimizing their HPC infrastructure for AI and other workloads should also consider Intel® Optane™ DC Persistent Memory, which is optimized for these new types of data-heavy applications and will be available in 2019. It combines the attributes of memory and storage with low latency, high endurance, outstanding quality of service, and high throughput. This creates a new data tier, one that is especially well suited to applications like advanced analytics, AI, and HPC.

Intel® Omni-Path Architecture (Intel® OPA) is designed to overcome the drawbacks of traditional standards-based HPC fabrics. It is designed to scale cost-effectively from entry-level HPC clusters to larger clusters with 10,000 nodes and up.

The journey to a fully converged HPC/AI/big data platform will be different for every organization, and will depend on the combination and fine balance of needs within your own organization. It won’t be achieved overnight, but happily it is a journey that you can start today with your current Intel® Xeon® processor-based HPC infrastructure. Start small, experiment, and scale when you’ve found a path that works for you.

To learn more about bringing AI into your existing HPC environment and scaling it up, read this eGuide.