Making Exascale Computing a Reality

Everyone loves a good race. After years of speculation and expectation, we are now on the cusp of the exascale era of computing. Leading national labs are accelerating their exascale deployment plans, with China pulling in its estimates to hit the exascale mark, and the United States accelerating its deployment forecast for an exascale-class system at Argonne National Laboratory to 2021 as part of the CORAL program. Recently, the US Department of Energy announced its intention to base this system on Intel® architecture, a mark of its significant confidence in the Intel roadmap and portfolio of technologies for exascale-class computing.

Due in part to our collaboration with Argonne, Intel is on track to be the foundation of the first exascale machines in the US, and we are in a unique position to understand both the opportunities and challenges that delivering exascale computing will pose.

Why Exascale?

Exascale computing is an unparalleled leap forward in computing power. But rather than thinking about it in terms of exponents (going from 10^15 to 10^18 floating point operations per second), it's more helpful to think about what this amount of processing power will enable: insights from massive data sets, or modeling of systems and phenomena too complex for current machines. For example, more than eight million measurements are taken from the biopsy of a tumor in a typical cancer study. This generates a flood of data that our current petascale supercomputers can't quickly digest. With exascale computing, cancer researchers will be able to simulate key protein interactions at the molecular level, develop models to predict how tumors will respond to specific drugs, and automate the analysis of millions of cancer records to identify optimal treatment strategies. Exascale analysis can, for example, provide information similar to what might be obtained from a tumor biopsy, accelerating via simulation an otherwise time-consuming and potentially physically painful process. Exascale computing power will enable new approaches to data analysis, delivering new therapies and improving the lives of countless people affected by cancer.

The convergence of artificial intelligence (AI), analytics, and traditional HPC workloads will, with exascale computing, allow us to address massively complex scientific challenges and drive economic and societal benefits in every industry. We will be able to more accurately simulate earthquakes so we can assess risk for infrastructure and populations, perform crop analysis to help make farming more efficient and better feed the world's growing population, and even explore the fundamental laws of nature. One additional benefit: we have seen time and again that technologies and programming techniques developed for the highest echelon of supercomputers have a trickle-down effect on data centers and client computing.

Considerations and Challenges for Exascale Computing

Today’s fastest systems deliver sustained performance of approximately 90 petaflops. With exascale, we’ll need to deliver more than 10x that. Why not just throw 50,000 more nodes at a current supercomputer?
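
The scaling gap described above is easy to sanity-check. The sketch below assumes "exascale" means 10^18 sustained floating point operations per second and takes the roughly 90-petaflop sustained figure from the text:

```python
# Sanity-check of the gap between today's sustained performance and exascale.
# Assumptions: "exascale" = 1e18 sustained FLOPS; today's leaders ~90 PF sustained.
PETA = 1e15
EXA = 1e18

sustained_today_flops = 90 * PETA   # ~90 petaflops sustained
exascale_flops = 1 * EXA            # 1 exaflop

factor = exascale_flops / sustained_today_flops
print(f"Required speedup: {factor:.1f}x")  # ~11.1x, i.e. "more than 10x"
```

The ratio works out to roughly 11x, which is why simply adding nodes to an existing design is not a realistic path.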

There are a number of reasons why node scaling alone won’t get us to exascale:

  • Power efficiency: This is potentially the largest concern for exascale systems. Scaling current systems to exascale would put power draw somewhere in the 200 MW range, roughly the equivalent of a moderately sized natural gas power plant, at a cost of around $200 million annually to run. Our current exascale target is in the 20–40 MW range, to be reached through both hardware and software improvements.
  • Physical footprint: How physically big is the system and how heavy is it? If it’s water cooled, how much water will it require? These are questions that must be addressed to even begin narrowing down a location to host the system.
  • Interconnects: We need to be able to send data through millions of components in an exascale system. The amount of data moving around inside a single exascale system will swamp the amount of data moving over the worldwide internet. How do we get data where it needs to be without incurring massive power, cost, and latency penalties as it moves from processor to memory to storage?
  • Memory: As we focus more on application performance (vs. peak benchmark performance), the memory hierarchy becomes critical. We’ll also start to see emerging new classes of memory such as High Bandwidth Memory* and Intel® persistent memory. These can provide exceptional performance and cost advantages, but no single memory technology will cover all usages. These new opportunities in memory will allow us to address the “memory wall” that has been anticipated for a decade.
  • Processing power: We need a lot more of it. We are close to continuing the trend of 1000x performance improvement at the system level every decade. These improvements come from ever-increasing levels of parallelism in our processors. Fortunately, the concurrency in large-scale applications is extremely high. Exploiting this concurrency in a way that is accessible to the user is the key driving force for new hardware and software architectures.
  • Reliability: With the increase in processing elements comes the immediate challenge of maintaining reliable and highly available systems. Exascale systems will be based on decades of advances in hardware and software techniques that have allowed for current systems with millions of processor cores to operate reliably. New techniques that range from circuits to software will be employed to allow us to continue to deliver highly usable systems.
  • Programming (and scaling): Exascale programming will need to work with legacy systems so existing users see immediate benefits without learning a new programming language. Further, developers will have a head start when programming atop the well-known, well-supported, open-standard software foundation enabled by Intel architecture, a major advantage over proprietary architectures. There will also be opportunities to use new programming models, co-designed with the new hardware technologies, that open up new dimensions of capability and performance.
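
The power-efficiency figures in the bullets above can be reproduced with back-of-envelope arithmetic. The electricity price below is a hypothetical assumption (not stated in the post) chosen to roughly match the "$200 million annually" figure:

```python
# Back-of-envelope check of the power-efficiency bullet above.
# Assumption: an electricity price of ~$0.11/kWh (hypothetical, not from
# the post) approximately reproduces the quoted ~$200M annual cost.
HOURS_PER_YEAR = 24 * 365          # 8,760 hours
PRICE_PER_KWH = 0.11               # USD per kWh, assumed

def annual_cost_usd(power_mw: float) -> float:
    """Annual electricity cost for a system drawing power_mw megawatts 24/7."""
    kwh_per_year = power_mw * 1_000 * HOURS_PER_YEAR
    return kwh_per_year * PRICE_PER_KWH

print(f"200 MW system:   ${annual_cost_usd(200)/1e6:.0f}M per year")   # ~$193M
print(f"20-40 MW target: ${annual_cost_usd(20)/1e6:.0f}M-"
      f"${annual_cost_usd(40)/1e6:.0f}M per year")
```

Hitting the 20–40 MW target would cut the annual power bill by roughly an order of magnitude, which is why the power budget, not raw node count, dominates exascale system design.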

We won’t reach exascale by solving one problem – we must address a host of inter-related challenges. We need to look at exascale holistically from a complete solution standpoint.

With exciting new technologies emerging in the exascale timeframe, we need to use these building blocks in the optimal way that accommodates their individual constraints while leading to an overall system optimization point. It is also true that workloads and users will vary in which aspects of the system are most important. Future architectures will allow unprecedented degrees of configurability to allow for an additional dimension of optimization.

An Architectural Foundation for Today and in the Future

Intel is committed to providing a holistic foundation for exascale systems that will draw on the best of our technology across compute, storage, interconnects, and software. But, simply solving the technical challenges isn’t enough. Fully realizing the promise of exascale will require contributions from the entire HPC community. Today, Intel has the most robust vendor and application ecosystem. We must make sure that tomorrow’s exascale systems enjoy the same ease of access, breadth of application support, and freedom of choice available today, so that innovators of all stripes will be able to best harness the power of exascale to create the next wave of breakthrough discoveries that will enrich all of our lives.

You can learn more about Intel’s HPC portfolio at




Al Gara

About Al Gara

Dr. Al Gara is an Intel Fellow and Chief Architect of E&G Advanced Development, part of the Data Center Group at Intel Corporation. In this capacity, he leads a team of Intel architects in system pathfinding for future Intel® Xeon Phi™ processor compute directions. He also leads the Intel team responsible for delivering the CORAL system. Prior to joining Intel in 2011, Dr. Gara was an IBM Fellow and Chief Architect for three generations of the Blue Gene platform, which was awarded the National Medal of Technology and Innovation in 2008. Al has been the Chief Architect of more than a third of the Top 10 systems over the last 10 years as measured by the Top500 list. He has received two Gordon Bell Prizes (1998 and 2006) and the Seymour Cray Award in 2010. He has over 70 publications in computer science and physics and more than 130 US patents in the area of computer design and architecture. Gara received his Ph.D. in physics from the University of Wisconsin–Madison in 1987 for his work calculating meson mass spectra using a relativistic Bethe–Salpeter approach.