Rethinking Supercomputer Performance and Efficiency for Exascale

The biggest challenge facing high performance technical computing is to deliver an Exaflop per second. What makes the problem challenging is not just achieving that scale of computing performance, but doing it within a “reasonable” power budget of 20 MW, as Kirk Skaugen recently announced.

It’s a performance goal that cannot be achieved without an efficiency breakthrough. But the problem is more than just efficiency. As one of the smart guys I work with at Intel is fond of pointing out to me (whenever I get too crazy about efficiency), his wrist-watch is extremely energy efficient - but also not useful for computing.

The story is about performance and efficiency. Neither is sufficient. Both are necessary.

So I asked myself, which systems are closest to achieving Exascale goals and how can a rank order be established?  Is there an easy way to look at how close we are to that goal?

A good place to start is the Top500, which ranks on performance, and the newer Green500, which ranks solely on efficiency.

The problem with those rankings is that they are mutually independent. For instance, the top 20 of the Green500 includes systems near the bottom of the performance heap, and the upper echelons of the Top500 include many inefficient systems.

For our purposes, the separation of the Top500 and Green500 does not provide insight into the Exascale goal.

So I spent a weekend messing around with several ways to look at the data. I thought I would share here the one approach that seemed fruitful.

In the graph below I’ve taken the efficiency and performance data from the most recent Green500 list and plotted performance against efficiency on a log scale.

Exascalar Graph.jpg

Please click the image for a larger view

To the graph I have added four things:

1. the publicly stated Exascale performance and efficiency goal,
2. an arrow indicating a scalar quantity gauging how “far” each point is from the Exascale goal logarithmically,
3. iso-power lines at 2 MW and 20 MW, and
4. the boundary of the “Top20” based on that scalar value.
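The iso-power lines follow from the fact that power is just performance divided by efficiency, so on a log-log plot of performance versus efficiency a fixed power budget traces a straight line. A quick sanity check of that relationship (a sketch only; the function name and unit choices are mine, picked so that PFlop/s divided by GFlops/W comes out directly in megawatts):

```python
def power_megawatts(perf_pflops, eff_gflops_per_watt):
    """Power = performance / efficiency.

    With performance in PFlop/s (1e15 Flop/s) and efficiency in
    GFlops/W (1e9 Flop/s per watt), the ratio lands in units of
    1e6 W, i.e. megawatts, with no extra conversion factor.
    """
    return perf_pflops / eff_gflops_per_watt

# The stated Exascale goal: 1000 PFlop/s at 50 GFlops/W is exactly 20 MW.
print(power_megawatts(1000.0, 50.0))  # 20.0

# A hypothetical 2 PFlop/s system at 1 GFlop/W sits on the 2 MW iso-power line.
print(power_megawatts(2.0, 1.0))  # 2.0
```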

One thing I like about this representation is that systems with either low performance or low efficiency are naturally excluded. To rank highly, a system needs both good performance and good efficiency.

Does this approach tell us anything new? One way to tell is to look at the Top10 based on this ranking. NOTE: this is not intended as a formal re-analysis of the data - this is a blog post exploring a concept only. The table below shows how the systems stack up.

Exascalar List.jpg

Please click the image for a larger view

It is clear that while the top of the “exascalar” ranking closely aligns with performance, efficiency does have a disruptive effect on the ordering. Systems with more balanced scores tend to move up the “exascalar” ranking (for instance the GSIC HP ProLiant system, which ranks 4th in efficiency and 5th in performance, moves up to 3rd in this scheme), whereas relatively inefficient systems, even with high performance, tend to move down.

Looking back at the data in the graph, it is certainly intriguing that some systems with relatively low performance are, in fact, nearly within the Top20 range for this “exascalar.” So efficiency leadership may count for something if it can ultimately scale in performance.

What I like about this approach is the easy interpretation of the scalar value: the number of orders of magnitude remaining to Exascale. It’s an efficiency and a performance problem at once. A value of three means a system is a factor of one thousand away from the goal of delivering 1 Exaflop in 20 MW.
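One way to make that interpretation concrete is to treat the scalar as a log-log distance to the goal point. This is a sketch under my own assumptions - I am taking the goal as 1000 PFlop/s at 50 GFlops/W (1 Exaflop in 20 MW) and the distance as Euclidean in log10 space, which matches the “orders of magnitude remaining” reading; the actual formula behind the graph may differ:

```python
import math

# Assumed Exascale goal point: 1 Exaflop/s within a 20 MW budget.
EXA_PERF_PFLOPS = 1000.0   # 1 Exaflop/s expressed in PFlop/s
EXA_EFF_GFLOPS_W = 50.0    # 1000 PFlop/s / 20 MW = 50 GFlops/W

def exascalar(perf_pflops, eff_gflops_per_watt):
    """Orders of magnitude remaining to the Exascale goal.

    Euclidean distance, in log10 space, between a system's
    (performance, efficiency) point and the goal point.
    """
    dp = math.log10(EXA_PERF_PFLOPS / perf_pflops)
    de = math.log10(EXA_EFF_GFLOPS_W / eff_gflops_per_watt)
    return math.sqrt(dp ** 2 + de ** 2)

# A system already at the goal has zero orders of magnitude to go.
print(exascalar(1000.0, 50.0))  # 0.0

# A hypothetical 1 PFlop/s system at 0.5 GFlops/W is 3 orders of
# magnitude short on performance and 2 on efficiency: sqrt(13) ~ 3.6.
print(round(exascalar(1.0, 0.5), 2))  # 3.61
```

Under this reading, a system can close the gap along either axis, which is exactly why balanced systems fare well in the ranking.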

It remains to be seen how the Exascale challenge will be won, of course. New announcements of plans to improve systems are coming out regularly.  Perhaps this approach, or one like it, which looks at both efficiency and performance in the nose-bleed range of supercomputing, will get us beyond looking at performance or efficiency separately and help us to understand which architectures, systems, and approaches are best closing the gap to the solution. And that ultimately will translate to a win for everyone.

As always, I’m interested in your thoughts and insights.

Is it helpful to understand how close systems are to Exascale levels of capability? Does the “exascalar” approach provide insight into that? Does it address the question, “which system is closest to achieving Exascale goals of performance and efficiency?” better than looking at performance and efficiency separately? How would you improve things? (For instance, one could plot power instead of efficiency, but when I looked at it, it seemed to provide less insight.) What alternative schemes might be proposed as ways to look at performance and energy efficiency in supercomputing, and what insights do they offer?

Comments are welcome.