Measuring Data Center Efficiency: A Maturity Model for SUE

One of my more popular blog posts earlier this year was about “The Elephant in your Data Center” -- inefficient servers. As I explained, older, inefficient, under-performing servers rob energy and contribute very little to the “information-work” done by a data center.

Almost everyone already knows that, of course. The contribution of the blog was to take a potentially complex idea (relative server performance) and build a simple way to assess it.

The blog proposes a metric called SUE (Server Utilization Effectiveness). We built the idea on practical experience, with lots of input from our Intel IT and DCSG experts. The notion is very similar to Emerson’s CUPS metric, with the added twist of normalizing so that SUE = 1.0 is ideal and larger numbers are worse (consistent with the way PUE is defined, for better or worse!). Mike Patterson and I discussed some of the benefits of the SUE approach in a recent Chip Chat podcast on data center and server efficiency with Allyson Klein.

The overarching message is that SUE complements PUE in the sense that PUE looks at the building infrastructure efficiency, and SUE looks at the IT equipment efficiency in the data center.

The proposal for SUE was primarily oriented around usability. We wanted a way to go into a data center and make an assessment quickly and at low cost. So, we focused on a simple age-based metric for relative performance. The simplification got a lot of comments, and one was, “what if I want more precision?” The good news is there are answers out there for you. I have summarized the results of that discussion below:

[Figure: SUE Maturity Model]

I chatted with Jon Haas here at Intel about this problem. Jon leads the Green Grid’s Technical Committee, where he and industry partners are collaborating to run experiments on more accurate Productivity Proxies for server work output. Of course, running a proxy on your server configuration might take longer than a few days and would occupy some precious engineering resources. But given the high operating and capital costs of a data center, the accuracy benefit in many cases will make solid business sense.

There are other ways to measure server and data center performance. A common way to estimate server performance and efficiency is to look up published benchmark scores. Depending on the server model, configuration, and workload type of interest, these table look-ups can be accurate without consuming a lot of time and resources.

And finally, many advanced internet companies instrument their applications directly to monitor performance. This represents the highest investment level, but produces the highest accuracy.

In all cases, the normalization of the actual server performance to the performance of state-of-the-art servers will produce numbers that can be correlated to SUE in the manner discussed in my previous blog and podcast.
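To make the normalization step concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the definition from my previous blog: I assume SUE can be written as the number of servers deployed divided by the number of state-of-the-art servers that would do the same work, which gives 1.0 for an ideal fleet and larger numbers for worse ones. The function names and the two-year performance-doubling period used for the age-based estimate are placeholders chosen for this example.

```python
from typing import Sequence


def relative_performance_from_age(age_years: float,
                                  doubling_period_years: float = 2.0) -> float:
    """Age-based estimate of performance relative to a state-of-the-art server.

    ASSUMPTION: server performance doubles every `doubling_period_years`.
    The default of 2.0 is a placeholder, not a measured value.
    """
    if age_years < 0 or doubling_period_years <= 0:
        raise ValueError("age must be >= 0 and doubling period must be > 0")
    return 0.5 ** (age_years / doubling_period_years)


def sue(relative_performances: Sequence[float]) -> float:
    """Fleet-level SUE from per-server relative performance values.

    Each value is (actual server performance) / (state-of-the-art performance).
    The values can come from any level of the maturity model: an age-based
    estimate, a benchmark look-up, a productivity proxy, or application
    instrumentation.

    ASSUMED form: servers deployed / equivalent state-of-the-art servers.
    """
    if not relative_performances:
        raise ValueError("need at least one server")
    if any(p <= 0 for p in relative_performances):
        raise ValueError("relative performance must be positive")
    return len(relative_performances) / sum(relative_performances)


if __name__ == "__main__":
    # Hypothetical fleet: one new server, one 2 years old, one 4 years old.
    ages = [0.0, 2.0, 4.0]
    fleet = [relative_performance_from_age(a) for a in ages]
    print(f"SUE = {sue(fleet):.2f}")
```

With these assumptions, the hypothetical three-server fleet has relative performances of 1.0, 0.5, and 0.25, so SUE = 3 / 1.75, or about 1.71. Moving up the maturity model only changes where the relative performance numbers come from; the normalization itself stays the same.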

The good news is that you can find out more about progress on the proxy front, and on other topics, at the upcoming Green Grid Forum in San Jose this coming March.

As always, I welcome your comments. The idea, as originally proposed, was closer to conceptual than realizable. Yet, taking into account a maturity model, I think it starts to have legs as something which can be standardized. What do you think?