Innovating the Thermal Management of High Density Servers

It's great server weather here in New Mexico today. The current temperature is 74°F, the high will be ~82°F, the low was 57°F, and the humidity will be ~30% all day. These are all "in range" for the air a densely packed server needs to breathe. Depending on how many servers are packed into a rack, they can heat this 74°F air by as much as 50°F. Removing that heat from the server exhaust air takes energy and expensive equipment. A key system trade-off in data center design is that density reduces the unit cost of deploying and operating servers for most inputs but increases the cost of dealing with high-density heat. If we can find ways to address this heat with less energy and cost, then we can be more economical and, in today's words, more green.

It was on a day like today, a while back, that Tom Greenbaum, a fellow Intel engineer, and I were brainstorming how to get outside air directly to the servers. Because of Tom's experience in custom air handlers, we were focusing on economizers that are normally used for buildings. Economizers are commonly designed into office buildings, homes and, in a simple fashion, most cars, but they are rarely thought of for data centers. The key concept is that you have access to two air supplies: the outside air and the exhaust air. To be most economical, you want to continually decide "what is the best air to use next?" This takes measuring, deciding and switching. In your car, you are both the measurement device and the decider as you press the re-circ button or the flow-through button on the dash. In a high-end building air conditioner, this takes an outside weather monitor, an inside weather monitor, a simple controller and some extra duct work.
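To make the "measure, decide, switch" loop concrete, here is a minimal Python sketch of the deciding step. It is not the controller we used. The names (AirReading, conditioning_needed, best_air) are made up for illustration, the 90°F upper limit is an assumed value rather than a measured requirement, and it looks only at temperature, ignoring humidity.

```python
from dataclasses import dataclass


@dataclass
class AirReading:
    """One measured air supply (hypothetical structure for this sketch)."""
    name: str
    temp_f: float


def conditioning_needed(reading: AirReading, min_f: float, max_f: float) -> float:
    """Degrees F of heating or cooling needed to bring this air into range."""
    if reading.temp_f < min_f:
        return min_f - reading.temp_f
    if reading.temp_f > max_f:
        return reading.temp_f - max_f
    return 0.0


def best_air(outside: AirReading, exhaust: AirReading,
             min_f: float = 55.0, max_f: float = 90.0) -> AirReading:
    """Pick the supply that needs the least conditioning.

    max_f = 90.0 is an assumed limit; on a tie, outside air wins
    because it is listed first.
    """
    return min((outside, exhaust),
               key=lambda r: conditioning_needed(r, min_f, max_f))


if __name__ == "__main__":
    # A day like today: 74F outside, exhaust heated by about 50F.
    choice = best_air(AirReading("outside", 74.0), AirReading("exhaust", 124.0))
    print(choice.name)
```

On a day like today this picks the outside air, which is the flow-through button. On a freezing morning the warm exhaust can need less conditioning than the outside air, which is the re-circ button.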

In our experiment we used a vane controller in a looped air duct that could blend outside air with the exhaust air to get us the best air for the servers to inhale at the lowest cost. The accuracy and controllability of our monitors and controllers were not as good as we wished, but they did the job. The perfect controller would measure temperature, enthalpy and dew point for both the outside air and the exhaust air. It would have policies that you could modify for best economy based on the requirements of your equipment. For example, in winter it should be able to blend exhaust heat into the outside air to hold the minimum temperature at, say, 55°F and no higher, and in summer it should be able to start incremental cooling loads as the temperature of the cooler of the two supplies rises above the maximum allowed by your equipment. It would decide to use some exhaust air when the dew point was higher than the outside temperature, to control air humidity. You get the point: it would condition the air using all available inputs. Our PoC used DX-based cooling units, which are usually considered less economical than water-based cooling. But in this mode of operation, they worked well and reduced complexity. In addition, they used no water, which is a plus in many desert locations. You can imagine evaporative systems in similar designs that could replace the DX units or work with the DX units for even more "economi-zation".
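As a rough illustration of the winter and summer policies described above, here is a small Python sketch. It is not the logic of our PoC controller. It assumes the two air streams mix linearly by temperature (no humidity or density effects), the function names are hypothetical, and the 90°F maximum in the usage lines is an assumed equipment limit.

```python
def exhaust_fraction(outside_f: float, exhaust_f: float,
                     min_f: float = 55.0) -> float:
    """Fraction of exhaust air to blend in so the mix just reaches min_f.

    Assumes simple linear mixing:
        mixed = f * exhaust_f + (1 - f) * outside_f
    """
    if outside_f >= min_f:
        return 0.0   # outside air is already warm enough
    if exhaust_f <= outside_f:
        return 0.0   # exhaust cannot warm the mix
    if exhaust_f <= min_f:
        return 1.0   # even full recirculation falls short of min_f
    return (min_f - outside_f) / (exhaust_f - outside_f)


def cooling_needed_f(outside_f: float, exhaust_f: float, max_f: float) -> float:
    """Degrees F the cooler of the two supplies sits above the allowed max."""
    coolest = min(outside_f, exhaust_f)
    return max(0.0, coolest - max_f)


if __name__ == "__main__":
    # Winter: 30F outside, 80F exhaust, hold a 55F minimum.
    f = exhaust_fraction(30.0, 80.0)
    mixed = f * 80.0 + (1.0 - f) * 30.0
    print(f, mixed)

    # Summer: 100F outside, 120F exhaust, assumed 90F maximum.
    print(cooling_needed_f(100.0, 120.0, max_f=90.0))
```

In the winter case, half exhaust and half outside air lands the mix on the 55°F floor "and no higher". In the summer case, the cooling units only have to make up the 10°F gap between the outside air and the assumed limit. A real controller would also weigh enthalpy and dew point, which this sketch leaves out.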

In the video you will see dust on the back of the servers. We had filtration on the air intakes and a control system that can indicate when the filter needs replacing, but we also had a door system that let in dust near the intakes on really windy days. It was a design flaw in our temporary room. Once the dust got into the system, we decided to let it run to see where it went. It collected in the exhaust areas, but it created very little risk there because most of the time we exhausted the air, something that would not have been true in a closed-loop system.

Don Atwood was able to negotiate for and create the production-capable configuration, with enough production servers packed densely enough to run a proof of concept (PoC). These servers run high-volume batch computing and are nearly always running above 90% utilization, perfect heaters for the job. It is important to note that this PoC depended on the concept of a "Compute Center". A Compute Center is the idea that high-density servers can be isolated in their own air space a very short distance from the storage. The storage is left in the classical, and perhaps now more aptly named, data center. Where this concept can be used, it can help free up traditional closed-loop environmental control for storage systems. If anyone knows of a great economizer controller, please let us know. An Atom-based design would be a plus.

The temporary "compute center" we established and operated would not have been successful without the help of several great engineers contributing insight and innovation. Tom Greenbaum, Marvin Bailey, Steven Bornfield, Natasha Bothe, Greg Botts, Demetruis Ferguson, Ryan Henderson, Dan Links and Don Wright all contributed to learning what is possible during our PoC.

Well, it's 79°F outside now and still great server weather. There are hot air balloons in the sky this morning as we get ready for the Balloon Fiesta here in early October. Hot air balloons, like racks of servers, love this air because they can inhale cool air and then heat it to create work that moves people. Finally, hot air balloons spend very little energy exhausting the resulting heat out the top. It's a simple model really; it just takes a smart controller.

Video:

Paper: