This post originally appeared in Information Management on December 26, 2012
Politics aside, the subject of energy is of great concern in every large data center. Why, then, is power consumption still an afterthought for most server deployments? Because IT and facilities teams typically work independently and neither team can control consumption or predict requirements when data center energy costs are buried in the overall utility bill.
Let’s face it: Energy costs are spiking, server sprawl is pushing against site capacity limits, and rising Internet traffic and smart-device adoption are driving aggressive increases in data center compute densities. Industry analyst firms agree that power and associated cooling requirements are the fastest-growing components of operational costs. To protect the bottom line, and to comply with the latest EPA Energy Star standards, data centers need to change the way they monitor and manage energy consumption for power-hungry assets such as servers. New power and cooling management approaches are available that offer greater energy efficiency and reduced costs.
Traditional approaches to managing power and cooling have failed to control costs, in large part because they typically force over-budgeting to ensure that priority needs are met. Ironically, even with overestimated requirements and over-provisioned cooling, data center hot spots continue to crop up, impacting server availability, reducing data center cooling efficiency and driving up operational costs. These factors and their impact demand that facilities and IT professionals find a better way to achieve their common objectives.
Zooming in on the Right Measurement Points
One of the most fundamental barriers to achieving greater power efficiency and curbing runaway energy spending has been the inability to obtain accurate readings of actual server energy consumption. Various models have been developed that translate temperature and power consumption into overall data center energy requirements for servers and their associated cooling systems. However, even the best of these models lack the real-time visibility required to accurately understand and predict energy trends. Actual usage can vary significantly (up to 40 percent) from modeled predictions, and the models do not provide the immediate feedback required to pinpoint hot spots before they impact services or to identify areas of waste where conservation can lead to savings.
Energy models are limited in terms of day-to-day management of power consumption. For example, we know from in-field measurements that an average of 15 percent of data center servers are “ghost” or “zombie” servers (servers that are not producing useful work, drawing energy just to stand idle). When we do the math, assuming that a server draws approximately 400 watts of power, which currently costs about $800 per year, companies are spending on average more than $24 billion per year for these “ghost” servers in their data centers.
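The math above can be sketched as a simple cost model. The 15 percent ghost fraction and the roughly $800-per-year energy cost per 400-watt server come from the figures in this article; the fleet size in the example is a hypothetical input.

```python
# Back-of-the-envelope ghost-server cost model using the article's figures.
# ghost_fraction and annual_cost_per_server reflect the estimates above;
# the 10,000-server fleet is an illustrative assumption.

def ghost_server_cost(total_servers, ghost_fraction=0.15,
                      annual_cost_per_server=800.0):
    """Annual spend on idle 'ghost' servers, in dollars."""
    return total_servers * ghost_fraction * annual_cost_per_server

print(ghost_server_cost(10_000))  # → 1200000.0 (1,500 ghosts x $800/year)
```

Scaled to the installed base of servers worldwide, the same per-server figures yield the multi-billion-dollar aggregate cited above.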
Aggregating Server Thermal and Power Data
A second problem has been the technical challenge of aggregating data from varied and disparate systems within the data center. Facilities managers have been forced to cobble together, manually or with crude homegrown systems, vital data such as server power draw, inlet and outlet temperatures, asset information contained in RFID tags, and temperature and humidity readings from the air conditioning units. This piecemeal approach prevents a “big picture” perspective of server inlet temperatures and power consumption across rack servers, blade servers, and the power-distribution units and uninterruptible power supplies that feed them. The crippling effects of this fragmented view are analogous to a long-distance truck driver suffering from tunnel vision.
By shifting attention from the cooling systems to the servers, which account for the majority of the power consumed in the data center, managers can introduce a holistic energy optimization solution. Accurate monitoring of power consumption and thermal patterns creates a foundation for enterprise-wide decision-making with the ability to:
- Monitor and analyze power data by server, rack, row or room;
- Track usage for logical groups of resources that correlate to the organization or data center services;
- Automate condition alerts and triggered power controls, based on consumption or thermal conditions and limits; and
- Provide aggregated and fine-grained data to Web-accessible consoles and dashboards for intuitive views of energy use that are integrated with other data center and facilities management views.
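The first and third capabilities above can be illustrated with a minimal sketch. The reading layout, the sample values and the 27°C inlet limit are illustrative assumptions, not any vendor’s API or thresholds.

```python
# Minimal sketch of rolling server readings up by rack/row/room and
# flagging thermal alerts. Data layout and limits are illustrative only.
from collections import defaultdict

readings = [
    # (room, row, rack, server, watts, inlet_temp_C)
    ("R1", "A", "A1", "srv-01", 310, 24.5),
    ("R1", "A", "A1", "srv-02", 405, 27.8),
    ("R1", "A", "A2", "srv-03", 280, 23.1),
]

def aggregate(readings, level):
    """Sum power draw by server, rack, row or room."""
    idx = {"room": 0, "row": 1, "rack": 2, "server": 3}[level]
    totals = defaultdict(float)
    for r in readings:
        totals[r[idx]] += r[4]
    return dict(totals)

def thermal_alerts(readings, inlet_limit_c=27.0):
    """Flag servers whose inlet temperature exceeds the limit."""
    return [r[3] for r in readings if r[5] > inlet_limit_c]

print(aggregate(readings, "rack"))  # → {'A1': 715.0, 'A2': 280.0}
print(thermal_alerts(readings))     # → ['srv-02']
```

In a real deployment the readings would stream from the management layer rather than a static list, and the alerts would trigger the power controls described above.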
Identifying temperatures at the server, versus at the room or even rack levels, can also help data center managers more accurately understand what the real ambient temperature should be for individual servers to have optimal life spans. This assessment of real temperatures has enabled data centers to increase the overall room temperature by one to two degrees, which can create significant savings in the air-conditioning expense.
Disseminating the power and cooling data without impacting ongoing processing in the data center is another challenge. Invasive monitoring approaches can adversely affect the performance of existing systems. Agentless monitoring capabilities, by contrast, have little impact on overall system performance and are therefore virtually undetectable to end users.
Where should all of this energy monitoring and aggregation functionality be placed within the data center? Ideally, all of this would take place transparently and non-invasively, to avoid impacting the servers and end users. Agentless approaches, without the need for any software on the managed nodes, are available. Data center managers should also look for solutions that are easily integrated, such as those based on Web Services Description Language APIs, and able to coexist with other applications on the designated host server or virtual machine.
Where Power is Going
Today, the goal is improved efficiency and reduced costs, but energy management will become even more critical in the future, as compute models continue to tax power infrastructures. Whatever the goal, the monitoring and aggregation of server energy metrics set the stage for much more comprehensive energy management and a far deeper and richer set of usage models for IT assets. Besides enabling accurate power planning and forecasting, logging and trending power data provides knowledge for data center “right-sizing” and accurate equipment scheduling to meet workload demands.
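The right-sizing idea above can be made concrete: logged power data lets a manager budget from measured peak draw rather than nameplate ratings. The sample log, the nameplate figure and the 20 percent headroom margin below are illustrative assumptions.

```python
# Sketch of right-sizing from logged power data: compare measured peak
# draw (plus a safety margin) against nameplate-based provisioning.
# The log, nameplate rating and headroom margin are illustrative.

def provisioning_gap(logged_watts, nameplate_watts, headroom=0.20):
    """Watts reclaimable by budgeting from measured peak instead of
    the nameplate rating."""
    measured_peak = max(logged_watts)
    budget = measured_peak * (1 + headroom)
    return nameplate_watts - budget

hourly_log = [260, 310, 295, 340, 330, 280]  # one server, watts
print(provisioning_gap(hourly_log, nameplate_watts=500))  # about 92 W reclaimable
```

Multiplied across a rack or a room, gaps like this are what allow more equipment to be scheduled into the same power envelope.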
The thermal data can also be used for more efficient designs of integrated facilities systems, such as cooling and air-flow solutions. Optimized resource balancing in the data center will always be closely tied to power; the expanded insight offered by intelligent energy management approaches will contribute to cost-saving decisions for years to come.
Jeff Klaus is the director of Intel Data Center Manager (DCM). Jeff leads a global team that designs, builds, sells, and supports Intel® DCM.