Datacenter Dynamic Power Management – Intelligent Power Management on Intel Xeon® 5500
The newly released Intel® Xeon® 5500 processor family comes with a new breed of datacenter power management technology - Intel® Intelligent Power Node Manager (Node Manager for short).
As a former datacenter engineering manager, I have personal experience with the management issues at datacenters, especially power allocation and cooling – we often assumed the worst-case scenario because we could not predict when server power consumption would peak. When it did peak, we had no way to control it. It is like driving blindfolded and hoping for the best. The safest bet was to make the road as wide as possible - leave enough headroom in the power budget so that we would not run into power issues. But that resulted in under-utilized, or stranded, power - quite a waste.
Over the last several years, we met with many IPDC (internet portal datacenter) companies. We heard over and over again about their datacenter power management challenges, which were even worse than what I had experienced. Many of the IPDC companies we talked with leased racks from datacenter service providers under strict per-rack power limits. The number of servers they could fit per rack had a direct impact on their bottom line. They did not want to under-populate the racks, since they would pay more rent for the same number of servers; they could not over-populate the racks, since that would exceed the power limits. Their power management issues can best be summarized as follows:
· Over-allocation of power: Power allocated to servers does not match actual server power consumption. Power is typically allocated for the worst-case scenario based on the server nameplate. Static allocation of the power budget based on the worst case leads to inefficiencies and does not maximize use of available power capacity and rack space.
· Under-population of rack space: As a direct result of the over-allocation problem, there is a lot of empty space in racks. When the business needs more compute capacity, they have to pay for additional racks. Often there is not enough datacenter space for them to rent locally; as a result, they have had to expand to other cities or even other countries, increasing operational cost and support staff.
· No capacity planning: There is no effective means to forecast and optimize power and performance dynamically at the rack level. To improve power utilization, datacenters need to track actual power and cooling consumption and dynamically adjust workload and power distribution for optimal performance at the rack and datacenter levels.
This is where Node Manager comes into play. Let's take a look at what Node Manager and its companion software tool for rack and group level power management - Intel® Data Center Manager (DCM) - can do:
Intel® Intelligent Power Node Manager (Node Manager)
Node Manager is an out-of-band (OOB) power management policy engine embedded in Intel server chipsets. The processors can regulate their power consumption through manipulation of P- and T-states. Node Manager works with the BIOS and OS power management (OSPM) to perform this manipulation and dynamically adjust platform power, maximizing performance within a power target for a single node. Node Manager has the following features:
· Dynamic Power Monitoring: Measures actual power consumption of a server platform within an acceptable error margin of +/- 10%. Node Manager gathers information from PSMI-instrumented power supplies, provides real-time power consumption data as single readings or as a time series, and reports it through the IPMI interface.
· Platform Power Capping: Sets platform power to a targeted power budget while maintaining maximum performance for the given power level. Node Manager receives a power policy from an external management console through the IPMI interface and maintains power at the targeted level by dynamically adjusting CPU P-states.
· Power Threshold Alerting: Node Manager monitors platform power against the targeted power budget. When the target power budget cannot be maintained, Node Manager sends alerts to the management console.
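The capping behavior described above can be pictured as a simple feedback loop: measure platform power, compare against the budget, and step the CPU P-state up or down. The sketch below is a hypothetical simulation of that idea, not the actual firmware logic; the P-state power figures are made-up numbers for illustration only.

```python
# Hypothetical Node Manager-style capping loop (illustration only).
# P-states from fastest/highest power to slowest/lowest power;
# the wattage values are invented for this example.
PSTATE_POWER_W = [230, 210, 190, 170, 150]

def select_pstate(measured_power_w, budget_w, current_pstate):
    """Step toward the fastest P-state whose power still fits the budget."""
    if measured_power_w > budget_w and current_pstate < len(PSTATE_POWER_W) - 1:
        return current_pstate + 1          # over budget: throttle down one step
    if measured_power_w < budget_w and current_pstate > 0:
        # Only step back up if the faster state would still fit the budget.
        if PSTATE_POWER_W[current_pstate - 1] <= budget_w:
            return current_pstate - 1
    return current_pstate                  # at budget: hold steady
```

For example, a server drawing 230W against a 170W budget would be stepped down one P-state per iteration until the measured power settles under the target, which is the essence of "maximum performance for the given power level."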
Intel® Data Center Manager (DCM)
DCM is a software technology that provides power and thermal monitoring and management for servers, racks, and groups of servers in datacenters. It builds on Node Manager and customers' existing management consoles to bring platform power efficiency to end users. DCM implements group-level policies that aggregate node data across an entire rack or datacenter to track metrics and historical data and to provide alerts to IT managers. This allows IT managers to establish group-level power policies that limit consumption while DCM dynamically allocates power among the servers, letting datacenters increase rack density, manage power peaks, and right-size the power and cooling infrastructure. DCM is a software development kit (SDK) designed to plug into software management console products. It also has a reference user interface, which was used in this POC as a proxy for a management software product. Key DCM features are:
· Group (server, rack, row, PDU and logical group) level monitoring and aggregation of power and thermals
· Logging and querying of trend data for up to one year
· Policy driven intelligent group power capping
· User defined group level power alerts and notifications
· Support of distributed architectures (across multiple racks)
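To make group-level capping concrete, here is one simple way a rack budget could be split across servers: give every server a floor it needs to stay responsive, then share the remaining budget in proportion to demand. This is an illustrative policy of my own, not DCM's actual algorithm, and the 120W floor is an assumed number.

```python
def allocate_rack_budget(demands_w, rack_budget_w, min_cap_w=120):
    """Split a rack-level power budget across servers in proportion to
    demand, never capping a server below a responsiveness floor.
    Illustrative policy only; real DCM policies are more sophisticated,
    and min_cap_w=120 is an assumed value."""
    if sum(demands_w) <= rack_budget_w:
        return list(demands_w)               # budget covers every demand
    spare = rack_budget_w - min_cap_w * len(demands_w)
    extra = [max(d - min_cap_w, 0) for d in demands_w]
    total_extra = sum(extra)
    if total_extra == 0:                     # nobody asks above the floor
        return [min_cap_w] * len(demands_w)
    # Share the spare budget proportionally to demand above the floor.
    return [min_cap_w + spare * e / total_extra for e in extra]
```

With three servers demanding 230W, 230W, and 180W under a 500W budget, this yields caps of 175W, 175W, and 150W: the total stays at the budget while busier servers keep more headroom, which is the spirit of dynamic group-level capping.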
What will the combination of DCM and Node Manager do for datacenter power management? Here is the magic part… With DCM setting policies at the group and rack level, Node Manager can dynamically report the power consumed by a server and adjust it within a certain range, so that the overall power consumption of a rack or a particular server group can be managed within a given target. Why is this important? Let me use a real example to explain:
IPDC Company XYZ (a name I cannot disclose in public) runs a mission-critical workload at their datacenter 24x7, with workload fluctuations during the day. CPU utilization is mostly at 50~60%, with occasional jumps to 100% - typical for datacenter operations. To be on the safe side, their current practice is to pre-qualify the Xeon® 5400 server for the worst case at 100% CPU utilization, where it ran at ~300W. They used 300W for power allocation, which was already significantly lower than the nameplate value of the power supply (650W).
With Xeon® 5500, for the same workload at 100% throughput, the platform power consumption goes down to 230W, a 70W reduction from the previous-generation CPU - a good reason to switch to the new platform, thanks to the advanced intelligent power optimization features of Xeon® 5500. But the story does not end there…
On top of that, we further analyzed the effect of power capping using Node Manager and DCM. After many tests, we noticed that if we cap power at 170W, the performance impact for workloads at 60% CPU utilization and below is almost negligible. This means that with a 170W power cap, the platform can deliver the same level of service most of the time, with 50W less (230W-170W) power consumption. For occasional spikes above 60% CPU utilization, there will be some performance impact. However, since Company XYZ operates below 60% CPU utilization most of the time, the performance impact is tolerable. As a result, we can squeeze more out of the power allocation using the dynamic power management features of Node Manager and DCM.
What does this mean for Company XYZ? Well, we can do the math. The racks they lease today have a limit of 2,200W per rack. With the current Xeon® 5400 servers, they can put up to 7 servers per rack at 300W per server. With Xeon® 5500, they can safely put in 9 servers at 230W per server - a 28% increase in server density per rack. On top of that, by using Node Manager and DCM to manage power at the rack level against the 2,200W limit and dynamically adjust the power allocation among the servers, we can put in at least 12 servers at an average power allocation of 170W per server - a 71% increase in server density compared with the situation today! This means great savings for Company XYZ. In this case, the power consumption of an individual server on the rack may go above or below 170W; DCM dynamically adjusts the per-server capping policies while holding total rack power consumption below 2,200W.
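The rack-density arithmetic above can be checked directly; all the numbers (2,200W rack limit, 300W/230W/170W per-server allocations) come from the example in the text:

```python
# Rack-density math from the Company XYZ example.
def servers_per_rack(rack_limit_w, per_server_w):
    """How many servers fit under the rack power limit."""
    return rack_limit_w // per_server_w

baseline = servers_per_rack(2200, 300)   # Xeon 5400 at worst case -> 7
upgraded = servers_per_rack(2200, 230)   # Xeon 5500, same workload -> 9
capped = servers_per_rack(2200, 170)     # Xeon 5500 with 170W cap  -> 12

upgrade_gain = (upgraded - baseline) * 100 // baseline   # 28 (percent)
capped_gain = (capped - baseline) * 100 // baseline      # 71 (percent)
```

Truncating division matches the article's figures: 7, 9, and 12 servers, for density gains of 28% and 71% over the Xeon 5400 baseline.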
Of course, power management results vary from workload to workload; workload-based optimization is needed to achieve the best result. Also, we assume the datacenter can provide sufficient cooling for devices that consume power within the given limit. Even though the results of this test cannot be applied universally to all IPDC customers, we finally have a platform that can dynamically and intelligently monitor and adjust platform power based on workload. As a datacenter manager, you can manage power at the rack and datacenter level with optimized power allocation to fully utilize the datacenter's power. Are you ready to give it a try?