The Path to Next-Gen Hyperscale Data Centers: Intel Rack Scale Architecture

In their efforts to adapt to the demands of the digital economy, the Internet of Things, and other disruptive changes, data centers face major technical challenges in flexibility and scale, largely because of their traditionally rigid architectures.

Today’s hardware infrastructure for data centers typically comes as preconfigured 1U or 2U servers with their own processors, memory, I/O, and network interface controllers (NICs). To upgrade or expand this infrastructure, a complete system must be built, integrated into the rack, and connected via management and virtual pooling. That system essentially operates as a single unit of compute: its internal CPU, memory, and dedicated storage are accessed solely by that server, locking down resources that are not always fully utilized.

To complicate matters, conventional server architecture generally follows a vertical deployment model, with many different hardware and software models to manage. So how can you overcome rigid, expensive, time-consuming data center build-outs that can’t keep pace with today’s digital demands? The answer is already here, in the form of disaggregation of the data center rack.

With this new approach to the rack, a logical architecture disaggregates and pools compute, storage, and network resources, providing a means to create a shared, automated rack architecture that enables higher performance, lower cost, and rapid deployment of services. Agility at hyperscale is no longer a distant dream. Add in analytics-based telemetry exposed through the disaggregated management controller, and you have the foundation for a new logical architecture: a rack-level system.

This new logical architecture is available today in the form of Intel® Rack Scale Architecture. It exposes a standard management framework via REST APIs to discover and retrieve the raw components in a rack, and collectively in a pod: drawers, blades, and disks, as well as pooled resources such as processors, memory, and NVMe disks. These resources can then be provisioned over a separate management network to compose a compute node or storage node.
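To make the discovery step concrete, here is a minimal Python sketch that walks a Redfish-style REST interface of the kind the framework exposes. The pod manager address, credentials, and TLS handling are placeholders for illustration; the resource paths follow standard Redfish conventions, but the exact layout can vary by implementation and firmware version.

```python
# Minimal sketch of discovering rack resources through a Redfish-style REST API.
# HOST and AUTH are hypothetical; adjust them for a real pod manager.
import requests

HOST = "https://pod-manager.example.com:8443"  # hypothetical pod manager address
AUTH = ("admin", "admin")                      # placeholder credentials

def get(path):
    """GET a Redfish resource by its @odata.id path and return the JSON body."""
    resp = requests.get(HOST + path, auth=AUTH, verify=False)
    resp.raise_for_status()
    return resp.json()

# Walk the chassis collection to see racks, drawers, and blades as enclosures.
for member in get("/redfish/v1/Chassis").get("Members", []):
    chassis = get(member["@odata.id"])
    print(chassis.get("ChassisType"), chassis.get("Id"))

# Walk the systems collection to see compute resources available for composition.
for member in get("/redfish/v1/Systems").get("Members", []):
    system = get(member["@odata.id"])
    print(system.get("Id"),
          system.get("ProcessorSummary", {}).get("Count"), "CPUs,",
          system.get("MemorySummary", {}).get("TotalSystemMemoryGiB"), "GiB memory")
```

In practice, an orchestrator would feed this inventory into its scheduling and composition logic rather than simply printing it.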

In addition, a supported telemetry model exposes capacity, capability, and bottlenecks at each component level, allowing the right hosts to be composed for orchestrating a workload. Separating the data and virtual-management plane from the hardware provisioning and management plane, and pairing telemetry with analytics, enables resources such as storage, memory, and compute to be added as needed, creating flexibility and scalability that can be fully utilized.
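As an illustrative sketch of how an orchestrator might act on that telemetry, the snippet below filters candidate hosts by reported headroom before composing a node. The metric names, data shapes, and thresholds are hypothetical stand-ins, not the actual telemetry schema.

```python
# Hypothetical example: choose a host for a workload based on component telemetry.
def pick_host(systems, min_free_mem_gib=64, max_cpu_util=0.7):
    """Return the Id of the first system whose reported headroom fits the workload."""
    for system in systems:
        telemetry = system.get("telemetry", {})  # hypothetical structure
        if (telemetry.get("FreeMemoryGiB", 0) >= min_free_mem_gib and
                telemetry.get("CpuUtilization", 1.0) <= max_cpu_util):
            return system["Id"]
    return None

# Stand-in data an orchestrator might have collected from the management plane.
candidates = [
    {"Id": "Blade-1", "telemetry": {"FreeMemoryGiB": 32, "CpuUtilization": 0.9}},
    {"Id": "Blade-2", "telemetry": {"FreeMemoryGiB": 128, "CpuUtilization": 0.4}},
]
print(pick_host(candidates))  # -> "Blade-2"
```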

Of course, the success of this new logical architecture depends on the creation of open standards for the configuration and management of the rack components, such as compute, storage, network, and rack management controllers. These standards allow IT organizations to connect various hardware components together to form software-defined systems that more effectively utilize all the hardware in a disaggregated rack.
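As a hedged illustration of what composing such a software-defined system can look like, the sketch below posts an allocation request to a pod manager. The endpoint path and payload fields here are assumptions made for the example, modeled on Redfish-style conventions; consult the actual API documentation for the definitive schema.

```python
# Illustrative sketch of composing a node from pooled resources via a pod manager.
# The endpoint and payload shape are assumptions for this example.
import requests

HOST = "https://pod-manager.example.com:8443"   # hypothetical pod manager address
AUTH = ("admin", "admin")                       # placeholder credentials

# Describe the resources the composed node should draw from the pooled hardware.
node_request = {
    "Name": "analytics-node-01",
    "Description": "Composed node for a data-analytics workload",
    "Processors": [{"TotalCores": 16}],
    "Memory": [{"CapacityMiB": 131072}],        # 128 GiB
}

resp = requests.post(
    HOST + "/redfish/v1/Nodes/Actions/Allocate",  # assumed allocation action
    json=node_request,
    auth=AUTH,
    verify=False,
)
resp.raise_for_status()

# A pod manager typically returns the URI of the newly composed node.
print("Composed node at:", resp.headers.get("Location"))
```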

Intel pioneered the rack scale concept, working closely with key partners, OEMs, and standards bodies such as the DMTF, home of the Redfish standard. The players in these collective efforts recognized the importance of working with standards bodies to enable interoperable hardware, firmware, and software.

Intel Rack Scale Architecture is the result of an open effort that allows Intel partners and solution providers to innovate and create diverse solutions, giving customers many different choices. At the same time, the open approach establishes a platform for innovation at various levels of the data center market, allowing the architecture to evolve over time with hardware innovation and changing customer use cases.

At a personal level, the evolution of Intel Rack Scale Architecture is particularly gratifying, given that I have been part of the team that has worked on this effort from its earliest days. We set out with a focus on reducing TCO and meeting other business-driven objectives, and we are now well on our way to achieving that vision, thanks in large part to a tremendous amount of industry support. Already, many major OEMs are releasing products based on Intel Rack Scale Architecture and delivering innovative new designs to their customers and end users.

Looking ahead, here is some of what we see on the horizon:

  • The ongoing disaggregation of compute, I/O, memory, and storage, which will give data center operators the ability to upgrade components independently of each other, everywhere in the data center
  • The evolution to disaggregated NVMe-as-storage solutions, pooled FPGAs, and disaggregated networks delivered as solutions architected at rack scale
  • The development of agile orchestration of the hardware layer in open source solutions like OpenStack
  • The use of high-speed interconnections between components, with less copper and more prevalent optical and wireless technologies, along with more security and telemetry at every level to drive more efficient use of resources

If you happen to be at the OpenStack Summit in Austin this week, you can catch multiple presentations on Intel Rack Scale Architecture along with demos at the Intel booth.

And if you’re ready for a technical deep dive at this point, you can explore the details of Intel Rack Scale firmware and software components on GitHub: