A blog on I/O I owed

I wanted to follow up on the pre-IDF blog I wrote and on what Sean and I conveyed at IDF regarding comprehensive I/O optimization for the enterprise cloud (built on a virtualization infrastructure). This is the blog I owed to those who could not attend the IDF session.

In the last blog we identified four important vectors that drive I/O evolution:

1) A balanced system that keeps up with increases in CPU performance

2) Scalability

3) Unified fabric

4) Security

I call it an evolution because I feel that is the natural state things will head toward in the (near) future.

In a cloud environment you would expect automation and policies to determine how much consolidation is possible on a system. If SSDs gain broader adoption and virtualization performance keeps increasing thanks to hardware assists, I/O and the fabric could become the bottleneck for consolidation and efficiency, because they cannot keep up with the increased data rates coming from storage and the CPU.

There are two ways to address this: reduce the overhead in the I/O stack caused by software emulation of devices in the VMM, or eliminate it. VMDq is an example of a technology that reduces the overhead by offloading some of the VMM's tasks through hardware assists in the NIC. Direct assignment with PCI-SIG SR-IOV support is a way to eliminate the overhead by bypassing the VMM altogether. With SR-IOV, a single device can be divided into many logical devices known as Virtual Functions (think of each as a pair of independent transmit/receive queues). Each Virtual Function can be directly assigned to a VM using Intel VT-d, thereby bypassing the VMM, and this can work with live VM migration too.

At IDF we showcased four demos of prototype SR-IOV software solutions running on Intel Xeon 5500 based hardware with prominent VMM vendors, VMware, Citrix and Red Hat, each with a different hypervisor technology. The networking demos showed live migration working with SR-IOV and VT-d based direct assignment: directly assigned VMs could even be moved to an emulated mode and then brought back to direct assignment. Intel has been working not only with software providers but also with other hardware vendors like LSI to demonstrate this capability. These technologies are as important to storage as they are to networking, particularly as SSDs gather steam. You can learn more about the demos from the blogs below.

Demo with Intel Xeon 5500 based Dell servers

An analyst view on the LSI solution demonstrated
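
To make the Virtual Function idea a bit more concrete, here is a minimal sketch, not taken from the demos above, of how VFs can be created on a Linux host whose NIC driver exposes the standard sriov_totalvfs/sriov_numvfs sysfs interface. The interface name and VF count are placeholders I chose for illustration.

```python
# Minimal sketch: enable SR-IOV Virtual Functions on a Linux host via sysfs.
# Assumes root privileges and a NIC whose driver exposes the standard
# sriov_totalvfs / sriov_numvfs files; "eth0" is a placeholder name.
import os

IFACE = "eth0"  # hypothetical SR-IOV capable interface
DEVICE = f"/sys/class/net/{IFACE}/device"

def enable_vfs(num_vfs):
    """Create num_vfs Virtual Functions and return their PCI addresses."""
    with open(os.path.join(DEVICE, "sriov_totalvfs")) as f:
        total = int(f.read())
    if num_vfs > total:
        raise ValueError(f"device supports at most {total} VFs")

    # Writing N to sriov_numvfs asks the physical function's driver to
    # spawn N Virtual Functions (write 0 first if VFs are already enabled).
    with open(os.path.join(DEVICE, "sriov_numvfs"), "w") as f:
        f.write(str(num_vfs))

    # Each VF appears as a virtfn<N> symlink pointing at its PCI function.
    vfs = []
    for entry in sorted(os.listdir(DEVICE)):
        if entry.startswith("virtfn"):
            target = os.readlink(os.path.join(DEVICE, entry))
            vfs.append(os.path.basename(target))
    return vfs

if __name__ == "__main__":
    print(enable_vfs(4))  # e.g. ['0000:04:10.0', '0000:04:10.2', ...]
```

Each of the PCI functions returned above can then be handed to a VM through the hypervisor's device passthrough mechanism, with VT-d providing the address translation and isolation underneath.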

If those bundles of 1GbE cables (that make your fabric look like pasta) are replaced by 10GbE, and SR-IOV and VT-d are used for performance, then both the I/O performance and scalability requirements of a flexible datacenter are answered.

Beyond that, VT-d provides greater protection by allowing I/O devices to access only the memory regions allocated to them, and SR-IOV allows VMs to access only their portion of the device, restricting access to the Virtual Functions owned by other VMs and to the device as a whole. Better security through better isolation.
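
To spell out the isolation idea, here is a toy model, my own illustration rather than real VT-d code, of what DMA remapping conceptually does: each device gets a table of the memory regions it is allowed to touch, and anything outside those regions is blocked.

```python
# Toy model of VT-d style DMA remapping (an illustration, not real code):
# the "IOMMU" keeps a per-device table of memory regions the device may
# DMA into and blocks anything outside them. Addresses are made up.
class ToyIOMMU:
    def __init__(self):
        self.allowed = {}  # device id -> list of (start, length) regions

    def map_region(self, device, start, length):
        """Grant a device DMA access to one region of guest memory."""
        self.allowed.setdefault(device, []).append((start, length))

    def dma_write(self, device, addr, data):
        """Permit the write only if it lies entirely inside a mapped region."""
        for start, length in self.allowed.get(device, []):
            if start <= addr and addr + len(data) <= start + length:
                print(f"{device}: DMA to {addr:#x} permitted")
                return
        raise PermissionError(f"{device}: DMA to {addr:#x} blocked")

iommu = ToyIOMMU()
iommu.map_region("vf0", start=0x1000, length=0x1000)  # this VM's buffer
iommu.dma_write("vf0", 0x1800, b"ok")                 # inside its mapping
try:
    iommu.dma_write("vf0", 0x9000, b"oops")           # someone else's memory
except PermissionError as err:
    print(err)                                        # rejected by the "IOMMU"
```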

Last but not least of the requirements is the unified fabric. When IT can use a single I/O device for either storage or LAN traffic, the rigidity associated with provisioning servers with some number of HBAs and some number of NICs is reduced: I/O capacity becomes fungible and flexible. FCoE and iSCSI are key technologies in this direction. Adding the capability to monitor QoS and shape traffic makes it a good match for the flexible datacenter.
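
As one small illustration of the unified fabric point, here is a sketch of attaching iSCSI storage from a Linux initiator, assuming the open-iscsi tools are installed; the portal address and target IQN below are invented placeholders, not values from this post.

```python
# Sketch of bringing up iSCSI storage over the same Ethernet as LAN traffic.
# Assumes a Linux initiator with open-iscsi installed; the portal address
# and target IQN are placeholders.
import subprocess

PORTAL = "192.168.10.20"                 # hypothetical storage portal on the 10GbE fabric
TARGET = "iqn.2009-09.com.example:vol0"  # hypothetical target IQN

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Ask the portal which targets it exports.
run(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL])

# 2. Log in to one target; its LUNs then appear as ordinary block devices
#    (/dev/sdX) carried over the same NIC as the LAN traffic.
run(["iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--login"])
```

Once logged in, the storage rides the same 10GbE interface as everything else, which is exactly what makes the I/O capacity fungible.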

Many of the technologies I discussed above (VT-d, SR-IOV, FCoE, iSCSI) are here today…  software such as Red Hat Enterprise Linux is already delivering the solution. In my view it is just a matter of time before the ecosystem builds out further and the hardware is well tuned.

With all of this in perspective, how do you see your datacenter shaping up?