At the Core of Great Innovation — Intel Xeon Processors & EMC VMAX3 Enterprise Data Service Platform

Stu Goldstein is a Market Development Manager in the Communications and Storage Infrastructure Group at Intel

What a great way to highlight what Intel® Xeon® Processors are capable of:  the launch of the VMAX3 Enterprise Data Service Platform.

Scaling up and out isn’t easy, but the VMAX3 makes it look that way! Its Dynamic Virtual Matrix architecture supports up to 384 Intel Xeon processor cores, efficiently managing hundreds of ports and terabytes of pooled cache. VMAX3 works seamlessly with the Intel Xeon processor’s multi-core architecture to run data and application services that ordinarily would be external to the CPU complex. By architecting a solution that takes full advantage of core count, flash, and cache technology, EMC has created a product that can consolidate the largest workloads while keeping capacity costs, transaction costs and cost per virtual machine considerably lower than in the past.

Well, what is so special about Intel Xeon Processors that makes this all possible?  What has EMC taken advantage of here to create a product that is able to dynamically apportion host access, data services and storage resources to meet application service levels that are critical to next generation Hybrid Cloud deployments?

It isn’t all about the Intel® Xeon® Processor core architecture, which at times gets all the attention; the VMAX3 also makes good use of the “uncore” functions that support it. These are functions of the CPU that sit outside the cores but are essential for core performance. Several architectural enhancements were made to the uncore to achieve significant performance gains over prior generations.

Improvements to memory bandwidth are considerable. A three-ring interconnect, building on prior QPI architectures, enables low latency and high throughput by establishing more efficient pathways between the cores and the rest of the CPU complex. It also accommodates a second “deep buffering” Home Agent/Memory Controller for increased memory bandwidth efficiency, allowing up to 512 memory transactions in flight, which is particularly important for meeting the throughput requirements of large-scale systems like the VMAX3. The result is a 60% memory bandwidth improvement over the prior generation, which is a big deal, especially considering that I/O bandwidth doubles! The CPU is now plumbed to handle faster I/O via an upgrade to PCIe Gen3, and more I/O through a doubling of the number of PCIe lanes to 80 per two-socket configuration. VMAX3 takes advantage of this increased connectivity to high-speed front-end servers and back-end storage I/O devices.
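To put the PCIe numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the commonly cited PCI Express figures, which are not taken from this post: Gen2 signals at 5 GT/s per lane with 8b/10b encoding, and Gen3 at 8 GT/s per lane with 128b/130b encoding. The class and function names are made up for illustration, and the results are theoretical per-direction peaks that ignore packet and protocol overhead, so real throughput will be lower.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcieGen:
    """Hypothetical helper describing one PCIe generation (illustrative only)."""
    name: str
    gigatransfers_per_s: float  # raw signalling rate per lane (assumed spec value)
    payload_bits: int           # data bits per encoded symbol group
    encoded_bits: int           # bits on the wire per symbol group

    def lane_gbytes_per_s(self) -> float:
        # Usable GB/s per lane, per direction, after line-encoding overhead.
        efficiency = self.payload_bits / self.encoded_bits
        return self.gigatransfers_per_s * efficiency / 8


# Assumed figures from the PCI Express specifications, not from this post.
GEN2 = PcieGen("Gen2", 5.0, 8, 10)
GEN3 = PcieGen("Gen3", 8.0, 128, 130)

# Lane count quoted in the post for a two-socket configuration.
LANES_TWO_SOCKET = 80


def aggregate_gbytes_per_s(gen: PcieGen, lanes: int) -> float:
    # Theoretical peak across all lanes, one direction.
    return gen.lane_gbytes_per_s() * lanes


if __name__ == "__main__":
    for gen in (GEN2, GEN3):
        print(f"{gen.name}: {gen.lane_gbytes_per_s():.3f} GB/s per lane, "
              f"{aggregate_gbytes_per_s(gen, LANES_TWO_SOCKET):.1f} GB/s over "
              f"{LANES_TWO_SOCKET} lanes")
```

Under these assumptions, a Gen3 lane carries a little under twice what a Gen2 lane does, so the per-lane upgrade and the larger lane count each add to the I/O headroom described above.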

Changes were also made to the coherence protocol, putting in place an in-memory directory that is used across all multi-socket systems. This approach introduces a speculative snooping mechanism (Opportunistic Snoop Broadcast) to reduce directory overhead and improve latency in dual-processor and multi-processor (DP/MP) systems. In addition, an I/O Directory Cache has been implemented to offset the directory overhead for remote I/O accesses.

No discussion of the Intel® Xeon® Processor would be complete without mentioning that power has also been optimized. The improved performance fits within the same power envelope as the prior generation, and idle power has been significantly reduced! This means that compute has more than doubled within the same power envelope.

I’m excited to see what new usage models the VMAX3 enables!  Let me know how you’ve been able to take advantage of the new capabilities by posting your reply.