This year has been one for the history books. Given the trials and tribulations we’ve all endured, it’s increasingly important for companies like Intel and Nutanix to keep working together to move innovation forward. To this end, over the last year Intel strengthened its 5+ year partnership with Nutanix to launch a joint innovation lab. The introduction of BlockStore and support for Intel® Optane™ SSDs in Nutanix AOS 5.18 software are among the first projects to graduate from this lab. These storage enhancements deliver up to 2X the performance of previous AOS versions with SATA-based SSDs.¹ Read on to learn the details.
Moving the Block Management Layer Into User Space
The first puzzle piece to unlocking new levels of performance is something our friends at Nutanix engineered. Let me give you a bit of background.
Each node in a Nutanix cluster hosts a Controller Virtual Machine (CVM), which contains Stargate, the I/O data manager of AOS. Stargate runs in user space, but legacy versions of AOS use the common EXT4 filesystem, which runs in kernel space.
Nutanix saw an opportunity to reduce latency in the data path by creating a filesystem that also runs in user space and integrating it into the CVM. Moving the filesystem and block management layer into user space removes costly context switches between user space and kernel space, thereby reducing latency.
AOS 5.18 makes this new filesystem, BlockStore, generally available, and we know you’ll appreciate the reduced latency.
Purpose-Built Storage Driver for NVMe
Based on previous testing of AOS, the results from BlockStore were already impressive! But what if we could do more?
Now that the filesystem and block management layers had been moved into user space, it was time for a bit of Intel-developed, open-source magic. The Storage Performance Development Kit (SPDK) is Intel open source code made freely available to unlock the performance potential of today’s NVMe devices. There’s a lot included in this kit and I’d urge you to visit spdk.io to learn more, but I will cover a couple of features that Nutanix implemented.
Traditionally, the CVM utilized the SCSI subsystem to communicate with storage devices, which requires a context switch to kernel space. In order to completely remove these switches from the storage data path, SPDK eliminated the need for the SCSI subsystem by utilizing Direct Memory Access (DMA) from user space.
Nutanix uses other SPDK tools as well, and chief among them is the move from interrupt requests (IRQs) to polling. As devices have become far faster (e.g. NVMe, Intel® Optane™ SSDs, Intel® Optane™ PMem), interrupt handling has become a bottleneck. SPDK polls devices for completions instead of waiting for interrupts, which eliminates this bottleneck by avoiding the significant overhead of forced context switches. In the days leading up to .NEXT 2020, we continue to discover new ways to optimize and squeeze out even more performance.
Intel and Nutanix engineers worked late nights and some weekends implementing SPDK to further reduce context switches within the CVM and introduce new methods for reducing latency in the I/O path. There were bumps along the way and issues to debug, but the end result is worth it. With the combination of BlockStore + SPDK, all Stargate device interaction has moved into user space, eliminating any context switching or kernel driver invocation.
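To make the polling idea concrete, here is a minimal Python sketch. It is a toy simulation, not SPDK code: `SimulatedDevice`, `process_completions`, and `run_polling` are hypothetical names invented for illustration, and time is counted in abstract ticks rather than microseconds. The point it tries to show is that the application asks the device "anything done yet?" in a tight loop and never blocks waiting to be interrupted.

```python
from collections import deque


class SimulatedDevice:
    """Toy stand-in for a storage queue (hypothetical, not an SPDK API)."""

    def __init__(self, latency_ticks):
        self.latency_ticks = latency_ticks
        self.now = 0
        self.in_flight = deque()  # (finish_tick, request_id) in FIFO order

    def submit(self, request_id):
        # Queue a request that will finish latency_ticks from now.
        self.in_flight.append((self.now + self.latency_ticks, request_id))

    def tick(self):
        # Advance simulated time by one unit.
        self.now += 1

    def process_completions(self):
        """Non-blocking check: return whatever has finished (possibly nothing)."""
        done = []
        while self.in_flight and self.in_flight[0][0] <= self.now:
            done.append(self.in_flight.popleft()[1])
        return done


def run_polling(device, request_ids):
    """Submit all requests, then poll until every one has completed."""
    for request_id in request_ids:
        device.submit(request_id)
    completed, polls = [], 0
    while len(completed) < len(request_ids):
        completed.extend(device.process_completions())  # never sleeps
        polls += 1
        device.tick()
    return completed, polls


if __name__ == "__main__":
    completed, polls = run_polling(SimulatedDevice(latency_ticks=3), [1, 2, 3])
    print(completed, polls)
```

The trade-off in this design is that the polling loop keeps a CPU core busy even when nothing has completed. In exchange, a completion is picked up on the very next poll, with no interrupt handler or context switch in between.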
As of AOS 5.18, SPDK support is considered a technical preview; it will be generally available in an upcoming release of AOS.
Taking Advantage of Unparalleled Performance
The final ingredients in this potent performance cocktail are Intel® Optane™ SSDs and Intel® 3D NAND products. The need for both BlockStore and SPDK stems from the launch of all-NVMe Nutanix platforms, now equipped with Intel Optane SSDs. Intel Optane SSDs use a unique cross-point structure constructed from perpendicular conductors that connect 128 billion densely packed memory cells. Each memory cell stores a single bit of data that can be changed by controlling the voltage sent to that cell’s selector. The densely packed memory layers of Intel® Optane™ media are also scalable, since they can be stacked in a 3-dimensional manner. This compact, transistor-less structure, combined with advances in materials science, creates the high performance that Intel® Optane™ technology is known for.
A previously high-performing Nutanix cluster used SATA SSDs with random read latencies of 103 µs and random write latencies of 54 µs, whereas Intel Optane SSDs have consistent random read and write latencies of 10-12 µs.² To take full advantage of these low-latency, high-performing NVMe devices, Nutanix had to trim down the total latency inherent in its I/O data path.
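A rough back-of-the-envelope sketch shows why the software path matters more as devices get faster. The device latencies are the read figures quoted above. The 20 µs software overhead is a made-up number chosen purely for illustration, not a measured value for AOS, and `software_share` is a hypothetical helper.

```python
# Device read latencies quoted in this post, in microseconds.
SATA_READ_US = 103
OPTANE_US = 12  # upper end of the 10-12 µs range

# ASSUMPTION: hypothetical fixed software cost per I/O, for illustration only.
SOFTWARE_US = 20


def software_share(device_us, software_us):
    """Fraction of total I/O latency spent in software rather than on the device."""
    return software_us / (device_us + software_us)


if __name__ == "__main__":
    devices = [("SATA SSD", SATA_READ_US), ("Intel Optane SSD", OPTANE_US)]
    for name, device_us in devices:
        share = software_share(device_us, SOFTWARE_US)
        print(f"{name}: software is {share:.0%} of total latency")
```

Under that assumed overhead, software accounts for roughly a sixth of total read latency on the SATA SSD but well over half on the Intel Optane SSD. The same software cost that was easy to overlook before becomes the dominant term.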
Equipped with the latest and greatest hardware and software, I’m proud to show off how working together can really take performance to the next level.
For details on the performance improvement that Intel Optane SSDs and Intel 3D NAND SSDs combined with BlockStore and SPDK can offer you, please check out our session named “It’s Here! Intel® Optane™ SSD with Nutanix” at https://www.nutanix.com/next.
Intel tested, Aug 20, 2020. Workload: 70/30 Random Read/Write Workload on a synthetic benchmark of Xray. 4 VMs, 1 VDisk, QD 4. See configuration below: