Dual Port Reloaded

Dual port Intel® Optane™ DC SSDs boost performance and availability for the Dell EMC PowerMax platform.

One of the favorite parts of my job is performing thought experiments. They are fun, they stretch my imagination, and, if I’m lucky, I uncover insights that lead to innovative Intel products to delight customers, like Dell EMC.

I had such a thought experiment recently on the seemingly mundane topic of dual port storage. What would I do if I needed the performance properties of a dual port Intel® Optane™ DC SSD, but didn’t have one? I could expand this question to consider multiple SSDs, each with n ports, but that hurts my head, so I’m sticking to one SSD with two ports.

First, I had to answer why customers would want a dual port SSD. Dual port SSDs allow two hosts to connect to one SSD (Figure 1), with data shared by design. The goal of dual port design is that if one host hits a glitch, the other host has the ability to access data and maintain availability. And when a path (cables, switches, clocks) to the SSD hits a glitch, the SSD can recover fast.

Once we understand the essence of the value of dual ports, it is easy to answer my original question. I would use two hosts, each with lots of DRAM, and I would continuously copy all the data from the first host to the second one. I can use a third host, but that would be less optimal.
The latency of this two-host DRAM solution would be in single-digit microseconds (DRAM > host controller > host controller > DRAM).

The dual port Intel Optane DC SSD solution would have slightly higher latency, but it would still be measured in only tens of microseconds for our 1st generation product, and we’re aiming for single-digit microseconds in the next generation (DRAM > host controller > device controller > Intel Optane DC SSD).

Because there are I/O controllers (which set the lower bound of latency) in the path, Intel Optane technology and DRAM operate on a level playing field.

The difference between the DRAM and Intel Optane DC SSD scenarios lies in the system usage for each data path. The DRAM solution requires 2x the bandwidth across all of its system components, compared to the dual port Intel Optane DC SSD solution. When a host hits a glitch, the dual port Intel Optane DC SSD takes no more than a few seconds to recover from a cold reset, whereas a host system cold boot can take tens of seconds.

Figure 1. Thought experiments on dual port storage with DRAM versus Intel® Optane™ DC SSDs.
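To make the 2x bandwidth point concrete, here is a toy accounting in Python. It is only a sketch, under my own simplifying assumption that every byte written in the two-host DRAM setup is moved twice (once into the first host's DRAM and once more to mirror it into the second host's DRAM), while the dual port SSD receives each byte once and exposes it to both hosts. The function names and the 1 GB/s write rate are hypothetical.

```python
def mirrored_dram_traffic(write_gb_per_s: float) -> float:
    """Data moved per second when each write is also copied to a second host."""
    copies = 2  # assumption: the primary write plus one mirror copy
    return write_gb_per_s * copies


def dual_port_ssd_traffic(write_gb_per_s: float) -> float:
    """Data moved per second when each write lands once on a shared SSD."""
    copies = 1  # assumption: both hosts reach the same copy through their own port
    return write_gb_per_s * copies


write_rate = 1.0  # hypothetical application write rate, in GB/s
ratio = mirrored_dram_traffic(write_rate) / dual_port_ssd_traffic(write_rate)
print(f"Mirrored DRAM moves {ratio:.0f}x the data of the dual port SSD")
```

The model ignores reads, protocol overhead, and caching. It only illustrates why continuously mirroring data between hosts doubles the traffic on the path.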

And that doesn’t even take into consideration the cost, density, and persistence TCO advantages for Intel Optane DC SSDs over DRAM.

The mundane is mundane no more.

Reaching New Performance Levels with Intel Optane Technology

Our long-time partner, Dell EMC, must have seen this years ago. As a pioneer of high-performance, high-availability systems, Dell EMC once again leads the industry with PowerMax, a high-end, scale-out storage platform shipping Intel Optane Storage Class Memory (SCM) drives [4].

But to unleash the full benefits of new technology like this takes vision, expertise, and moxie. I can’t just put the latest 300 HP JCW* engine into my 2005 Mini Cooper*, take it to the race track, and expect immediate results. Similarly, we shouldn’t expect performance prowess at a systems level to come easily.

Dell EMC has built a scale-out system on an end-to-end NVMe* architecture that consists of NVMe* flash and NVMe* storage class memory (SCM). It has optimized the hardware and software to deliver performance of up to 10 million IOPS, 150 GB/s of bandwidth, and consistent response times of 290 microseconds [4].
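One way to get a feel for these numbers is Little's Law, which says that the average number of requests in flight equals throughput multiplied by response time. The Python sketch below applies it to the figures quoted above. It assumes, purely for illustration, that the peak IOPS and the 290 microsecond response time hold at the same time in steady state, which the data sheet figures do not guarantee. The function name is hypothetical.

```python
def outstanding_ios(iops: float, response_time_s: float) -> float:
    """Little's Law: average requests in flight = throughput x response time."""
    return iops * response_time_s


peak_iops = 10_000_000    # "up to 10 million IOPS" from the text
response_time_s = 290e-6  # 290 microseconds from the text

in_flight = outstanding_ios(peak_iops, response_time_s)
print(f"About {in_flight:,.0f} I/Os in flight across the array")
```

Under those assumptions, the array would be juggling roughly 2,900 concurrent I/Os, which gives a sense of how much parallelism the system has to sustain to reach those rates.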

Such low latency at high IOPS is the unique performance benefit that the Intel® Optane™ SSD DC D4800X brings. Additionally, it delivers consistently reliable quality of service (QoS) across loads and across drive capacities, as the graph in Figure 2 demonstrates. Noteworthy is the average latency (< 140 µs), up to full load, for a 25/75 percent read/write workload [5].

Figure 2. The Intel® Optane™ SSD DC D4800X demonstrates predictable, low latency, even under load

I am particularly fascinated with how the latest PowerMax made use of a real-time machine learning engine to achieve these levels of performance. Their engine is designed to automatically place data on the correct media type (flash or SCM), forecasting where it is needed for the most performance-intensive applications and the most active data sets. It analyzes 40 million data sets per array and delivers over 6 billion decisions per day [4].
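Those daily totals are easier to appreciate as rates. The short Python calculation below converts them, treating "over 6 billion" as a lower bound and assuming the decisions are spread evenly over a day and over the data sets (my simplification, not a claim about how the engine actually schedules its work).

```python
SECONDS_PER_DAY = 24 * 60 * 60

decisions_per_day = 6_000_000_000  # "over 6 billion" in the text, so a lower bound
data_sets_per_array = 40_000_000   # from the text

# Average rates, assuming an even spread over time and over data sets.
decisions_per_second = decisions_per_day / SECONDS_PER_DAY
decisions_per_data_set = decisions_per_day / data_sets_per_array

print(f"At least {decisions_per_second:,.0f} decisions per second")
print(f"At least {decisions_per_data_set:,.0f} decisions per data set per day")
```

That works out to roughly 69,000 placement decisions every second, or about 150 per data set per day.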

And not just unprecedentedly fast, but still with uncompromising resiliency. This is where the Intel Optane SSD DC D4800X plays a unique part.

The goals of high availability can be simply stated:

1. When things go right, perform optimally & predictably
2. No single points of failure
3. Continuous system-monitoring to predict and prevent issues
4. Ensure repairs can be made without taking system offline
5. Secure data, under any condition

According to Dell, PowerMax arrays are architected for six-nines (99.9999 percent) availability. To reach that level, Dell EMC introduced redundancy at all levels, including storage back end, cache memory, fabric, power supplies, and SSDs [2]. There must not be a single point of failure in their system. For Dell EMC and its customers, the dual-ported nature of the technology is therefore critical.
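To see how little slack six nines leaves, the Python sketch below converts an availability target into the downtime it allows per year. It assumes a 365-day year and that availability is simply the fraction of time the system is up. The function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def downtime_seconds_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines."""
    unavailability = 10.0 ** -nines  # six nines -> 0.000001
    return unavailability * SECONDS_PER_YEAR


for nines in (3, 4, 5, 6):
    seconds = downtime_seconds_per_year(nines)
    print(f"{nines} nines: {seconds:,.1f} seconds of downtime per year")
```

At six nines, the budget is about 31.5 seconds per year. A single host cold boot of tens of seconds would use up most of it, which is why a second path to the data matters.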

Delivering on these goals is non-trivial. We have worked alongside Dell EMC engineers for as long as I can remember. I can say firsthand that we work collaboratively to improve system designs. As Intel engineers, we have architected and validated dual port SSDs to meet the exacting needs of our enterprise customers. Here are a few noteworthy features:

Dell EMC customers' new use cases demand unrelenting response times in the hundreds of microseconds. Reference [3] notes that SQL Server databases benefit from both server-side cache and flash storage. However, not all accesses can be served by the database cache. As such, a low latency, high capacity persistent tier of storage, if managed well, is the key to bridging the gap between flash and DRAM [3].

By combining an innovative system design and sophisticated algorithms, Dell EMC was able to improve PowerMax latencies by up to 2x over legacy systems [4]. Achieving a 2x improvement on its first-generation release, perceivable at the systems level, while still bearing the full burden of six-nines availability, is truly exemplary.

I see dual port as an enabling advantage instead of a legacy burden. Dual port gives Intel Optane DC SSDs a unique multipath advantage that no DDR-attached or NVDIMM drive offers. For the moment, we may use dual port NVMe SSDs as drop-in replacements for SAS SSDs. This is just the beginning. The dual port Intel Optane DC SSD is not just a faster SAS drive. It’s not a me-too play. It’s a performance play based on Intel® Optane™ Memory Media, which has inherently 100x lower latency than flash [6]. We have only just begun to expose the value of this new technology. Dell EMC has blazed the trail and offered the world a glimpse of what is possible. We continue to work side-by-side with Dell EMC, and I cannot wait for the next breakthrough.

_______________________

Footnotes and Sources

1.  Source: Intel.com – Intel® Optane™ SSD DC D4800X product brief, product page
2.  Dell EMC. “Dell EMC PowerMax Reliability, Availability, and Serviceability Technical White Paper.” October 2018.
3.  Dell EMC. “Dell EMC PowerMax Storage for Mission-Critical SQL Server Databases.” June 2018.
4.  Dell EMC. “Dell EMC PowerMax Family Data Sheet: Redefining Modern Storage.” 2018.
5.  Source: Intel. System configuration – Intel® Optane™ SSD DC D4800X, CPU: Intel® Xeon® Gold 6154 CPU @ 3.00GHz, OS: Centos 7.5, Kernel: 4.14.74, BIOS: SE5C620.86B.00.01.0014.070920180847, FIO: 3.5.
Performance results are based on testing as of April 26, 2019 set forth in the configurations and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.
6.  Source: Intel Fact sheet: “New Intel Architectures and Technologies Target Expanded Market Opportunities” December 12, 2018.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.

Annie Foong

About Annie Foong

Annie Foong, Senior Principal Engineer at Intel and Chief Architect for Intel Optane DC SSDs, graduated with a Ph.D. in Computer Engineering from the University of Wisconsin and has been with Intel since. She joined NSG when she was challenged to optimize systems for extreme low latency storage. What she thought was a 2-year task turned out to be her dream job for the last decade and counting: working with the best in our industry on something meaningful.