Part 1 – Motivation for Intel® Optane™ Technology + 3D NAND: Emerging new memory technologies from Intel

I have been dabbling with various open source storage software and distributed high-availability models to understand the applicability of new memory technologies. I am excited to share my insights and learnings over a series of blogs. But first, why should you care about Intel memory and storage?

Moore’s Law for CPUs opens up opportunities for new storage technologies

General purpose computing has evolved at a breakneck pace over five decades, following Moore’s law. As CPUs evolved, memory did not quite keep up. As a result, a hierarchical memory architecture with inclusive caches emerged to satisfy the insatiable demand for a better computing experience, which needs bigger and faster memory.

With the dawn of the new millennium, increasing operating frequencies further would have made CPU package temperatures hotter than the surface of the sun! So the number of CPU cores per package started increasing as a natural response. Interestingly, only the memory levels closest to the CPU remained exclusive per core, while the rest of the layers became inclusive and shared across cores. This shared memory hierarchy created the need for high-performance, random-access, non-volatile storage.

Figure 1: Intel memory and storage hierarchy

I was part of the team of inventors that brought Intel’s first consumer storage segment SSD, the Intel® X25-M, to market. Since then, the number of cores per CPU has kept increasing exponentially, and every layer in the memory hierarchy faces massive pressure to increase performance and capacity. The answer to this problem is to innovate new memory technologies and add more layers to the hierarchy. Intel has done this with the super-fast, non-volatile Intel® Optane™ media, as well as the world’s first data center QLC PCIe SSD (Intel® SSD D5-P4320)1.

Explosive data growth – A story of growth for all memories

The world’s dataset sizes, and therefore the need for storage, continue to explode exponentially. In a hierarchical memory, the area under the pyramid needs to grow to support this explosive growth (i.e. the entire pyramid grows and the area of every layer in the pyramid also grows in absolute capacity).

Figure 2: Explosive growth in world's dataset

If the world’s data size doubles roughly every two to three years, data is being generated at an accelerating rate, yet the performance/terabyte of the storage layer is getting slower. Let me use NAND as an example to illustrate the challenge.

Slowing performance/capacity, or faster memory at the same affordability?

Figure 3: Write Bandwidth Per Die
Figure 4: Write Bandwidth Per Terabyte

Applying Moore’s Law to NAND, the density stored per die has to double every ‘x’ months (~24 months in Figure 3 above). Making a NAND die 2x denser reduces the $/GB, but the performance/die typically remains constant. Using write bandwidth/TB for illustrative purposes, you can see in Figure 4 that the performance/TB slopes downward on a logarithmic scale. The slope would have been even steeper than shown, but NAND has other innovations, like multiple-page programming called multi-plane (similar in concept to a multi-core CPU), that increase the performance per die.
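To make the arithmetic concrete, here is a minimal sketch in Python, using illustrative numbers I picked rather than Intel data: if per-die write bandwidth stays flat while density per die doubles every generation, then write bandwidth/TB halves every generation.

# Illustrative only: the starting density and per-die bandwidth below are
# assumptions for this sketch, not measurements of any Intel product.
die_bandwidth_mbps = 10.0   # assumed write bandwidth per die (MB/s), ~constant
die_density_gb = 32.0       # assumed starting density per die (GB)

for generation in range(5):
    dies_per_tb = 1024.0 / die_density_gb          # dies needed to build 1 TB
    bandwidth_per_tb = die_bandwidth_mbps * dies_per_tb
    print(f"gen {generation}: {die_density_gb:6.0f} GB/die, "
          f"{bandwidth_per_tb:6.1f} MB/s per TB")
    die_density_gb *= 2                            # density doubles per generation

Fewer dies are needed per terabyte each generation, so the aggregate bandwidth behind each terabyte drops by half – exactly the downward slope in Figure 4.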

This increased number of planes comes at added cost and with other constraints, such as the granularity of writes, which are outside the scope of this blog. The larger trend of decreasing performance/capacity is an inevitable one. The dominant type of NAND deployed today stores 3 bits/cell. Intel has introduced the world’s first 4 bits/cell PCIe NVMe SSD1 to address the opportunity between the 3 bits/cell tier and what hard disks deliver.

Since adding more layers and more bits per cell delivers more capacity per die at lower performance/capacity, what needs to be done in the architecture to improve overall performance?

Combining fast and slow storage to create tiers

I want to explore architectures that can be flexible – you don’t need every layer in the hierarchy for every deployment. All these memory technologies are here to stay (and thrive). This is a story of growth and value creation, not one of conquest where one technology eliminates the need for another.

With the growth of faster connectivity over the last decade, there are now at least two distinct types of data: (1) frequently modified, update-in-place data, and (2) machine-generated, forever-lifetime data.

Traditional hierarchical data

Imagine an ‘instant power on’ of an operating system. Intel® Optane™ technology enables such possibilities. For a PC to power on instantly, the operating system’s boot data needs to be in the fastest memory. Let’s use Adobe Photoshop as an example. To manipulate pictures, very fast read bandwidth is necessary from the storage that holds the data. A Photoshop data file is often megabytes in size, and Intel’s 4-bits/cell NAND SSD can deliver the necessary read bandwidth cost-effectively. Legacy/traditional software does not understand data tiers, so it is left to software in the operating system to apply heuristics and guess the placement of data, as the sketch below illustrates. In future blogs, we can explore techniques to identify/classify data for placement in appropriate memories.
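As a taste of what such heuristics look like, here is a minimal sketch – with block numbers, a trace, and a fast-tier capacity that are entirely made up for illustration – of a frequency-based placement policy: count accesses per block and pin the hottest blocks to the fast (Optane-like) tier. Real tiering software also weighs recency, write patterns, and much more.

# Hypothetical toy example: the blocks, trace, and tier size are invented here.
from collections import Counter

FAST_TIER_BLOCKS = 4    # assumed capacity of the fast tier, in blocks

def plan_placement(access_trace, fast_capacity=FAST_TIER_BLOCKS):
    """Split blocks into fast/slow tiers by access frequency."""
    heat = Counter(access_trace)                          # accesses per block
    ranked = [block for block, _ in heat.most_common()]   # hottest first
    return set(ranked[:fast_capacity]), set(ranked[fast_capacity:])

trace = [7, 7, 3, 7, 9, 3, 3, 7, 1, 2, 3, 9, 7, 3]   # synthetic block accesses
fast, slow = plan_placement(trace)
print("fast tier:", sorted(fast))   # the frequently touched blocks
print("slow tier:", sorted(slow))   # everything else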

Machine data/Cloud native applications: Storage performance tier-aware applications

With the advent of cloud storage and cloud computing, and improvements in connectivity speeds, it is now possible for applications to target a storage tier with a specific service-level capability. For example, archival and backup of pictures captured on a mobile phone vs. storage for a business’s online transactions require entirely different performance and guarantees. Although it is easy to imagine mixing fast and slow media in an arbitrary ratio to create intermediate-performance storage, it is incredibly hard to build an intermediate performance tier that delivers consistent performance.
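A back-of-the-envelope sketch shows why. With latency numbers assumed purely for illustration (roughly Optane-class vs. QLC-class), the average latency of a blended tier improves smoothly as more requests hit the fast media, but the tail latency stays pinned at the slow medium until the hit rate is extremely high:

# Illustrative latencies only; not measurements of any specific device.
FAST_US = 10.0     # assumed fast-media (Optane-like) read latency, microseconds
SLOW_US = 100.0    # assumed slow-media (QLC-like) read latency, microseconds

for hit_rate in (0.50, 0.90, 0.99):
    avg = hit_rate * FAST_US + (1.0 - hit_rate) * SLOW_US
    # Roughly: the 99th-percentile request still lands on slow media
    # unless the fast-media hit rate itself reaches ~99%.
    p99 = FAST_US if hit_rate >= 0.99 else SLOW_US
    print(f"hit rate {hit_rate:.0%}: average {avg:5.1f} us, ~p99 {p99:5.1f} us")

The averages look great well before the tail does, which is why consistent (predictable) performance is the hard part of mixing media.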

I am exploring techniques for building storage tiers that can deliver consistent performance using mixed media like Intel® Optane™ technology and Intel® QLC technology, and that can also provide tiers of performance cost-effectively.

In future blogs, I will explore what I learned by building storage with Intel® Optane™ media in front of an Intel® QLC SSD (hierarchical data), and I will share what I discovered while building a QLC-based storage service with Intel Optane media as part of the storage appliance, but not necessarily directly contributing to the storage of user data.

1 Source: Intel. Based on Intel achieving PRQ status of Intel® SSD D5-P4320 on 13 July 2018.

2 Figures 3 and 4: Source – internal Intel testing.

Performance results are based on testing as of the date set forth in the configurations and may not reflect all publicly available security updates. See configuration disclosure for details. No product or component can be absolutely secure.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit intel.com/benchmarks.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Intel technologies may require enabled hardware, specific software, or services activation. Check with your system manufacturer or retailer.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.


About Anand Ramalingam

Anand is a Principal Engineer with the NVM Solutions Group who has worked in various architecture and pathfinding capacities and has been part of the SSD team since Intel’s very first X25 SSD. Anand has held various architecture and product leadership roles, taking SSDs through seven generations of NAND lithography, including the planar to 3D NAND transition and the SLC -> eMLC -> TLC -> QLC transitions. Anand is an inventor on many foundational patents covering algorithms and media/component behavior that uniquely differentiate technologies and result in differentiated products. Anand is currently leading a team exploring opportunities for cost-optimized QLC technology deployments.