Getting the Most Out of Your vSphere Tiered Memory Configurations

It is not news that the explosive growth of data, the use of data-intensive applications like artificial intelligence and machine learning, and the need to process data in real time are increasing the demand for larger memory. Because DRAM density is not increasing at the same rate as demand, memory is becoming a very significant portion of a VMware vSphere server’s bill of materials, putting cost pressure on data centers.

The deployment of a memory hierarchy is an economical solution to this problem, taking advantage of data locality and the trade-offs in the cost-performance of memory technologies. However, the introduction of new memory technologies is not an easy feat. Intel® Optane™ persistent memory (PMem) is the first commercially viable and scalable new memory technology introduced since NAND Flash Memory was introduced in 1989. Intel Optane PMem provides the density and speed necessary to enable the implementation of a tiered memory solution.

The deployment of Intel Optane PMem in Memory Mode transforms the single-tier memory of the vSphere host into a tiered memory. In this configuration, DRAM serves as the memory cache for the hot data (tier 0), while Intel Optane PMem serves as a large-capacity, affordable main system memory (tier 1). This two-tier memory system is simple to implement and transparent to VMs and VMware vSphere operations.

Intel Optane PMem is offered in 128 GB, 256 GB, and 512 GB DIMMs, and is generally far less expensive per GB than DRAM. At current prices, Intel Optane PMem costs at least 70 percent less per GB than DRAM, creating opportunities for host memory optimization.[i] For example, you can deploy 1 TB of tiered memory for about 31 percent less than a traditional 768 GB DRAM deployment.[ii] Or, you could deploy 2 TB of tiered memory for nearly the same cost as a traditional 1 TB DRAM deployment.[iii] Overall, a tiered memory system using Intel Optane PMem can result in a nearly 20 percent reduction in the cost of a fully configured host with 2 TB of memory.[iv]
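The memory price comparisons above can be reproduced from the unit prices listed in footnotes [ii] and [iii]. The following is a minimal sketch in Python. The helper names (`config_cost`, `percent_savings`) are hypothetical and exist only for illustration, and the prices are the point-in-time figures from the footnotes, which will have changed since.

```python
# Unit prices in USD, copied from footnotes [ii] and [iii] (October 2021).
# Prices change frequently; treat these as illustrative inputs only.
PRICE_RDIMM_16GB = 553.64
PRICE_RDIMM_32GB = 1044.36
PRICE_PMEM_128GB = 1069.54


def config_cost(parts):
    """Return the total cost of a list of (quantity, unit_price) pairs."""
    return sum(quantity * unit_price for quantity, unit_price in parts)


def percent_savings(baseline_cost, candidate_cost):
    """Return how much cheaper the candidate is than the baseline, in percent.

    A negative result means the candidate costs more than the baseline.
    """
    return (baseline_cost - candidate_cost) / baseline_cost * 100


# 768 GB DRAM versus 1 TB tiered memory (256 GB DRAM cache + 1 TB PMem).
dram_768gb = config_cost([(24, PRICE_RDIMM_32GB)])
tiered_1tb = config_cost([(16, PRICE_RDIMM_16GB), (8, PRICE_PMEM_128GB)])

# 1 TB DRAM versus 2 TB tiered memory (512 GB DRAM cache + 2 TB PMem).
dram_1tb = config_cost([(32, PRICE_RDIMM_32GB)])
tiered_2tb = config_cost([(16, PRICE_RDIMM_32GB), (16, PRICE_PMEM_128GB)])

savings_1tb = percent_savings(dram_768gb, tiered_1tb)
savings_2tb = percent_savings(dram_1tb, tiered_2tb)

print(f"1 TB tiered vs. 768 GB DRAM: {savings_1tb:.1f}% savings")
print(f"2 TB tiered vs. 1 TB DRAM:   {savings_2tb:.1f}% savings")
```

With the footnote prices, the first comparison comes out at roughly 30.5 percent savings, and the second shows the 2 TB tiered configuration costing about 1 percent more than 1 TB of DRAM, which matches the "nearly the same cost" description.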

Getting the Most Out of Your Tiered Memory Configurations with Memory Monitoring and Management

Tiered memory was introduced with VMware vSphere 6.7. However, many data center architects are not yet comfortable with this relatively novel approach to memory configuration. Two primary questions arise:

  1. How do I configure the tiered memory?
  2. How do I monitor and manage the tiered memory?

These questions are important, because determining the right size of tier 0 (the DRAM cache) and ensuring that the utilization guidelines are maintained over time are key elements of optimizing cost and performance.

The first question, regarding configuration, has a simple answer: VMware vSphere systems provide an Active Memory counter that corresponds to the Hot Area of a host. The historical Active Memory information is easily obtained from vSphere, either manually or with sizing tools, and can be used as the basis for determining the size of the DRAM tier. The general rule is that the Active Memory used by a host should fit within the DRAM footprint.
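The sizing rule above can be expressed as a simple check against historical Active Memory samples. The following Python sketch is illustrative only: the function names, the sample values, the candidate DRAM sizes, and the optional `headroom_fraction` parameter are all assumptions made for this example and are not part of vSphere or any sizing tool.

```python
def dram_tier_fits(active_memory_samples_gb, dram_gb, headroom_fraction=0.0):
    """Return True if the peak Active Memory (plus headroom) fits within DRAM.

    headroom_fraction is a hypothetical safety margin, e.g. 0.2 for 20 percent.
    """
    peak_gb = max(active_memory_samples_gb)
    return peak_gb * (1 + headroom_fraction) <= dram_gb


def smallest_fitting_dram_gb(active_memory_samples_gb, candidate_sizes_gb,
                             headroom_fraction=0.0):
    """Return the smallest candidate DRAM size that satisfies the rule, or None."""
    for size_gb in sorted(candidate_sizes_gb):
        if dram_tier_fits(active_memory_samples_gb, size_gb, headroom_fraction):
            return size_gb
    return None


# Hypothetical historical Active Memory readings for one host, in GB.
samples_gb = [142.0, 180.5, 201.3, 175.8]

# Hypothetical DRAM tier options, in GB.
candidates_gb = [128, 256, 512]

recommended_gb = smallest_fitting_dram_gb(samples_gb, candidates_gb,
                                          headroom_fraction=0.2)
print(f"Smallest DRAM tier that fits: {recommended_gb} GB")
```

Using the peak rather than the average is a conservative choice for this sketch. A real sizing exercise might use a high percentile instead, depending on how much occasional spill into the Intel Optane PMem tier is acceptable.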

But until now, it was difficult to answer the question regarding monitoring and managing tiered memory. This is because prior to the release of VMware vSphere 7.0U3, few tools were available to monitor tiered memory environments. It was difficult to determine which hosts had tiered memory, how much DRAM cache was left or in use, how much DRAM cache was being used by a VM, or what the read cache miss rate was. However, with VMware vSphere 7.0U3, these questions are more easily answered, because this version provides instrumentation to monitor and manage a tiered memory system.

The new VMware vSphere Memory Monitoring and Remediation (vMMR) tools are fully integrated into the VMware vCenter user interface (UI). vMMR makes it easy to verify whether a host has memory tiering configured and provides details of the configuration, as can be seen in this screen capture.

vMMR Tool Screencapture

When monitoring the performance of a host, it is now possible to correlate Active Memory (hot area), DRAM and Intel Optane PMem bandwidth utilization, and DRAM miss rates (that is, how frequently the system is accessing data from the Intel Optane PMem tier). The following screen capture (accessed through the vSphere Client UI by clicking Host > Monitor > Overview and choosing the Memory tab) shows an example of the type of information that is available from vMMR.

vMMR Memory Performance Dashboard

Data center administrators can use this information to determine whether the host is operating within the recommended guidelines for best performance and use the historical information for capacity planning. This helps to determine when additional DRAM and Intel Optane PMem capacity will be needed and to identify whether further optimization of the DRAM-to-Intel Optane PMem ratios is possible.

An important characteristic of vMMR being integrated with vSphere is that it also issues health alarms if the DRAM occupancy exceeds the default threshold or if the Intel Optane PMem bandwidth exceeds the DRAM bandwidth on a particular virtual machine (VM). In these cases, the information from vMMR can help identify which VMs are consuming most of the DRAM and/or Intel Optane PMem footprint and bandwidth, and determine if a VM should be migrated to another host to rebalance the cluster load.
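The two alarm conditions described above can be sketched as plain comparisons. This Python example does not call any vSphere or vMMR API. The `VmBandwidth` class, the function names, the sample readings, and the 90 percent threshold are all hypothetical, and the actual default threshold used by vMMR is not stated in this article.

```python
from dataclasses import dataclass


@dataclass
class VmBandwidth:
    """Hypothetical per-VM bandwidth reading for each memory tier."""
    name: str
    dram_bandwidth_mbps: float
    pmem_bandwidth_mbps: float


def dram_occupancy_alarm(occupancy_pct, threshold_pct):
    """Return True if DRAM occupancy exceeds the given threshold."""
    return occupancy_pct > threshold_pct


def vms_with_pmem_bandwidth_alarm(vms):
    """Return the names of VMs whose PMem bandwidth exceeds their DRAM bandwidth."""
    return [vm.name for vm in vms
            if vm.pmem_bandwidth_mbps > vm.dram_bandwidth_mbps]


# Hypothetical readings; the threshold value is an assumption for illustration.
vms = [
    VmBandwidth("vm-a", dram_bandwidth_mbps=5200.0, pmem_bandwidth_mbps=900.0),
    VmBandwidth("vm-b", dram_bandwidth_mbps=800.0, pmem_bandwidth_mbps=2100.0),
]

occupancy_alarm = dram_occupancy_alarm(93.0, threshold_pct=90.0)
flagged_vms = vms_with_pmem_bandwidth_alarm(vms)

print(f"DRAM occupancy alarm: {occupancy_alarm}")
print(f"VMs with PMem bandwidth above DRAM bandwidth: {flagged_vms}")
```

A VM flagged by the second check is being served mostly from the Intel Optane PMem tier, which suggests that its hot data no longer fits in the DRAM cache and that it may be a candidate for migration.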

The screen capture below illustrates that it is possible to customize the columns of the VMs tab (1 and 2), add the counters needed to evaluate memory utilization (3), and sort by Active Memory (4). This quickly identifies the top memory consumers (potential noisy neighbors), which can be migrated to another host.

vMMR VM Allocations Screencapture
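The same ranking that the VMs tab provides through column sorting can be sketched on exported counter data. In this Python example, the VM names, the Active Memory values, and the `top_memory_consumers` helper are hypothetical and serve only to illustrate the idea.

```python
def top_memory_consumers(vms, count=3):
    """Return the `count` VMs with the highest Active Memory, largest first."""
    return sorted(vms, key=lambda vm: vm["active_memory_gb"], reverse=True)[:count]


# Hypothetical per-VM Active Memory readings on one host, in GB.
vms = [
    {"name": "vm-db-01", "active_memory_gb": 96.0},
    {"name": "vm-web-01", "active_memory_gb": 12.5},
    {"name": "vm-analytics-01", "active_memory_gb": 148.0},
    {"name": "vm-cache-01", "active_memory_gb": 64.0},
]

candidates = top_memory_consumers(vms, count=2)
for vm in candidates:
    print(f"{vm['name']}: {vm['active_memory_gb']} GB active")
```

The VMs at the top of this list are the potential noisy neighbors to evaluate first when rebalancing the cluster.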

Important: The migration of VMs with large memory capacities is much faster since vMotion was refactored as a part of VMware vSphere 7.0, which changed from 4 KB to 1 GB pages for memory transfers.

In conclusion, vMMR provides an easy-to-use and comprehensive set of tools that can be used to monitor and manage a tiered memory implementation. For more information, watch the Intel Optane PMem for VMware vSphere vMMR VMworld video and visit the Intel Optane technology for VMware solutions page.

________

[i] Prices obtained from https://www.dell.com/en-us/work/shop/cty/pdp/spd/poweredge-r750/pe_r750_14794_vi_vp?configurationid=19732e1e-1fe3-419e-8b40-d091f06c3f51 as of November 1, 2021.
Prices change frequently. Your costs and results may vary.
32 GB DRAM = USD 32.63/GB; 64 GB DRAM = USD 33.43/GB; 128 GB Intel® Optane™ PMem = USD 8.36/GB

[ii] 768 GB DRAM versus 1 TB tiered memory price comparison as of October 14, 2021 at https://www.dell.com. Pricing varies over time.
DRAM configuration: Quantity of 24x 32 GB RDIMMs at a unit price of USD 1,044.36, for a total of USD 25,064.64.
Tiered memory configuration: Quantity of 16x 16 GB RDIMMs at a unit price of USD 553.64, for a total of USD 8,858.24 plus 8x 128 GB Intel® Optane™ persistent memory 200 series DIMMs at a unit price of USD 1,069.54, for a total of USD 8,556.32. The sum of USD 8,858.24 and USD 8,556.32 is USD 17,414.56.

[iii] 1 TB DRAM versus 2 TB tiered memory price comparison as of October 14, 2021 at https://www.dell.com. Pricing varies over time.
DRAM configuration: Quantity of 32x 32 GB RDIMMs at a unit price of USD 1,044.36, for a total of USD 33,419.52.
Tiered memory configuration: Quantity of 16x 32 GB RDIMMs at a unit price of USD 1,044.36, for a total of USD 16,709.76 plus 16x 128 GB Intel® Optane™ persistent memory 200 series DIMMs at a unit price of USD 1,069.54, for a total of USD 17,112.64. The sum of USD 16,709.76 and USD 17,112.64 is USD 33,822.40.

[iv] Testing by Intel as of May 10, 2021. Based on 280 VMs - 4 vCPUs per VM, 8 GB MEM, 125 GB usable storage capacity, up to 1,500 IOPS per VM running a 70/30 32 KB I/O load. Overheads and optimal utilization levels were considered in calculations. Results may vary. CPU cost estimated.

New Configuration: 4 nodes, 2x Intel® Xeon® Gold 6348 processor, (28 cores, 2.6 GHz), total memory = 256 GB (16 slots/32 GB/3200 MT/s), Intel® Hyper-Threading Technology = ON, Intel® Turbo Boost Technology = ON, 2x Intel® Optane™ SSD P5800X (cache) 400 GB and 8x Intel® SSD D7-P5510 3.84 TB (capacity), 1x Intel® Ethernet Adapter E810C 100 GbE, BIOS = 2.1 (ucode = 0x8d055260), VMware vSphere 7.0U2, vSAN 7.0U2, HCIbench 2.5.3, 8x VMs per host, 2x 150 GB vDisks per VM, 100% WSS.

Baseline Configuration: 4 nodes, 2x Intel® Xeon® Gold 6248 processor (20 cores, 2.5 GHz), total memory = 384 GB (12 slots/32 GB/2933 MT/s), Intel® Hyper-Threading Technology = ON, Intel® Turbo Boost Technology = ON, 2x Intel® Optane™ SSD DC P4800X (cache) 375 GB and 8x Intel SSD D7-P5510 3.84 TB (capacity), 1x Intel Ethernet Adapter E810C 100 GbE, BIOS = 2.1 (ucode = 05003003), VMware vSphere 7.0U2, vSAN 7.0U2, HCIbench 2.5.3, 8x VMs per host, 2x 150 GB vDisks per VM, 100% WSS.