24 months of Intel SSDs… What we’ve learned about MLC in the enterprise…
The Enterprise Integration Center (EIC) private cloud lab (a joint Intel IT and Intel Architecture Group program) has been working with Intel SSDs (solid-state disks) for the last two years in a number of configurations, ranging from individual boot/swap volumes for servers to ultra-performance, software-based iSCSI mini-SANs. So, what have we learned about performance, tuning, and use cases?
There are plenty of industry resources and comparisons available at any number of trusted review sites, but most of these revolve around client usage, not server/datacenter uses. From my contact with industry, most engineers seem to think that using an SSD in the datacenter requires an SLC NAND device (Single-Level Cell - the Intel X25-E product) due to endurance requirements. For those new to NAND characteristics, endurance (usable lifetime) is determined by writes to the NAND device, as block-erase cycles stress and degrade the ability of the flash cells to be read back. Basically, SLC devices last through more block-erase cycles than their less expensive and larger capacity MLC cousins (Multi-Level Cell - the Intel X25-M product). The assumption that ‘only SLC will do’ for the enterprise raises the $/GB cost flag and mires the discussion. Endurance is the number one “but those won’t work for my use case” argument.
The EIC cloud lab has some good news here: lower cost MLC, or ‘consumer grade,’ devices can do just as well, especially in RAID arrays. To get the best out of these MLC devices, though, we have to employ a few techniques that allow the drive and its components to function more efficiently. These techniques manipulate MLC’s three vectors (space, speed, and endurance) by altering the usable size of the disk.
Assume I have a 160 GB X25-M MLC drive; this device is spec’ed at 250 MB/s read and 100 MB/s write (sequential) and has a lifetime of around 4-5 years in a ‘consumer’ use case (laptop/desktop). If I were to use this same device as a repository for a database transaction log (lots of writes), the lifetime would shorten significantly, to perhaps as little as a year. There are specific formulas to determine endurance and speed, some of which are not public, but Principal Engineer Tony Roug wraps up the case for MLC in the enterprise quite well in this presentation from Fall 2010 Storage and Networking World.
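Intel’s exact endurance formulas aren’t public, but the shape of the calculation can be sketched with a simple write-budget model. Everything numeric below (P/E cycle count, daily write volumes, write-amplification factors) is an illustrative assumption, not an Intel specification:

```python
# Toy SSD lifetime model: total NAND write budget divided by effective daily writes.
# Real endurance models are more conservative (wear-leveling efficiency, retention
# margin, etc.); all figures here are illustrative assumptions, not Intel specs.

def lifetime_years(capacity_gb, pe_cycles, daily_writes_gb, write_amplification):
    """Estimate drive lifetime in years from a simple write-budget model."""
    write_budget_gb = capacity_gb * pe_cycles                   # total writes the NAND can absorb
    effective_daily_gb = daily_writes_gb * write_amplification  # host writes inflated by the controller
    return write_budget_gb / effective_daily_gb / 365

# 160 GB MLC drive, assumed 5,000 P/E cycles:
light = lifetime_years(160, 5000, 20, 1.1)    # light 'consumer' workload
heavy = lifetime_years(160, 5000, 500, 3.0)   # write-heavy transaction log
print(f"consumer: {light:.1f} years, transaction log: {heavy:.1f} years")
```

Crude as it is, the model makes the same point the drive specs do: the same device can lose an order of magnitude or two of lifetime when moved from a light desktop workload to a sustained, write-amplified transaction log.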
Back to the trade-offs (space, speed, and endurance): my 160 GB MLC drive won’t work for my database transaction log because the workload is too write intensive. What I can do about this is take the 160 GB drive and modify it to use only 75% (120 GB) of the available capacity. Reducing the ‘user’ available space gives the wear-leveling algorithm in the drive more working room and increases both the speed (write speed, as reads are unaffected by this) and the endurance of the drive, but it also increases the $/GB, as you have less available space.
With the ‘user’ space reduced to 120 GB (over-provisioning is the official term), that same 160 GB drive is now capable of 250 MB/s read and 125 MB/s write (sequential) and has a lifetime of 8-10 years in the ‘consumer’ use case. Not terribly appealing to the average end user who just spent $350 on an SSD and lost 25% of the capacity, but in the performance and enterprise space this is huge. Once modified, my ‘consumer grade’ MLC drive gets roughly 75-80% of the speed and endurance of the X25-E SLC drive with 4x the space, at about the same ‘unit cost’ per drive. Since the over-provisioned drive is 4x larger than SLC, will likely last as long as a standard hard disk, has great throughput at 125-250 MB/s, and can reach 100-400x the IO operations of a standard hard drive, we can now begin the discussion of which enterprise applications benefit from Intel MLC SSDs.
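The 160 GB to 120 GB reduction comes down to capping the drive’s user-visible LBA range. Here is a minimal sketch of the sector arithmetic, assuming 512-byte logical sectors and decimal drive gigabytes; the resulting count is the kind of number you would hand to an HPA tool such as `hdparm -N` on Linux (named here only as one example of the LBA tools mentioned below):

```python
# Sector arithmetic for over-provisioning: cap a drive's user-visible capacity.
# Assumes 512-byte logical sectors and decimal (10^9-byte) drive gigabytes.

SECTOR_BYTES = 512

def overprovision_sectors(capacity_gb, user_fraction):
    """Max user-visible sector count that leaves (1 - user_fraction) as spare area."""
    user_bytes = int(capacity_gb * 1_000_000_000 * user_fraction)
    return user_bytes // SECTOR_BYTES

# 160 GB drive trimmed to 75% user space (25% over-provisioning):
sectors = overprovision_sectors(160, 0.75)
print(sectors)                        # LBA count to hand to the HPA/partitioning tool
print(sectors * SECTOR_BYTES / 1e9)   # resulting user capacity: 120.0 GB
```

The spare 25% never disappears; the drive’s wear-leveling and garbage collection simply get exclusive use of it, which is where the write-speed and endurance gains come from.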
For the enterprise, once we overcome the endurance hurdle, the value discussion can begin. For the performance enthusiast at home, this same technique allows a boost in disk write throughput, higher benchmark scores, and of course more FPS (frames per second) in whatever game they are thoroughly stressing their over-clocked water-cooled super-system with at the moment.
BKMs (Best Known Methods) for enterprise and use-case evaluation… AKA: The technical bits…
- Get to know the IO characterization (read/write mix) of the target application and use case
- Baseline the application on standard disks before any SSD upgrade, collecting throughput and utilization metrics
- Knock a maximum of 25% off the top of any MLC drive you’re using in the datacenter
  - More than 25% has diminishing value
  - Use an LBA tool, RAID controller, or partitioning tool after a fresh low-level format
  - The percentage can be smaller based on the write intensity of the target application: fewer writes = less % off the top, on a case-by-case basis
- SAS/SATA RAID controller settings
  - Activate the on-drive cache; this is OK to do on SSDs
  - Use a stripe size of 256 KB if possible, to match the erase-block size of the drive
  - The on-controller DRAM read/write cache should be on and battery backed
- Make sure any drive-to-controller channel relationship on SAS controllers stays at 1:1
  - Avoids reducing drive link speed from 3.0 Gbps to 1.5 Gbps
- Avoid using SATA drives behind SAS expanders
  - Again… avoids reducing drive link speed from 3.0 Gbps to 1.5 Gbps
- SSDs are 5V devices; make sure the 5V rail in the power supplies has a high enough rating to handle the power-on of however many SSDs you install
  - Only necessary if you’re putting 16+ drives in a single chassis
- Baseline the application again after the SSD upgrade, collecting the same throughput and utilization metrics, to determine the performance increase
  - Look for higher IOPS and application throughput, but also watch for higher CPU utilization now that you have eliminated the disk bottleneck from your system
  - There will likely be a new bottleneck in another component (network, memory, etc.); look for it as the target of your next improvement
- Last but not least, when testing an application you’ll need to ‘season’ your SSDs for a while before you see completely consistent results
  - For benchmarks, fill the drive completely 2x, then run the target test 2-3 times before taking final measurements
  - For applications, run the app for a few days to a week before taking final performance measurements
  - Remember, a freshly low-level-formatted SSD doesn’t have to perform a block-erase cycle before writing to disk, so it looks faster than it will in steady state
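The seasoning step above can be sketched as a small script. This is a hedged illustration only: it fills a scratch file rather than a raw device (pointing it at real hardware, with the drive’s full size, is left to the reader), and the 2x fill count comes straight from the benchmark BKM above:

```python
import os

# Sketch of the 'seasoning' procedure: sequentially fill the target twice so the
# drive has performed block-erase cycles before you measure. Shown here against a
# scratch file; on real hardware you'd point this at the device under test.

def season(path, size_bytes, passes=2, chunk=1024 * 1024):
    """Overwrite `path` with `passes` full sequential fills of random data."""
    for _ in range(passes):
        with open(path, "wb") as f:
            written = 0
            while written < size_bytes:
                n = min(chunk, size_bytes - written)
                f.write(os.urandom(n))   # incompressible data defeats any controller-side compression
                written += n
            f.flush()
            os.fsync(f.fileno())         # make sure the writes actually reach the media

season("/tmp/ssd_season.bin", 8 * 1024 * 1024)  # tiny demo size; use the full drive size in practice
```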
Well, that’s it in a fairly large nutshell… We see the use of MLC disks in enterprise use cases growing now that the underlying techniques for increasing endurance are better understood. And as Intel’s product lines and individual device capacities expand, so can the enterprise use cases for these amazing solid-state disks. The question left to answer is, “In your datacenter, are there applications and end users you can accelerate using lower cost MLC-based Intel SSDs?”