I continuously think about the endurance aspect of our products: how SSD users understand it and how they can put it to good use. Sadly, endurance is often underestimated, and sometimes overestimated. I see customers buying High Endurance products for the sake of protection without understanding the real requirements of their application. So that piece of night thoughts goes to my blog.
How do you define SSD endurance?
By definition, endurance is the total amount of data that can be written to the SSD. Endurance can be measured in two different ways:
- The first is TBW (terabytes written), which means exactly what it says: the total amount of data written over the drive's life span. It's estimated individually for every SSD SKU, even within a product line.
- The second is DWPD (drive writes per day). This is a multiplier only, the same for all SKUs in a product line. By saying DWPD = 10 (a high endurance drive), we mean TBW = DWPD * CAPACITY * 365 (days) * 5 (years of warranty). That looks like simple math, but there is more to it: it brings in another dimension, time. I'll explain this later.
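The DWPD-to-TBW conversion above can be sketched in a few lines of Python. The function name and the example drive (a 400 GB SSD rated at 10 DWPD) are illustrative, not taken from any datasheet:

```python
def tbw_from_dwpd(dwpd: float, capacity_tb: float, warranty_years: int = 5) -> float:
    """Total terabytes that may be written over the warranty period:
    TBW = DWPD * capacity * 365 days * warranty years."""
    return dwpd * capacity_tb * 365 * warranty_years

# Hypothetical example: a 400 GB (0.4 TB) high endurance drive, 10 DWPD, 5-year warranty.
print(tbw_from_dwpd(10, 0.4))  # 7300.0 TBW
```

Note that the same DWPD rating yields a different TBW for every capacity, which is why TBW is quoted per SKU while DWPD is quoted per product line.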
Three main factors affect endurance.
- NAND quality. It's measured in the number of Program/Erase (P/E) cycles; better NAND has a higher count. High Endurance Technology NAND is used in the x3700 Series product families, so the NAND in the S3700 and S3500 Series, for example, is physically different. Please take a moment to learn more in the Validating High Endurance on the Intel® Solid-State Drive white paper.
- Workload. Different workload patterns, such as big-block random or small-block random writes, can make up to a 5x difference in endurance. For data center SSDs we use the JESD219 workload set (a mix of small random I/O up to big blocks), which represents the worst-case scenario for the customer. In reality this means that in most usage cases customers will see better endurance in their own environment.
Real life example:
A customer says he uses the drive as a scratch/temp partition, so he thinks he needs the highest endurance SSD. Do you agree that the SCRATCH use case (even with small block accesses) is the worst I/O scenario? Not at all :) First, it's a 50/50 R/W mix: everything we write will be read afterwards. However, the R/W ratio is not nearly as significant a factor for endurance as random vs. sequential access. In this case, scratch files are typically saved in a small portion of the drive and, without threading, the writes are sequential. Even small files are "big" to an SSD.
- Spare area capacity. A bigger spare area allows the SSD to decrease its Write Amplification Factor (WAF). WAF is the ratio of the amount of data written to NAND to the amount of data the host writes to the SSD. It targets 1 if the SSD controller doesn't use compression, but it can never be exactly one due to the NAND structure: we read data in sectors, write in pages (multiple sectors), and erase in blocks (a number of pages). That's a hardware limitation of the technology, but genius engineers were able to control it in FW and make the WAF of Intel SSDs the lowest in the industry.
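The WAF definition above is just a ratio; here is a minimal sketch with invented numbers (the function name and the 1.5 TB / 1.0 TB figures are hypothetical, not measurements from any real drive):

```python
def waf(nand_bytes_written: float, host_bytes_written: float) -> float:
    """Write Amplification Factor: bytes physically written to NAND
    divided by bytes the host asked the SSD to write."""
    return nand_bytes_written / host_bytes_written

# Hypothetical drive that wrote 1.5 TB to NAND while servicing 1.0 TB of host writes:
print(waf(1.5e12, 1.0e12))  # 1.5
```

A WAF of 1.5 means every host write costs 1.5x that amount of NAND wear, which directly shortens the drive's effective TBW.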
Firmware means a lot, doesn't it?
Of course. On top of these three main influencers we add FW tricks, optimizations, and reporting. Two similar SSDs from different vendors are never the same if they have different FW. Let's have a look at the features of our FW:
- SMART reporting – common across the industry. It lets you see the current status of the drive, errors, endurance (TBW to date), and remaining lifetime. That's what every vendor has and absolutely every user needs for daily monitoring.
- Endurance Analyzer – a very special FW feature of Intel DC SSDs. It forecasts the expected lifetime based on the user's workload. It works simply: you reset a specific SMART attribute timer, run your workload for a few hours (better, days), and then read another SMART value which tells you the estimated lifetime in days/months/years for exactly that SSD and exactly your workload. That's an amazing advantage of our products.
How to run Endurance Analyzer?
Definitely it's not rocket science; let me point to this document as the reference. Here are some hints which will help you go through the process more easily. Endurance Analyzer is supported on both Intel Data Center SSD product families: SATA and PCIe NVMe SSDs such as the P3700/P3600/P3500. In the SATA case you need to make sure you can communicate with the drive via SMART commands. That can be a limitation for some specific RAID/HBA configurations where vendors don't support pass-through mode for AHCI commands. In such cases a separate system with SATA ports routed from the PCH (or another supported configuration) should be used. Next, you need the correct SW tool, one capable of resetting the required timer. There are some open source tools, but I advise using the Intel SSD Data Center Tool, which is cross-platform, supports every Intel DC SSD, and can do a lot more than basic management tools. Here are the steps:
1. Reset SMART Attributes using the reset option. This will also save a file that contains the base SMART data. This file is needed, and used, in step 4 when the life expectancy is calculated.
isdct.exe set -intelssd # enduranceanalyzer=reset
2. Remove the SSD and install in test system.
3. Apply minimum 60-minute workload to SSD.
4. Reinstall SSD in original system. Compute endurance using the show command.
isdct.exe show -a -intelssd #
5. Read the Endurance Analyzer value, which represents the drive’s life expectancy in years.
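The projection behind those steps is, at its core, a simple extrapolation. The real calculation lives in the drive firmware and also accounts for write amplification and media wear indicators; the sketch below only illustrates the idea, and every name and number in it is invented:

```python
def projected_life_years(rated_tbw: float,
                         tb_written_during_test: float,
                         test_hours: float) -> float:
    """Extrapolate drive life: if the test-window write rate continued
    forever, how many years until the rated TBW is exhausted?"""
    tb_per_hour = tb_written_during_test / test_hours
    hours_to_wear_out = rated_tbw / tb_per_hour
    return hours_to_wear_out / (24 * 365)

# Hypothetical: a drive rated for 7300 TBW that saw 0.05 TB (50 GB)
# of host writes during a 60-minute test window.
print(round(projected_life_years(7300, 0.05, 1.0), 1))  # 16.7 years
```

This also shows why the hint "a few hours, better days" matters: the longer the test window, the more representative the average write rate, and the more trustworthy the projection.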
Another real life example:
A big travel reservation agency complained to Intel about SSD endurance behind a RAID array, saying it wasn't enough for their workloads: according to the I/O traces under the OS, the drive would need higher endurance to support that many writes. My immediate proposal was to confirm it with the Endurance Analyzer, which shows what is happening at the SSD device level, taking the OS and the RAID controller out of the picture. After running the test for a week (including a work week and a weekend), we got 42 years of expected lifetime on that week's workload. The customer might have been right if he had measured peak workload only and projected it over the whole week, which was not the case for that environment.
Now you understand there are three important factors that affect endurance. We're able to change two of them: the workload profile and the amount of over-provisioning. But don't confuse yourself: you can't make a High Endurance Technology SSD (such as the P3700 or S37x0) out of a Standard or Mid Endurance drive (P3600/S3610, P3500/S35x0). They use different NAND with a different maximum number of program/erase cycles. Luckily, you can use the Endurance Analyzer to make an optimal choice of the exact product and the over-provisioning requirements.
At the end I have another customer story…
Final real life example:
I want to come back to my initial definition of endurance and the two ways to measure it, TBW and DWPD. Look how tricky it is…
Customer A over-provisioned his drive by 30%. He was absolutely happy with the write performance improvement on 4K block writes; he tested it with his real application and confirmed the stunning result. Then he decided to use the Endurance Analyzer to see the endurance improvement estimated in days, so he ran the procedure with a test lasting a few days. He was surprised by the result. Endurance in TBW had increased significantly, but performance had increased too, so now, with 30% over-provisioning, on his workload he was not able to meet the 5-year life span. The only way to avoid that was to set a limit on write performance.
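The trade-off in this story can be put into rough numbers. The sketch below inverts the TBW formula to find the maximum average host write rate a drive can sustain and still last its warranty period; the 7300 TBW figure and the function name are invented for illustration, not taken from the customer's system:

```python
def max_write_rate_mb_s(rated_tbw: float, target_years: float = 5.0) -> float:
    """Average host write rate (MB/s) that would exhaust the rated TBW
    exactly at the end of the target life span."""
    seconds = target_years * 365 * 24 * 3600
    return rated_tbw * 1e6 / seconds  # 1 TB = 1e6 MB

# Hypothetical drive rated at 7300 TBW: to survive a full 5 years it must
# average no more than about this many MB/s of host writes.
print(round(max_write_rate_mb_s(7300)))  # 46
```

This is the DWPD "time dimension" from the beginning of the post: over-provisioning raised both the TBW budget and the achievable write rate, and the rate grew faster than the budget, so the 5-year target slipped away.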
SSD Solution Architect