In Part 1 of this two-part blog series, I looked at using an Intel® Optane™ SSD, specifically the Intel® Optane™ SSD DC P4800X, as fast storage for housing databases in MySQL. In that scenario, the entire contents of the database fit on a single SSD. Here, I will look at using an Intel® Optane™ SSD in conjunction with another primary form of database storage. This could be useful in a couple of scenarios:
- Where the database is too large to be contained on a single SSD and requires multiple SSDs spanning a RAID set, or
- Where a database administrator is looking to accelerate an existing MySQL server configuration without making major changes to their existing configurations (aka a bolt-on upgrade)
The basic premise here is to couple an Intel® Optane™ SSD with Intel® Cache Acceleration Software and create both read and write-back caches in front of an existing database volume. What if that database volume were already all-SSD? Could it still be demonstrably accelerated? I set the following guardrails before conducting any experiments:
- The testing should include a mix of read and write database operations. Because the media types involved have different read and write performance characteristics, the database workload must exercise both (this matters all the more because one of the configurations places a new class of SSD in front of the actual database storage, caching reads and buffering writes). From a database perspective, reads typically involve selecting one or more rows, and writes involve row inserts, updates, and deletes (along with index updates). The rule of thumb for the split of reads to writes in an OLTP environment is 70/30. Fortunately, Sysbench provides hooks for such a workload. What is interesting, and will stand out later, is that the testing front-loads all of the read operations ahead of the write operations.
- Larger databases typically require a RAID set to support volumes of sufficient size. RAID controllers are still commonplace in servers within enterprise IT environments, so the RAID set for housing the database files was hardware-based with SSDs from the Intel® Solid State Drive Data Center Family for SATA.
Base hardware setup:
- Intel® Server System R2208WT2YS 2U server
- Dual Intel® Xeon® E5-2699v4 (22 Cores @ 2.2 GHz, 44 Cores total)
- Intel® Hyper-Threading Technology Enabled (88 Threads total)
- Intel® Server Board S2600WT2
- 128 GB DDR4 RAM
- Boot Drive: 1x Intel® SSD DC S3500 (240 GB, 2.5”)
- RAID Controller: 1x Intel® RAID Controller RS3DC080
- 3x Intel® SSD DC S3520 (1.6 TB, 2.5”) in a RAID 5 (for performance baseline)
- 3x Intel® SSD DC S3520 (1.6 TB, 2.5”) in a RAID 5 (for acceleration testing)
- 1x Intel® Optane™ SSD DC P4800X (375 GB, Add-in card) used as a cache device
The software stack used during testing:
- CentOS 7.3
- MySQL Server 5.7.17
- Intel® Cache Acceleration Software 3.1
- SysBench 0.5 (configured with a 1.5 TB database containing 96 tables, run with 32 threads using a Gaussian distribution)
- Inbox NVMe* drivers
- EXT4 file system
The test and results:
Testing was performed using two instances of MySQL running on the same server. The databases for each of the MySQL instances were stored on individual RAID 5 volumes built with three SATA SSDs (Intel® SSD DC S3520 Series). Both MySQL instances were staged with the same reference database generated by Sysbench. Intel® Cache Acceleration Software using an Intel® Optane™ SSD DC P4800X was configured as both a read cache and write-back cache for one of the two MySQL instances. A read-only workload was performed against the accelerated instance to warm the cache. Both instances were then tested in parallel, with Sysbench issuing 1,000,000 transactions against each. The “race” to 1,000,000 transactions is illustrated in the picture below.
The accelerated MySQL instance (in orange) completed 1,000,000 transactions in 350 seconds, whereas the baseline MySQL instance (in blue) needed 1,960 seconds to complete the same workload. To put that in perspective, the accelerated instance needed only 18% of the time the baseline instance needed. Taking the inverse of that, the accelerated instance ran roughly 5.6 times as fast as the baseline instance.
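The arithmetic behind those two figures is easy to check. The short sketch below uses only the run times and transaction count reported above. The variable names are mine, and the average transaction rate is a derived figure, not something read off the graph.

```python
# Completion times reported in the post, in seconds.
accelerated_s = 350
baseline_s = 1960

# Both instances ran the same number of transactions.
transactions = 1_000_000

# Fraction of the baseline run time that the accelerated instance needed.
time_fraction = accelerated_s / baseline_s

# Speedup is the inverse of that fraction.
speedup = baseline_s / accelerated_s

# Average transaction rate over each whole run (derived, not measured directly).
accelerated_tps = transactions / accelerated_s
baseline_tps = transactions / baseline_s

print(f"time fraction: {time_fraction:.0%}")
print(f"speedup: {speedup:.1f}x")
print(f"average transactions/s: accelerated {accelerated_tps:.0f}, baseline {baseline_tps:.0f}")
```

Note that the ratio of the two average rates is the same 5.6, since both runs cover the same number of transactions.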
Because both instances run the same workload, it stands to reason that the area under both curves is equal, meaning that on average, the accelerated instance sustains a noticeably higher rate of queries per second. However, the accelerated graph is not simply a smooshed (technical term) version of the baseline graph. Drawing attention to the baseline instance, there are two very distinct performance cliffs where the queries-per-second rate drops noticeably (within the first minute, and again around the 840-second mark). In the accelerated instance, there is only one appreciable cliff, which occurs in the first 30 or 40 seconds of the test run. The queries-per-second rate is then contained in a fairly narrow band until the end of the run.
What could be causing this? Recall that above I mentioned that Sysbench front-loads all of the read activities. Specifically, the complete order of transaction types goes: point selects (single row), range selects (multiple rows), sum range selects (multiple rows), order range selects (multiple rows), distinct range selects (multiple rows), and then row updates, row deletions, and finally row insertions. I attribute the first cliff in each instance to the transition from point selects, which involve one row, to the other selects that involve multiple rows (therefore requiring more I/O). I suspect that in the baseline instance, the transition to write operations around the 840-second mark is what introduces another drop in performance. I will have to add some debug statements to the scripts to mark the transitions to different query types in a future version of this.
Why is this second cliff not observed in the accelerated curve? The answer likely lies in the near parity in the read/write performance of the Intel® Optane™ SSD DC P4800X. Simply put, the writes are being buffered by an SSD whose write performance far exceeds that of the RAID set behind it.
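As a rough idea of what marking those transitions could look like, here is a minimal sketch. The order of query types is the one listed above. The category labels and the `transition_markers` helper are hypothetical names of my own and are not part of Sysbench or its scripts.

```python
# Order of query types as listed in the post. The category labels
# ("single-row read", "multi-row read", "write") are this sketch's own.
PHASES = [
    ("point selects", "single-row read"),
    ("range selects", "multi-row read"),
    ("sum range selects", "multi-row read"),
    ("order range selects", "multi-row read"),
    ("distinct range selects", "multi-row read"),
    ("row updates", "write"),
    ("row deletions", "write"),
    ("row insertions", "write"),
]


def transition_markers(phases):
    """Return one debug line for each point where the query category changes."""
    markers = []
    previous = None
    for index, (name, category) in enumerate(phases):
        if category != previous:
            markers.append(f"step {index}: entering '{category}' at '{name}'")
            previous = category
    return markers


for line in transition_markers(PHASES):
    print(line)
```

The sketch reports three category changes: the start of the point selects, the move to multi-row selects, and the move to writes. The last two are the transitions I suspect line up with the two cliffs in the baseline curve. Timestamps logged at those points would confirm or refute that.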
As a bolt-on solution, this experiment shows the potential for significant performance gains when using the Intel® Optane™ SSD DC P4800X paired with Intel® Cache Acceleration Software in front of an existing, large database. Reducing workload run times leads to faster time to results, along with the ability to refresh those results more frequently. The other interesting twist here, at least in this configuration, is that if a workload were to become more write-biased, the acceleration benefits would increase due to the observed delta in write performance between the two configurations.
You may be wondering about the third use case I mentioned for Intel® Optane™ SSDs that was illustrated in my first blog in this series: extending memory. If you are curious about that use case, I encourage you to check out Andrey Kudryavtsev’s blog on that topic. Lastly, an acknowledgment is in order, as this blog would not have happened without all the help and contributions from Dave Leone. Thanks, Dave!
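For readers less familiar with the write-back idea, the toy model below shows why a cache device can absorb writes before the slower volume ever sees them. It is a conceptual sketch only and says nothing about how Intel® Cache Acceleration Software is implemented. The `WriteBackCache` class and its counters are hypothetical, plain dictionaries stand in for both devices, and eviction is left out entirely.

```python
class WriteBackCache:
    """Toy write-back cache in front of a slower backing store."""

    def __init__(self, backing_store):
        self.backing = backing_store  # stands in for the RAID 5 volume
        self.cache = {}               # stands in for the fast cache device
        self.dirty = set()            # blocks written to cache, not yet flushed
        self.backing_reads = 0
        self.backing_writes = 0

    def read(self, block):
        # On a miss, fetch from the slow volume and keep a copy in the cache.
        if block not in self.cache:
            self.backing_reads += 1
            self.cache[block] = self.backing[block]
        return self.cache[block]

    def write(self, block, data):
        # The write completes once the cache holds it; the volume is untouched.
        self.cache[block] = data
        self.dirty.add(block)

    def flush(self):
        # Destage dirty blocks to the slow volume at some later point.
        for block in sorted(self.dirty):
            self.backing[block] = self.cache[block]
            self.backing_writes += 1
        self.dirty.clear()


volume = {n: f"row-{n}" for n in range(4)}
cache = WriteBackCache(volume)

cache.read(0)
cache.read(0)                      # second read is served from the cache
cache.write(1, "updated")
cache.write(1, "updated again")    # overwrites coalesce in the cache
writes_before_flush = cache.backing_writes
cache.flush()

print(cache.backing_reads, writes_before_flush, cache.backing_writes)
```

In this model, two writes to the same block cost the backing volume a single write, and only when the cache is flushed. A real cache also has to deal with capacity limits, eviction, and crash consistency, none of which appear here.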
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SysBench, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Source: Internal Testing
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/performance.
Intel, the Intel logo, and Optane are trademarks of Intel Corporation in the U.S. and/or other countries.
Copyright © 2017 Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.