Dual parallelism: Intel NVMe SSD & Oracle TimesTen

The benefit of running a database's update logging on SSD has been well documented, especially for high-performance operational and real-time analytics workloads where update speed is key. Where Hard Disk Drives (HDDs) are still often used, even for In-Memory Databases (IMDBs), is the persistence layer, by which I mean the actual DB files or checkpoint recovery files. One reason for staying on HDD is capacity: requirements are high, and the lower cost per GB of HDDs has kept many deployments on the old technology rather than SSDs. Current data center SSDs excel at GB density (in the 2.5-inch small form factor, or SFF), power efficiency, noise, heat and, of course, IOPS per dollar spent. Even so, performance requirements are often the main criterion engineers look at in a quick evaluation of whether HDD or SSD should be used in a given tier or area of a database solution.

Another overlooked area is the operational efficiency of the database for engineers and database admins. There are common, regular tasks that require faster storage even for an IMDB, which operates primarily in DRAM but persists to some storage medium. These tasks include database startup and shutdown (especially when DB instances are started elastically in the cloud), data backup and recovery, and migration of database data for staging and transformation. All of them get a serious boost from Intel NVMe SSDs, where bandwidth is measured in GB/sec; a single NVMe SSD can even surpass the bandwidth of up to 20 high-performance HDDs, as seen in Christian Black’s recent blog on NVMe SSD efficiency.

So how does the parallelism carry all the way up the stack to the server software, which has to be fully initialized before the first user connection can be serviced? This is where IMDBs that can parallelize their I/O and leverage the parallel NVMe driver features available with Intel’s new P3700 SSD can really shine. Intel partnered with Oracle, and we studied and tested the new release of Oracle TimesTen, which ships a parallel checkpoint recovery feature. This feature lets you configure I/O threading against the checkpoint recovery files (the data persistence layer) and load a DB of hundreds of GB much faster than before. Below are the findings for a 100GB database built using HammerDB’s TPC-C DB creation script. Parallel loading really shines even compared with standard (SAS/SATA-based) SSDs, which carry the added expense of a DRAM-backed SAS storage controller that you would not need with a PCIe SSD. Be aware that vendors such as Intel are using the PCIe slots in servers to offer larger SSD capacities per slot, so we can take you deep into the TB-per-device realm that all of us data nerds crave. We hear your need for many TBs per device!
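To make the idea of threaded I/O against a persistence file concrete, here is a minimal Python sketch. It is not TimesTen's implementation and uses no TimesTen API; the function names (`read_chunk`, `parallel_load`), the thread count and the chunk size are all hypothetical choices for illustration. It assumes a Unix-like system, since it relies on `os.pread`, which lets several threads read from one file descriptor at independent offsets.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_chunk(fd, offset, length):
    """Read up to `length` bytes starting at `offset` (hypothetical helper).

    os.pread may return fewer bytes than requested, so loop until the
    chunk is complete or end of file is reached.
    """
    parts = []
    remaining = length
    while remaining > 0:
        data = os.pread(fd, remaining, offset)
        if not data:  # end of file
            break
        parts.append(data)
        offset += len(data)
        remaining -= len(data)
    return b"".join(parts)


def parallel_load(path, threads=4, chunk_size=1 << 20):
    """Load a whole file into memory using several reader threads.

    `threads` and `chunk_size` are illustrative defaults, not tuned values.
    """
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offsets = range(0, size, chunk_size)
        with ThreadPoolExecutor(max_workers=threads) as pool:
            # pool.map yields results in submission order, so the chunks
            # are reassembled in file order even if reads finish out of order.
            chunks = pool.map(
                lambda off: read_chunk(fd, off, min(chunk_size, size - off)),
                offsets,
            )
            return b"".join(chunks)
    finally:
        os.close(fd)
```

Calling `parallel_load("some_file", threads=8)` issues up to eight reads at once. On a single HDD, concurrent requests mostly compete for the same head, whereas an NVMe device is designed to service many outstanding requests in parallel, which is the property a parallel recovery feature can exploit.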

Here are our testing results:

Storage configuration | Database size | Startup time | Speedup vs. baseline
--- | --- | --- | ---
HDD (1x enterprise drive) | 100GB | 31 minutes | Baseline
HDD (4x 10K RPM drives, RAID 5) | 100GB | <13 minutes | 2-3x faster
HDD (4x 15K RPM drives, RAID 0) | 100GB | <11 minutes | 2-3x faster
SATA SSD (1x, 6 Gbps) | 100GB | <4 minutes | Up to 10x faster
Intel NVMe* PCIe* P3700 SSD | 100GB | <1 minute (50 sec) | Up to 37x faster
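As a quick sanity check on the speedup column, the ratios can be recomputed from the startup times. The sketch below uses only the figures reported above; the dictionary keys and function names are hypothetical labels. Entries reported as "<N minutes" are entered as N minutes, so those ratios are lower bounds. The throughput figure assumes the full 100GB is read during startup and treats 1GB as a decimal unit, neither of which is stated in the results.

```python
BASELINE_SECONDS = 31 * 60  # single enterprise HDD: 31 minutes
DB_SIZE_GB = 100            # database size used in the test

# Startup times from the results; "<N minutes" is entered as N minutes.
startup_seconds = {
    "HDD (1x enterprise drive)": 31 * 60,
    "HDD (4x 10K, RAID 5)": 13 * 60,
    "HDD (4x 15K, RAID 0)": 11 * 60,
    "SATA SSD (1x, 6 Gbps)": 4 * 60,
    "Intel NVMe PCIe P3700": 50,
}


def speedup(seconds):
    """How many times faster than the single-HDD baseline."""
    return BASELINE_SECONDS / seconds


def effective_gb_per_sec(seconds):
    """Average load rate, assuming the whole database is read at startup."""
    return DB_SIZE_GB / seconds


for name, secs in startup_seconds.items():
    print(f"{name}: {speedup(secs):.1f}x, {effective_gb_per_sec(secs):.2f} GB/sec")
```

For the P3700 row this gives 1860 / 50 = 37.2, matching the "up to 37x" figure, and an average of 2 GB/sec, in line with the GB/sec bandwidth mentioned earlier.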

Note 1: The system used for this test was a Dell R720 server with two Intel Xeon E5-2690 v2 CPUs @ 3.00GHz (10 cores and 20 threads per CPU), using a Dell PERC H710P onboard controller with Fast Path (dual core technology) enabled.

Note 2: For a fuller evaluation of whether SSDs come out ahead on other cost vectors, such as dollars per I/O operation, try the Intel SSD TCO tool at this link: https://www-ssl.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-ssd.html

Note 3: NVMe is a storage host controller protocol, implemented in a host device driver, for today’s and tomorrow’s memory-based storage devices. PCIe (version 3) is the high-performance I/O hardware bus that most of today’s highest-performing SSDs are built for.

I welcome your comments...