Intel SSD generational speed – in the Oracle context

With Oracle Database (DB), the speed of a “transaction commit” depends on the speed of a process called LGWR (ora_lgwr_orcl on Linux for a SID of orcl), which writes the redo log. This sequential logging of updates keeps your database durable and recoverable. Users will not be able to see the updates until the log write and the commit have finished. Virtually all transactional databases that require durability and recoverability use some logging method, including most durable In-Memory Databases.

Steve Shaw and I have been collaborating on analyzing Oracle transaction commits with a Linux tool called strace (system call tracer), and we find it interesting to compare some strace data from LGWR across the different generations of Intel SSDs. In 2014 Intel will refresh its PCIe interface-based drives with a very light storage protocol built for Flash and the Non-Volatile Memories (NVM) of the future. This emerging storage protocol is called NVMe, an industry-backed standard that lets Flash-based SSDs use a more efficient protocol layer. With NVMe-based PCIe drives you will not only be on a more efficient protocol stack, you will also be removing the SAS or SATA hardware interface. It is good to look back as we look forward, so here is a review of some generations of higher-end Data Center products that a Database Administrator (DBA) might use for log writing in their database.

Steve posted the Intel SSD X25-E and 910 data, and our NSG lab system provided the DC S3700 data.

Intel SSD X25-E

io_submit(140566032744448, 1, {{0x7fd8125172e8, 0, 1, 0, 256}}) = 1 <0.000022>

io_submit(140566032744448, 1, {{0x7fd811e8ab08, 0, 1, 0, 256}}) = 1 <0.000020>

...

~22 microseconds

Intel SSD 910

io_submit(140566032744448, 1, {{0x7fd811e4c540, 0, 1, 0, 257}}) = 1 <0.000237>

io_submit(140566032744448, 1, {{0x7fd811e4cb50, 0, 1, 0, 257}}) = 1 <0.000227>

...

~230 microseconds

Intel SSD DC S3700

io_submit(140611364085760, 2, {{0x7fe2a0cce2e0, 0, 1, 0, 21}, {0x7fe2a0ccde60, 0, 1, 0, 21}}) = 2 <0.000057>

io_submit(140611364085760, 2, {{0x7fe2a0ccde60, 0, 1, 0, 21}, {0x7fe2a0cce2e0, 0, 1, 0, 21}}) = 2 <0.000055>

...

~55 microseconds

syntax: strace -T -p {pid of ora_lgwr_SID} -o {output filename}
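To turn a trace file into a summary rather than reading it by eye, the `<...>` field that `strace -T` appends to each line can be parsed with a short script. The following is a minimal Python sketch, not a tool we used for this post: the function names and the regular expression are assumptions for illustration, and the numbers it reports are simply the time spent inside each `io_submit` call as strace recorded it.

```python
import re
import statistics

# Matches a successful io_submit line from `strace -T`, e.g.
#   io_submit(140566032744448, 1, {{...}}) = 1 <0.000022>
# group 1 = number of requests accepted, group 2 = seconds spent in the call.
IO_SUBMIT_RE = re.compile(r"\bio_submit\(.*\)\s*=\s*(\d+)\s*<(\d+\.\d+)>")


def io_submit_latencies_us(lines):
    """Return the elapsed time of each successful io_submit call, in whole microseconds."""
    latencies = []
    for line in lines:
        match = IO_SUBMIT_RE.search(line)
        if match:  # failed calls (= -1 ...) and other syscalls are skipped
            latencies.append(round(float(match.group(2)) * 1_000_000))
    return latencies


def summarize(latencies):
    """Summarize a non-empty list of latencies given in microseconds."""
    return {
        "calls": len(latencies),
        "min_us": min(latencies),
        "median_us": statistics.median(latencies),
        "max_us": max(latencies),
    }


# Two of the X25-E lines quoted above, used as sample input.
SAMPLE = [
    "io_submit(140566032744448, 1, {{0x7fd8125172e8, 0, 1, 0, 256}}) = 1 <0.000022>",
    "io_submit(140566032744448, 1, {{0x7fd811e8ab08, 0, 1, 0, 256}}) = 1 <0.000020>",
]

print(summarize(io_submit_latencies_us(SAMPLE)))
```

To run it against a real trace, pass an open file object for your strace output file to `io_submit_latencies_us` instead of `SAMPLE`. The median is reported alongside min and max because a handful of slow calls can skew an average.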

Some post notes regarding Oracle log writing:

  • LGWR uses async IO in most deployed Linux environments, but it must wait to keep things serial, since a database transaction log is a serial concept.
  • LGWR writes on a 4k block boundary only if you follow the special instruction of setting up Oracle log writing to use a 4k block size for the log buffer, and hence for the writes. Steve Shaw has written a popular blog post on how to set up your log buffers to use 4k blocks.
  • Benchmarks often track a statistic called redo wastage, which is the unused space within the written blocks. It arises because no two log write I/Os carry the same transaction or an identical payload size. The takeaway is that a lot of redo wastage is not indicative of a problem. Measure the commit and transaction rates you are getting on your system instead; they tell you whether you are benefiting from the faster transaction speeds that SSDs provide.
  • LGWR itself is a 100% write workload, but archiving this data for recovery means that the drive does not necessarily see a pure write pattern. As is common in IT scenarios that copy data for recovery, a drive rarely sees "only" write behavior.
  • Oracle can multiplex redo log files, so a specific RAID solution is not mandatory.
  • It's important to realize that Hard Disk Drive technology cannot achieve microsecond performance metrics; its write operations take milliseconds, which is far slower. A hard disk drive of supreme quality and 15,000 RPM performance will be price-equivalent with most MLC SSDs today. Do your own research on this; you will be surprised.
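The microsecond versus millisecond point can be checked with simple arithmetic. The sketch below, in Python, is a rough illustration under stated assumptions: it counts only the average rotational latency of a 15,000 RPM disk (half a revolution) and ignores seek time and any controller write cache, and it compares that against the approximate `io_submit` times from the strace samples earlier in this post. The function names are made up for the example.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds: the time for half a revolution."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def slowdown(hdd_ms, ssd_us):
    """How many times longer hdd_ms (milliseconds) is than ssd_us (microseconds)."""
    return hdd_ms * 1000 / ssd_us


# Approximate io_submit times from the strace samples earlier in this post.
SSD_SAMPLES_US = {"X25-E": 22, "910": 230, "DC S3700": 55}

hdd_ms = avg_rotational_latency_ms(15_000)  # 4 ms per revolution, so 2 ms on average
for drive, ssd_us in SSD_SAMPLES_US.items():
    ratio = slowdown(hdd_ms, ssd_us)
    print(f"{drive}: rotational latency alone is about {ratio:.0f}x the sampled time")
```

Because seek time is left out, the hard disk figure here is closer to a best case than a typical one, so treat the ratios as a floor rather than a measurement.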

Some notes on Intel SSD:

The X25-E was an SLC (single-level cell) drive, which means it favored performance over price, and it is considered a standard of excellence 5 years in the making. The important things to think about with Generation 3 Intel SSDs such as the DC S3700 Series are their amazing endurance, which DBAs will appreciate since they "own the data", and the IO consistency of the third-generation controller technology, which the X25-E did not have. These data samples from the Intel labs do not give you a picture of that IO consistency. Measuring consistency over a prolonged time is important to your user experience and the overall performance of your Data Center applications.

Amazing things are possible when you achieve microsecond IOs in user space.


About Frank Ober

Frank Ober is a Data Center Solutions Architect in the Non-Volatile Memory Group of Intel. He joined 3 years back to delve into use cases for the emerging memory hierarchy after a 25-year Enterprise Applications IT career spanning SAP, Oracle, Cloud Manageability and other domains. He regularly tests and benchmarks Intel SSDs against application and database workloads, and is responsible for many technology proof point partnerships with Intel software vendors.