Many of you who are interested in the subject of optimizing platforms for database performance will be familiar with Kevin Closson's blog on platforms, databases and storage. Kevin knows his subject incredibly well, so I was interested in the subject of a recent post discussing the topic of whether you should put your Oracle database redo logs on solid state disks (SSDs). Kevin also refers to a post by Guy Harrison, where Guy's concludes, "I’m yet to see any evidence that putting redo logs on SSD is a good idea". This got me really interested because for the last couple of years, I have put my Oracle redo logs or SQL Server transaction log on nothing but SSDs! Maybe I've been missing something here, and with the increase of customers testing their environments on SSDs, I decided to put it to the test.
In our labs we have a 4 socket Xeon E7 based server, which I hooked up to both SSD and regular Hard Disk Drive (HDD) storage. To be precise for the redo SSD storage, I configured an Intel® RAID Controller RS2BL080 with 8 x Intel X25-E Extreme SATA SSD (model SSDSA2SH064G1GC) in a RAID 0 configuration. For the HDD redo, I used a well-known brand of storage array configured with 15 x 15KRPM Seagate Cheetah FC 4Gb/s hard drives, the host was connected by an Emulex Zephyr Lighpulse Fibre Channel Host Adapter to the HDD storage array. Again, all 15 drives were configured as RAID 0.
Note that this is not a like-for-like comparison and there are some differences between them, obviously there were 15 HDDs compared to 8 SSDs but also for example the RS2BL080 has 512MB of cache compared to the 4GB on the storage array, however I was not aiming to definitively do a 'which is better' test, the aim was to see whether putting redo on SSD (as I have been doing to date) is a good idea or not using the HDD confugration as a control. Also the aim was not to prepare any special configuration, all the options for configuring the storage were taken at the default for every step of the way.
On to the test I used Oracle Linux 5.6 and Oracle 220.127.116.11, and created a database with the data storage area on ASM, the data storage area was also on duplicate SSD storage with the same hardware configuration as the SSD redo area. Then, I configured redo logs on both the SSD and HDD redo area, first with raw devices and then with ASM. I used Hammerora for the workload and configured the schema for the in-built OLTP test. After that, I created multiple sessions without keying and thinking time and ran 4 timed tests on SSD/RAW, HDD/RAW, SSD/ASM, HDD/ASM collecting the AWR reports as I went. As expected, the tests generated a significant amount of redo up to 50GB over a 5 minute period. The Oracle transaction rate was in the multiple tens of thousands per second (so in the millions per minute). The redo log files were configured large enough so there was no log file switches encountered during each test to interrupt the redo.
Looking at the AWR report for "IOStat by Filetype" section showed the following results:
Looking at the actual Iostat information recorded during the ASM tests showed the following (which corresponds to the ASM data):
So, the redo on the 8 x SSD drives is writing 1.28X more data per second and doing 6.4X the writes/second although the avgrq-sz shows that the HDD configuration is writing more data for each operation. However, the avwait, svctm and %util show the the HDD configuration is busier and responding slower. The log file sync times in the Top 5 Timed Events looked as follows:
|Event||Waits||Time(s)||Avg wait (ms)||% DB time||Wait Class|
|log file sync||6,700,242||4,059||1||16.05||Commit|
|Event||Waits||Time(s)||Avg wait (ms)||% DB time||Wait Class|
|log file sync||5,366,376||12,709||2||46.83||Commit|
It has to be said that both are pretty good but not surprisingly from this redo related data when using the redo on the 8 x SSD drives the system is processing 1.25X the number of transactions as it does when the redo is on the 15 x HDD drives for exactly the same high throughput workload. ASM does have more impact on the SSD configuration than the HDD configuration but not significantly in this particular case with 18.104.22.168.
What about changing the redo block size? You can check your disk sector size in (for my case the SSD disk as sdb) /sys/block/sdb/queue. Note as this is a LUN this is presented by the RAID card although should be related to the physical disk sector size which in the case of the Intel X25-E as shown in the datasheet is 512 bytes
[root@wesex1 queue]# pwdn
[root@wesex1 queue]# cat physical_block_sizen
[root@wesex1 queue]# cat logical_block_sizen
Unless this is different from the standard 512 bytes Oracle will not permit you to change it and with the SSD configuration used the default options are the correct ones.
SQL> alter database add logfile group 1 '+REDO' size 90g blocksize 4096;n
alter database add logfile group 1 '+REDO' size 90g blocksize 4096n
ERROR at line 1:n
ORA-01378: The logical block size (4096) of file +REDO is not compatible withn
the disk sector size (media sector size is 512 and host sector size is 512)nn
SQL> alter database add logfile group 1 '+REDO' size 90g blocksize 512;Database altered.
Not too surprisingly, with the same system and disks very similar results are found with SQL Server on Windows and the Transaction Log on NTFS on SSDs - a different operating system and database but still excellent performance from exactly the same storage configuration.
In conclusion, should you put your database redo and transaction logs on SSDs a good idea? According to my findings yes. I suggest using the Intel X25-E Extreme SATA Solid-State Drive.
In this case, why does there seem to be a difference in opinion? Maybe I should rephrase my conclusion in that what I found precisely was that putting database redo and transaction logs on a RAID configuration of a small number of Intel X-25E SSDs gave me excellent database throughput. However, not all SSDs are the same and so I can't draw any conclusions about any other type of flash storage being suitable for redo or transaction logs from these tests. My recommendation is always to test your own setup.