Intel Optane SSDs are ultra-fast, and we want to share a few things about Linux that will help you get the most out of one of the world’s fastest SSDs. Optane is an SSD that can achieve sub-10-microsecond response times and operate as Software Defined Memory, so a whole new world of application use cases is now evolving around this device.
There are just a few things to know and do before you run that first fio script against the device. The setup is fast and easy, so you can quickly get back to your application work. Ideally you will build an fio script that matches the needs of your own application; this post is a simple helper to get you started.
Optane SSDs perform best in a newer architecture (with Intel Xeon Scalable Processors), and a higher-performance processor with a base frequency near 3.0 GHz is recommended, but not required. Optane will of course work with slower CPUs; you simply won’t see as much throughput from a single worker. The P4800X is simply an NVMe SSD, so any x4-capable PCIe 3.0 slot will work fine for connectivity. The drive is also available in a U.2 form factor alongside the add-in card, so choose an NVMe-capable server with front drive bays if you wish. What else do we recommend in Linux specifically?
Steps to improving performance of Intel SSDs on Linux OS
Step 1: Put your CPUs in performance mode
# for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do [ -f $CPUFREQ ] || continue; echo -n performance > $CPUFREQ; done
Ensure the CPU scaling governor is in performance mode by checking the following:
# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
This command should return performance for every CPU.
Don’t forget to make this setting persistent between reboots by changing your Linux restart configuration (rc).
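One way to persist the setting is a small systemd oneshot unit. This is a sketch assuming a systemd-based distribution; the unit name cpu-performance.service is our own choice, not a standard name:

```shell
# Sketch: persist the performance governor with a systemd oneshot unit.
# Assumes a systemd-based distribution; the unit name is our own choice.
cat > cpu-performance.service <<'EOF'
[Unit]
Description=Set CPU scaling governor to performance

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo performance > "$g"; done'

[Install]
WantedBy=multi-user.target
EOF
# Then, as root:
#   cp cpu-performance.service /etc/systemd/system/
#   systemctl enable --now cpu-performance.service
```

An rc.local entry works just as well on distributions that still use it; the unit above simply re-applies the governor on every boot.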
Step 2: Disable IRQ balance
In kernels before 4.8, IRQ balancing was not managed as efficiently as it is now by the in-box Linux nvme driver. So if you are on a kernel older than 4.8, please turn off the irqbalance service and run a short script (below) to balance your IRQs and allow for the best I/O processing possible. Here is how to do this on Ubuntu and CentOS.
On Ubuntu, set ENABLED="0" in /etc/default/irqbalance. On CentOS, you can disable the service with the following command:
# systemctl disable --now irqbalance
In CentOS 7, you can use these steps to stop or make it permanent between reboots.
# systemctl stop irqbalance
# systemctl status irqbalance
It should show Active: inactive (dead) on the third line.
Now to make this permanent.
# chkconfig irqbalance off
# chkconfig irqbalance
It will show “disabled”.
Step 3: Set up SMP affinity
Here is a bash script to set SMP affinity.
#!/bin/bash
# Copy each NVMe IRQ's kernel-suggested affinity_hint into smp_affinity.
for folder in /proc/irq/*; do
    for file in "$folder"/*; do
        if [[ $file == *"nvme"* ]]; then
            echo "$file"
            cat "$folder/affinity_hint" > "$folder/smp_affinity"
            cat "$folder/smp_affinity"
        fi
    done
done
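If you want to see what the script does before touching /proc, the same affinity_hint-to-smp_affinity copy can be dry-run against a mock directory tree. The IRQ number, file names, and mask values below are fabricated purely for illustration:

```shell
# Dry-run of the affinity_hint -> smp_affinity copy on a mock tree;
# all paths and mask values here are made up for illustration.
PROC_IRQ=$(mktemp -d)
mkdir "$PROC_IRQ/33"
touch "$PROC_IRQ/33/nvme0q1"                  # marks the IRQ as NVMe-owned
echo 00000004 > "$PROC_IRQ/33/affinity_hint"  # kernel's suggested CPU mask
echo 00000001 > "$PROC_IRQ/33/smp_affinity"   # current (stale) mask

for folder in "$PROC_IRQ"/*; do
    for file in "$folder"/*; do
        if [[ $file == *"nvme"* ]]; then
            cat "$folder/affinity_hint" > "$folder/smp_affinity"
        fi
    done
done

cat "$PROC_IRQ/33/smp_affinity"   # the hint mask has been applied
```

Running it against the real /proc/irq tree (as root) simply performs the same copy for every IRQ owned by the nvme driver.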
Step 4: Choose appropriate ioengine, and I/O polling mode if available
Now, how do you check that your configuration works? The most critical performance shows itself at QD1 (queue depth 1) with just one worker thread. You can run this test with any ioengine; that said, Intel uses polling mode via the pvsync2 ioengine with the hipri option. This ioengine requires fio 2.18 or newer, and polling mode requires Linux kernel 4.8 or newer. If you are on a kernel older than 4.8, you need to fall back to a different ioengine, say libaio with direct=1 or sync, as your application requires; this will not provide the same performance as a polling-mode driver. Amazing performance will still be achievable, however.
To enable polling mode:
# echo 1 > /sys/block/nvme0n1/queue/io_poll
Example of an fio job parameters file:
[global]
name=OptaneInitialPerfTest
ioengine=pvsync2
hipri
direct=1
buffered=0
size=100%
randrepeat=0
time_based
ramp_time=0
norandommap
refill_buffers
log_avg_msec=1000
log_max_value=1
group_reporting
percentile_list=1.0:25.0:50.0:75.0:90.0:99.0:99.9:99.99:99.999:99.9999:99.99999:99.999999:100.0
filename=/dev/nvme0n1

[rd_rnd_qd_1_4k_1w]
stonewall
bs=4k
iodepth=1
numjobs=1
rw=randread
runtime=300
write_bw_log=bw_rd_rnd_qd_1_4k_1w
write_iops_log=iops_rd_rnd_qd_1_4k_1w
write_lat_log=lat_rd_rnd_qd_1_4k_1w
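The write_bw_log option in the job file makes fio emit per-interval bandwidth samples as CSV lines of the form "time_ms, value, direction, block size". As a quick post-processing sketch, you can average them with awk; the log file name and the three sample lines below are fabricated for illustration (fio appends a suffix such as _bw.log to the name you give):

```shell
# Average the bandwidth samples from a fio bandwidth log.
# Log layout: "time_ms, value, direction, blocksize".
# The file name and these three sample lines are fabricated.
cat > bw_rd_rnd_qd_1_4k_1w_bw.log <<'EOF'
1000, 468000, 0, 4096
2000, 470000, 0, 4096
3000, 466000, 0, 4096
EOF
awk -F', ' '{ sum += $2; n++ } END { printf "avg %.0f KiB/s over %d samples\n", sum/n, n }' \
    bw_rd_rnd_qd_1_4k_1w_bw.log
```

With log_avg_msec=1000 set, as above, each line is already a one-second average, so this gives the mean bandwidth over the run.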
Results from a system with Intel Xeon Gold 6154 CPUs and a Linux 4.13 kernel
Summary output from fio:
fio-3.0
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=480MiB/s,w=0KiB/s][r=123k,w=0 IOPS][eta 00m:00s]
rd_rnd_qd_1_4k_1w: (groupid=0, jobs=1): err= 0: pid=12340: Thu Oct 26 16:33:09 2017
  read: IOPS=117k, BW=457MiB/s (480MB/s)(134GiB/300001msec)
    clat (usec): min=7, max=239, avg= 8.25, stdev= 1.37
     lat (usec): min=7, max=239, avg= 8.27, stdev= 1.37
    clat percentiles (usec):
     |  1.000000th=[    8], 25.000000th=[    8], 50.000000th=[    8],
     | 75.000000th=[    9], 90.000000th=[    9], 99.000000th=[   11],
     | 99.900000th=[   34], 99.990000th=[   39], 99.999000th=[   57],
     | 99.999900th=[   71], 99.999990th=[  101], 99.999999th=[  241],
     | 100.000000th=[  241]
   bw (  KiB/s): min=383671, max=501112, per=99.98%, avg=468391.67, stdev=20874.43, samples=299
   iops        : min=95916, max=125278, avg=117097.94, stdev=5218.58, samples=299
  lat (usec)   : 10=98.58%, 20=1.22%, 50=0.20%, 100=0.01%, 250=0.01%
  cpu          : usr=4.20%, sys=95.78%, ctx=9383, majf=0, minf=29
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=35135273,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
   READ: bw=457MiB/s (480MB/s), 457MiB/s-457MiB/s (480MB/s-480MB/s), io=134GiB (144GB), run=300001-300001msec

Disk stats (read/write):
  nvme4n1: ios=35120877/0, merge=0/0, ticks=246137/0, in_queue=245101, util=81.75%
We hope these simple steps give you a great first experience with the Optane P4800X SSD on Linux, right out of the box. Now comes the fun part: it’s time to achieve amazing innovations and a new level of memory flexibility for your business goals, as getting more out of each server just got a lot easier.
Feel free to reach out to Intel at any time on our support site and we’ll be happy to give you more help on achieving amazing performance with Optane.