The One Million IOPS game

A few months back I saw a press release on Reuters from Fusion-io and HP claiming to hit 1 million IOPS with a combination of five 320GB ioDrive Duos and six 160GB ioDrives in an HP ProLiant DL785 G5, a 4-socket server with four cores per socket, for a total of 16 cores. I said wow, that is amazing; a million IOPS is something any DBA running a high-performance database would love to get their hands on. But when I did a quick search on the Internet for how affordable the solution would be, I was horrified to see a cost close to that of a couple of Mercedes E-Class sedans. Although the performance was stellar, the cost and the 2KB chunk size made me ask: which application does 2KB reads/writes anyway? The default Windows allocation unit is 4KB.

As time went by I got busy with other work, until our NAND Storage Group told us they were coming up with a PCIe-based product concept to show a real 1 million IOPS at the 4KB block size that real-world applications actually use. This triggered the thought: what does it take to achieve 1 million IOPS using generally available off-the-shelf components? I hit my lab desk to figure it out.

Basically, getting a million IOPS depends on three things:

1. Blazing fast storage drives.
2. Server hardware with enough PCIe slots and good processors.
3. Host Bus Adapters (HBAs) capable of handling a significant number of IOPS.


Intel Solid State Drives were my choice; a lot has been discussed and written about the performance of Intel SSDs, so it was an easy choice to make. I selected Intel X25-M 160GB MLC drives built on the 34nm process. These drives are rated at 35K random 4KB read IOPS and seemed like a perfect fit for my testing.
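
Before going further, a quick back-of-the-envelope check (my own arithmetic, not part of the original plan) shows roughly how many of these drives the rated spec implies:

    import math

    target_iops = 1_000_000
    per_drive_iops = 35_000    # Intel X25-M rated random 4KB read IOPS

    # Drives needed if every drive delivers exactly its rated spec
    print(math.ceil(target_iops / per_drive_iops))    # 29

As the results below show, 28 drives ended up being enough, which suggests the drives ran slightly ahead of their rating on this workload.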

Then I started searching for the right dual-socket server. The Intel® Server System SR2625URLX, with five PCIe 2.0 x8 slots, provided enough slots to connect the HBAs. The server was configured with two Intel Xeon W5580 processors running at 3.2GHz and 12GB of memory.
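
A quick bandwidth sanity check (again my own math, assuming the commonly quoted ~500 MB/s per PCIe 2.0 lane per direction after 8b/10b encoding) shows the slots are nowhere near the bottleneck at this block size:

    # Does 1M IOPS of 4KB reads fit in the PCIe budget?
    io_size = 4096                       # bytes per I/O
    iops = 1_000_000
    total_gbs = iops * io_size / 1e9     # ~4.1 GB/s for the whole setup

    lane_gbs = 0.5                       # ~500 MB/s per PCIe 2.0 lane
    slot_gbs = 8 * lane_gbs              # ~4.0 GB/s per x8 slot

    print(total_gbs, slot_gbs)           # 4.096 vs 4.0

Spread across five slots, each HBA only has to move about 0.8 GB/s, so the real challenge here is small-block IOPS, not raw bandwidth.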

The search for the HBA ended when LSI showed their 9210-8i series (code-named Falcon), which is rated at 300K IOPS. These are entry-level HBAs that can hook up as many as eight drives to their eight internal ports.
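
The same kind of back-of-the-envelope arithmetic says four of these HBAs would be the bare minimum at their rated spec, and five leave comfortable headroom:

    import math

    # Minimum HBAs at the rated 300K IOPS each
    print(math.ceil(1_000_000 / 300_000))    # 4 at rated spec

    # With five HBAs sharing the load, each handles about 200K IOPS
    print(1_000_000 / 5)                     # well below the 300K rating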

Finally I had to house the SSDs somewhere in a nice-looking container, and a container was necessary to provide power connectivity to the drives. I zeroed in on the Supermicro 2U SuperChassis 216 SAS/SATA HD BAY. It came with dual power supplies and no board inside, but it gave me the option to simply plug the drives into the backplane and not worry about powering them. The other interesting thing about this chassis is that it comes with six individual connectors on the backplane, so each connector handles only four drives. This is very different from active backplanes, which route the signal across all the drives connected to them; it allowed me to connect just four drives per port on the HBAs. I also had to get a 4-slot disk enclosure (just some unnamed brand from a local shop), so in total I had the capability to connect 28 drives.

With all the hardware in place, I went ahead and installed Windows Server 2008 Enterprise Edition and Iometer (an open-source tool for testing I/O performance). Two HBAs were fully populated, using all eight of their ports, while the other three HBAs had only four ports populated each. The drives were left without a partition on them. Iometer was configured with two manager processes and 19 worker threads, 11 on one manager and 8 on the other. 4KB random reads were selected, with sector alignment set to 4KB, and Iometer was set to fetch the last update on the result screen.
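
Iometer's own configuration isn't reproduced here, but the access pattern it generates is easy to picture. The sketch below is a minimal single-threaded Python illustration of my own (not the actual test harness); "testfile.bin" is a placeholder target, whereas the real test ran Iometer workers against the raw, unpartitioned drives:

    import os, random, time

    PATH = "testfile.bin"          # placeholder target file
    IO_SIZE = 4096                 # 4KB reads, matching the Iometer setting
    DURATION = 10                  # seconds to run

    blocks = os.path.getsize(PATH) // IO_SIZE
    fd = os.open(PATH, os.O_RDONLY | getattr(os, "O_BINARY", 0))

    ios = 0
    deadline = time.time() + DURATION
    while time.time() < deadline:
        # Random offset aligned to the 4KB sector-alignment setting
        os.lseek(fd, random.randrange(blocks) * IO_SIZE, os.SEEK_SET)
        os.read(fd, IO_SIZE)
        ios += 1

    os.close(fd)
    print(f"{ios / DURATION:.0f} IOPS (cached, single-threaded; not comparable)")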




Once the test started with 24 drives, I found I was a few thousand IOPS short of 1M, so I had to find the 4-bay enclosure to connect another four SSDs, taking the total number of SSDs to 28. The server then delivered a sustained million IOPS, with an average latency of 0.88 ms at 80-85% CPU utilization. Please see the pictures below for a more pictorial representation of the setup.
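
One number worth deriving from those results (my own application of Little's Law, not something Iometer reported directly): sustaining 1M IOPS at 0.88 ms average latency means roughly 880 I/Os had to be in flight at any instant, about 31 per drive across the 28 drives.

    # Little's Law: I/Os in flight = throughput x latency
    iops = 1_000_000
    latency_s = 0.00088                 # the observed 0.88 ms average

    in_flight = iops * latency_s        # ~880 outstanding I/Os overall
    print(in_flight, in_flight / 28)    # ~880 total, ~31 per drive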


Recently we demonstrated this setup at Intel Developer Forum 2009 in San Francisco. It grabbed the attention of many visitors because it is something an IT organization can realistically achieve without a lot of initial investment; the good thing about this setup is the availability of its parts and equipment on the open market. At Intel we wanted to get the thought started that high-performance storage is possible without robbing a ton of money from your IT department's budget. Once storage admins get an idea of what is possible, the industry will take more innovative approaches to expand and try out new setups using off-the-shelf components.

Next Steps:

I will be spending some time getting this setup running with a RAID configuration, and possibly using a real-world application to drive the storage. That will need a lot of CPU resources, and I have in mind one upcoming platform from Intel that will let me do it. I will come back with follow-up experiments.

-Bhaskar Gowda.