Lower your Cloud rents on Cassandra – the case for SSD and larger instances

Most people consider SSDs to be spendy, but that's not taking into account the full solution context. What I mean is this: what if you improve one tier of the application so that you can remove another, resulting in a simpler architecture? The Cloud is famous for providing a more agile foundation for re-architecture, but there is sometimes a lack of examples to use as a reference. In this case, the re-architecture involved improving Java memory usage and improving I/O throughput and latency, which results in a smaller, more efficient Cassandra cluster.

As companies such as Netflix mature in their Cloud expertise, they become a model for others, hence this blog. Alongside Netflix’s very open use of Cassandra, there is Intel Software’s Cassandra white paper, which explains more about configuration options for smaller clusters that depend on larger, more efficient virtual machines, and fewer of them. The Intel paper looks at things more broadly for those evaluating different NoSQL options and how to set up Cassandra and use it wisely.

Let’s look at the Netflix blog and its TCO benefits. The major change is to run the Cassandra cluster on more powerful instances, along with other configuration changes, so that a measurably improved NoSQL data store can move to the next level. Netflix was able to remove the MemCache software layer in the process, once Cassandra had been improved to provide better latency. SSD-based machines were part of that re-architecture. Fewer machines in a Cassandra cluster also means that manageability and dynamic resourcing become easier.

Removing software and cloud machine instances from the monthly rent meant significant savings to the bottom line at Netflix. The savings were over $300,000, or roughly 50% of the cost of the Cassandra cluster in question. On top of those dollar savings, reduced engineering staff time and a simpler architecture provide additional savings. See the table below.

Table:  Netflix Cassandra on SSD Use Case - Cloud “server rent” savings:

System Configuration              | On-Demand Hourly Cost                   | Total 3 Year Heavy Use Cost
36 x m2.xlarge + 48 x m2.4xlarge  | 36 x $0.45 + 48 x $1.80 = $102.60/hour  |
15 x hi1.4xlarge                  | 15 x $3.10 = $46.50/hour                |
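
The hourly arithmetic in the table can be checked with a short script. This is only a sketch: the rates are the on-demand prices quoted in the table, the names `HOURLY_RATE` and `cluster_hourly_cost` are made up for illustration, and it does not model three-year heavy-use pricing.

```python
# On-demand hourly rates in USD, copied from the table above.
HOURLY_RATE = {"m2.xlarge": 0.45, "m2.4xlarge": 1.80, "hi1.4xlarge": 3.10}


def cluster_hourly_cost(instances):
    """Sum count * hourly rate over a {instance_type: count} mapping."""
    return sum(count * HOURLY_RATE[kind] for kind, count in instances.items())


# Cluster before the re-architecture: 36 + 48 spinning-disk instances.
before = cluster_hourly_cost({"m2.xlarge": 36, "m2.4xlarge": 48})
# Cluster after the re-architecture: 15 SSD-based instances.
after = cluster_hourly_cost({"hi1.4xlarge": 15})
# Fraction of the hourly on-demand bill that goes away.
reduction = (before - after) / before

print(f"before: ${before:.2f}/hour")
print(f"after:  ${after:.2f}/hour")
print(f"on-demand reduction: {reduction:.1%}")
```

At these on-demand rates the script reports about $102.60/hour before and $46.50/hour after, a reduction of roughly 55% in hourly rent.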


SSDs are becoming more and more common in near-real-time application tiers that require data retrieval in the low milliseconds. Some specific NoSQL technologies that can achieve this kind of performance and leverage SSDs to improve TCO and application performance are Aerospike, MongoDB and Cassandra. Use cases such as contextual compute and personalization of the web experience are good opportunities. A few more examples are user authentication, security web services, machine data processing, and low-latency environments such as bidding, banking and trading services, where time is truly critical.

Let’s summarize what we learned here:

  • Architect for greater simplicity! The Cloud is perfect for re-architecting at a more agile pace.
  • Baseline portions of your solution so you know you are headed in the right direction.
  • Explore references from experts to compare yourself to.
  • Architect for the user experience and measure the application end-to-end.
  • When moving or copying data in your architecture, be deliberate about it.
  • Quantify your total costs.
  • Achieving efficient scale-out means optimizing at the machine instance level first.

About Frank Ober

Frank Ober is a Data Center Solutions Architect in the Non-Volatile Memory Group of Intel. He joined three years ago to delve into use cases for the emerging memory hierarchy after a 25-year Enterprise Applications IT career spanning SAP, Oracle, Cloud Manageability and other domains. He regularly tests and benchmarks Intel SSDs against application and database workloads, and is responsible for many technology proof point partnerships with Intel software vendors.