What Kind of Storage Buyer Are You? (Part 2)

A Brief History of Software-Defined Storage

Software-Defined Storage (SDS) has become the new “It Girl” of IT, as storage technology increasingly takes center stage in the modern datacenter. That’s not difficult to understand, as SDS brings tremendous advantages in terms of flexibility, performance, reliability and cost-savings.

What might not be as easy for the new storage buyer to understand is “What IS SDS exactly?” Typically the answer is some reference to a particular software or appliance vendor, as though the term SDS is synonymous with a specific product or device. That’s savvy marketing, as companies would very much like you to think of their brand as the “Kleenex” or “Band-Aid” of the SDS world. What often gets missed in the process is any genuine explanation or understanding of SDS itself.

So, let’s correct that. I thought it would be useful to jump in a time machine back to the days of the first personal computers. Storage in those days was certainly not “Software-Defined”. It was typically either a cassette tape recorder (with cassettes), or (if you were one of the cool kids) a floppy drive of some kind with the associated disks. Storage was “defined” by the hardware and physical media.

While the invention of the hard drive actually predates floppy disks by more than a decade, the first commercially viable consumer drives did not become popular until the adoption of the SCSI standard in the mid-1980s. (I purchased a SCSI drive for my own personal computer, a whopping 20MB, around that time for $795.00 … my how times have changed!)

That’s where it started to get interesting. Someone realized along the way that, when you have “huge” amounts of storage, you can divide that up into separate partitions. Operating systems gained the ability to create these partitions. So, my 20MB hard drive became three “drives”: OS, PROGRAMS & DATA. It’s here where we see the first glimmerings of what would become Software-Defined Storage. All of a sudden, the C:, D: and E: drives did not literally have to refer to separate physical drives or media. Those 3 “drives” could be “defined” by the OS as residing on one, two or three physical devices.

So, we could at that point divide (or partition) a single media device into multiple drives. The next step was to make it possible to take multiple devices and make them appear as one resource. This was driven by the observation that hard drive capacity was increasing, but performance was not. The idea of using a “Redundant Array of Inexpensive Disks” (RAID) solved that performance problem (for the time being) by striping data across several drives, but it was quickly realized that this came at the cost of lower reliability, since the failure of any one drive in a striped set takes the whole array with it. Mirroring (RAID-1) and parity (RAID-5) approaches solved that issue, and now RAID is a ubiquitous part of almost all current data center storage designs.
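
For readers who like to see the idea in code, here is a minimal Python sketch of how a parity block lets an array survive the loss of one drive. It is an illustration only, not how any particular RAID controller is implemented: the function names (xor_blocks, make_stripe, rebuild) are made up for this example, each "drive" is just a bytes object, and the parity is always kept on the last drive, whereas real RAID-5 rotates it across the drives.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (kept last for simplicity)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three hypothetical data drives, each holding one 4-byte block.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = make_stripe(data)

# Pretend drive 1 has failed, and rebuild its contents from the others.
recovered = rebuild(stripe, lost_index=1)
```

Because XOR is its own inverse, the same rebuild function works whether the lost block held data or parity. The price is one drive's worth of capacity per stripe, compared with half of the total capacity for a two-drive mirror.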

For our purposes however, the important bit is how that changed the way storage was defined. With RAID, one could now take 2 or more drives and make them appear to the OS as one large drive, or some number of smaller drives. Storage was (and is) software-defined - at the level of the individual server.

While that might be technically correct, we still have a way to go before we get to what is currently considered SDS. It gets interesting when we take the general concept of RAID (using multiple resources as a single entity) and apply that to servers. This creates various kinds of “clusters” designed to improve performance, reliability or both. This is typical of something like Microsoft’s “Distributed File System”.

One problem encountered at this level is that shared storage resources cannot always truly act like a physical drive. It’s often the case that you cannot use these shared file stores with certain applications, as they require a full implementation of a command protocol like SCSI or SATA. That’s where technology like iSCSI comes into play. It allows a complete storage command set (SCSI, as you might guess) to be carried over a network link. Now it becomes possible to have truly virtualized drives, not simply shared file storage.

And that’s the level at which we get something that can truly be called “Software-Defined Storage”. All of these various technologies form a set of building blocks which allow a flexible pool of storage, spanning several servers. That storage can be divided up (defined) as needed, expanded or contracted to meet business needs, and it works just like a local drive on the client systems which access that storage. That is the essence of “Software-Defined Storage”.
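
As a rough illustration of that pooling idea, the Python sketch below models a pool whose capacity comes from several servers and whose volumes are defined purely in software. Everything in it is hypothetical: the StoragePool class, its method names and the capacities are invented for this example and do not correspond to any vendor's product or API, and it tracks capacity only, not where the data actually lives.

```python
class StoragePool:
    """Toy model of a storage pool that spans several servers."""

    def __init__(self):
        self.capacity_gb = 0   # total raw capacity contributed by all servers
        self.volumes = {}      # volume name -> size in GB

    def add_server(self, capacity_gb):
        """Grow the pool by the capacity of one more server."""
        self.capacity_gb += capacity_gb

    def free_gb(self):
        """Capacity not yet assigned to any volume."""
        return self.capacity_gb - sum(self.volumes.values())

    def create_volume(self, name, size_gb):
        """Define a new volume, provided the pool has room for it."""
        if size_gb > self.free_gb():
            raise ValueError("not enough free capacity in the pool")
        self.volumes[name] = size_gb

    def resize_volume(self, name, new_size_gb):
        """Expand or contract an existing volume."""
        growth = new_size_gb - self.volumes[name]
        if growth > self.free_gb():
            raise ValueError("not enough free capacity in the pool")
        self.volumes[name] = new_size_gb


pool = StoragePool()
pool.add_server(500)
pool.add_server(500)

# A 700 GB volume is larger than either server could hold on its own.
pool.create_volume("databases", 700)
pool.resize_volume("databases", 800)

# Adding a third server expands the pool without touching the volume.
pool.add_server(500)
```

The point of the sketch is that a volume's size is no longer tied to any one physical device: it is simply a number the software agrees to honor, as long as the pool as a whole can back it.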

Of course that’s still a fairly primitive and basic implementation. Modern SDS configurations offer so much more. That will be the subject of the next post in this series.