The term "storage hypervisor" is being used more and more frequently in the storage world, and a number of storage vendors are now using it to brand their products. But while vendors are capitalizing on this technology, there isn't one clear way to define it. According to storage expert Mark Peters, all storage hypervisors do the same thing: pool storage. However, a closer look at specific products deemed "storage hypervisors" shows that they achieve this through different methods. In this podcast, Peters provides his definition of the storage hypervisor, its benefits and drawbacks, and how he believes it fits into today's storage market. Listen to the podcast or read the transcript below.
More and more vendors are branding their products as 'storage hypervisors.' Are they all talking about the same thing?
Mark Peters: I think the answer is both 'Yes' and 'No.' … Let me explain what I mean. You'll be familiar, no doubt, with the way that we do Latin taxonomy for many things. So you have Homo sapiens, where you have the genus and the species. And I think from a storage hypervisor perspective … they're all in the same genus, but then the species varies somewhat.
So at the genus level, for example, where they're the same, [storage vendors are] trying to do … for storage what server virtualization did for servers. Now, there's one distinct subtlety here, which is [that] many of the things they're aiming at … are similar [to] server hypervisors. If you think about it, the wonder of that was to turn one thing into many things -- one big server into multiple virtual machines. With storage hypervisors, you're essentially doing the [opposite]. You're taking many disparate storage parts and combining [them] into one pool -- similar intent in terms of efficiency and so on, but [a] different way of doing it.
So that is at the genus level where the answer is 'Yes,' they are talking about the same thing. Down at more of the species level, yes, they're all abstracting storage management, but some do it very generally; others are very tied to the virtual server environment; some even talk the language of virtualization (like VMDKs instead of LUNs); some grew from file systems; some grew from original 'in the box' storage virtualization and just got bigger. And I think, as another distinct difference, we're going to see more people trying again to virtualize the storage that is in servers without actually adding anything else as well. So that's another way of doing it. At the genus level, lots of similarities; at the species level, considerable differences.
How do you define 'storage hypervisor'?
Peters: It's funny because I'm not sure there is an absolute definition that everyone agrees on. … Here's something that was from Wikipedia, and it's a mish-mash of a few different people's opinions. I'm actually going to read it because I think it's very good at the genus level -- this is not down to the species and individual components. 'The storage hypervisor, a centrally managed supervisor software program, provides a comprehensive set of storage control and monitoring functions that operate at a transparent virtual layer across consolidated storage hardware pools to improve their availability, speed and utilization.' … I think at the genus level that's very true. What's key is all of these things are virtualizing and abstracting storage controller management out of the storage system itself and putting it in software. But then there are other things that are perhaps more implicit than what I said but really need to be drawn out.
The better systems, the ones that I think people are going to coalesce [around] and use, have two other very important aspects. One is that they're heterogeneous -- in other words, you can use multiple types of storage at once -- and secondly, they're agnostic -- they don't really care what that storage is as long as it complies [with] basic standards. So what you're talking about with that heterogeneity, that agnosticism or abstraction of control, is [something that], whether it's internal or external, [has] a cloud-like ability to deal with uncertainty and [is] very flexible. And if you think about it, IT has moved from mainframes to [distributed] models, and now we've got cloud and virtualization. Storage went from DAS, when it was mainframes, to now in the [distributed] world [where it's] very networked -- whether it's SAN or NAS or whatever -- and really what the whole storage hypervisor movement equates to is that more virtualized cloud aspect of virtualizing storage.
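The pooling idea Peters describes, many disparate devices presented as one heterogeneous, device-agnostic pool, can be illustrated with a small sketch. This is only a toy model under stated assumptions, not how any vendor's product works: the `Device` and `StoragePool` classes, their fields and the simple fill-in-order placement policy are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Device:
    """One physical storage resource (hypothetical model)."""
    name: str
    kind: str            # e.g. "DAS", "SAN", "NAS", "cloud"; the pool never inspects it
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


@dataclass
class StoragePool:
    """Many devices presented as a single pool of capacity."""
    devices: List[Device] = field(default_factory=list)

    def add(self, device: Device) -> None:
        # Agnostic: any device that reports capacity is accepted.
        self.devices.append(device)

    @property
    def capacity_gb(self) -> int:
        return sum(d.capacity_gb for d in self.devices)

    @property
    def free_gb(self) -> int:
        return sum(d.free_gb for d in self.devices)

    def allocate(self, size_gb: int) -> Dict[str, int]:
        """Carve out a virtual volume, spanning devices when needed.

        Returns a mapping of device name to gigabytes placed on it.
        The fill-in-order policy is an assumption made for brevity.
        """
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        placement: Dict[str, int] = {}
        remaining = size_gb
        for device in self.devices:
            if remaining == 0:
                break
            take = min(device.free_gb, remaining)
            if take:
                device.used_gb += take
                placement[device.name] = take
                remaining -= take
        return placement


pool = StoragePool()
pool.add(Device("server-disk", "DAS", 100))
pool.add(Device("array-1", "SAN", 500))
pool.add(Device("bucket", "cloud", 1000))

# The caller asks the pool for capacity and never names a device.
volume = pool.allocate(250)
```

The caller only ever talks to the pool, which is the inverse of a server hypervisor splitting one machine into many: here many devices are combined behind one management point.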
Recently, there's been the same sort of disparity with the definition of software-defined storage. Do you think there's a difference between that and storage hypervisor?
Peters: I think there can be a difference. … To try and make it very clear, by definition, when you say 'software-defined,' it is a software product. And so to that extent, storage becomes like an app. When you say 'storage hypervisor,' it could be just software, as I've been describing. But there are also versions of storage hypervisors that can be reliant on a particular piece of hardware or a particular style of hardware. So they're not wrong; they're just different. So I think, 'software-defined,' by definition: software-only. Storage hypervisors can be software-only but can also involve some specific hardware. In each case, everyone's aiming at the same thing: the flexibility, the management that I've talked about. But I think this is just another prime example in our industry of where semantics can get confusing. So we need to be very careful whenever we use words [such as] 'software-defined storage,' 'storage hypervisors,' much as we would be when we talk about 'cloud' or 'flash storage' or even 'thin provisioning.' Each of those phrases covers a multitude of different meanings. So we need to be very specific. 'Hypervisor,' as applied to storage, is a very trendy term because 'hypervisors' as applied to servers was very trendy. So [for] whatever reason -- maybe it was too soon -- there wasn't a broad adoption of storage hypervisors over the last few years, and so I think many of the vendors in that area and many of us commentators are starting to use the 'software-defined storage' [term] because it's the same sausage, new sizzle.
What are the main benefits and drawbacks of storage hypervisors?
Peters: The benefits are many of the things we talked about. It's flexibility, agility, quality of service, one pane of glass, manageability, all those things … whether you are using internal or external facilities. I talked about the heterogeneity and agnosticism, which means you can have a meritocracy for the actual, physical storage you buy. But the net of all that really just comes down to a couple of things, and it's all underpinned by getting cost manageable, getting cost down. At the end of the day, that's why people do things in storage -- they may talk about the symptoms of capacity and management, but at the end of the day the cause, the real motivation, is all about cost. So you get the cost down by using storage hypervisors because you get better utilization, more flexibility, only one management tool, sometimes you speak the same language that you were using in servers -- all those sorts of things.
On the flipside, academically and conceptually, there's not much to really criticize. The challenge for those people, whether they call it 'software-defined' or 'storage hypervisors,' is that the whole concept is new, and our industry -- IT or storage, whichever level you want to go to -- tends to be fairly conservative. That's on the user side. On the vendor side, not all the big storage companies have fully embraced this concept. Some have, some haven't, and so that leads to some skepticism from some people, or a fear that somehow, if they go this way, they'll lose or miss out on something. I think most of that is unfounded because we'll see an increase in [the degree to] which the storage hypervisors have all the same features and functions that you're used to in your storage systems. But nonetheless, that lack of a standard approach and that sort of natural conservatism in the industry is probably the biggest problem. But we're going to have to change something. I told you [about the move from] the mainframe [to] distributed to today's virtual, cloudy world. Clearly, something parallel [needs] to happen in storage, and I think that's where [the] storage hypervisor fits, and so we'll see more of this as we move forward.