VMware’s Software Defined Storage:
Cormac starts with a decent disclaimer: everything we talk about here is projects and vision VMware is working on and is under no circumstances a final-stage or GA-ready product. In fact, there are no dates given, just the vision and where in that vision we are today. Although some nay-sayers will have their own rant about sharing the what without the when, I do like the openness with which Cormac presents it.
“VMware’s Vision: Provide storage services & SLA automation for all applications across all types of storage”. Of course this is a Product Marketing line, but read with human eyes it says: in the end it’s all about the application and making sure the data meets a certain level of service based on automated profiling, no matter what the underlying hardware is.
This does not mean the underlying hardware (and hence the hardware vendor) is becoming obsolete. Quite the contrary: VMware (or other platforms) will create their “industry standards” with which the storage vendors can comply. Think about what VAAI & VASA for VMware or ODX (Offloaded Data Transfer) for Microsoft have already done. It’s nothing more than offloading storage requests to the underlying hardware, but nevertheless requested and automated by the platform on top.
The 3 (current) VMware Storage Projects:
vCloud Distributed Storage
Subtitle: Enabling per-VM data services on vSphere. Another name that circulates from time to time is vSAN. This will use the local storage of vSphere hosts, SSDs + HDDs, and create a “Reliable Array of Independent Nodes”, codename RAIN. Simply put, a datastore will be striped over multiple hosts and will also have replicas. The distributed array will provide its own reliability and availability services based on per-VM settings. Today the number of nodes would be up to 32.
My 2 cents: Although it might be expected that Virsto will still be sold temporarily as it exists today, it will surprise no one that a lot of its technology will become part of vCloud Distributed Storage (I liked the vSAN name better though). Another question I have is whether this will all be VSA based or become part of the ESXi kernel. Although I would love it to be the latter, I do wonder whether this would make the kernel too big again. Some people in the industry think this might kill SMB storage as we know it today. I don’t have an opinion on this yet. What I do know is that scaling to 32 nodes is really nice already. There are a lot of other vendors out there that struggle with big scale-out namespaces, let alone scaling beyond that number in block-based independent nodes. Note: this is not to be confused with vSphere VSA, which today has a maximum of 3 nodes and is, in my opinion, a toddler product in SMB storage.
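To make the "striped over multiple hosts, with replicas" idea a bit more tangible, here is a minimal Python sketch. It is purely illustrative: nothing was shared about how VMware actually places data, so the `Policy` fields, the `place` function and the round-robin layout are all my own assumptions. The only number taken from the session is the 32-node limit.

```python
from dataclasses import dataclass

MAX_NODES = 32  # node limit mentioned in the session


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-VM settings; the names are assumptions, not VMware's."""
    stripe_width: int  # number of stripes the VM's data is split into
    replicas: int      # number of copies kept of every stripe


def place(hosts, policy):
    """Return {stripe_index: [host, ...]} with each copy on a different host.

    Simple round-robin placement, chosen only for illustration.
    """
    if not 0 < len(hosts) <= MAX_NODES:
        raise ValueError("cluster must have between 1 and 32 hosts")
    if policy.replicas > len(hosts):
        raise ValueError("not enough hosts to keep copies on distinct hosts")
    layout = {}
    for stripe in range(policy.stripe_width):
        start = stripe * policy.replicas
        layout[stripe] = [
            hosts[(start + copy) % len(hosts)] for copy in range(policy.replicas)
        ]
    return layout


hosts = ["esx01", "esx02", "esx03", "esx04"]
print(place(hosts, Policy(stripe_width=2, replicas=2)))
```

With four hosts, two stripes and two replicas, every stripe ends up on two different hosts, so losing a single host never takes away the only copy of a stripe.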
vFlash
Subtitle: Enabling new tiers of storage on vSphere. VMware hasn’t done a lot for Flash up to today. We had “Swap to SSD” in vSphere 5.0 and SMART (Self Monitoring Analysis and Reporting Technology) in vSphere 5.1, but nothing that really changes the way you write IO because the target is flash. vFlash will do just that. It will integrate SSDs, or any other local flash type for that matter, into the vSphere storage stack. It will provide write-through and read cache software that is transparent to the VM. There would even be an opening for 3rd party flash services.
My 2 cents: I like the fact that there will be an opening for 3rd party integration. I don’t exactly get yet where this is going from VMware’s point of view, but another example in the industry that I liked a lot was Fusion-io’s ioMemory SDK, which enables applications to talk directly to flash storage. What I do question here is how much this would “hurt” EMC XtremSW Cache and others that have done exactly the same thing but the other way around (using flash as a band-aid for “too slow” legacy storage, pun intended 😀 ). The same question applies here: whether or not this will be embedded in the ESXi kernel. I guess it makes more sense if it is.
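For readers who are less familiar with the term, this is roughly what a write-through read cache does. The Python sketch below is a generic textbook version under my own assumptions (a dict standing in for the datastore, an LRU-ordered dict standing in for local flash, a hypothetical `WriteThroughCache` class); it says nothing about how vFlash will actually be implemented.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Generic write-through read cache; all names here are hypothetical."""

    def __init__(self, backing, capacity):
        self.backing = backing      # stands in for the datastore (a dict here)
        self.capacity = capacity    # number of blocks that fit on "flash"
        self.flash = OrderedDict()  # stands in for local flash, kept in LRU order
        self.hits = 0
        self.misses = 0

    def _remember(self, block, data):
        # Keep the block on flash and evict the least recently used ones.
        self.flash[block] = data
        self.flash.move_to_end(block)
        while len(self.flash) > self.capacity:
            self.flash.popitem(last=False)

    def write(self, block, data):
        # Write-through: the datastore copy is always updated on every write.
        self.backing[block] = data
        self._remember(block, data)

    def read(self, block):
        if block in self.flash:
            self.hits += 1
            self.flash.move_to_end(block)
            return self.flash[block]
        # Cache miss: fetch from the datastore and keep a copy on flash.
        self.misses += 1
        data = self.backing[block]
        self._remember(block, data)
        return data
```

The point of write-through is that flash never holds the only copy of a block: losing the SSD costs you performance, not data. The VM itself only ever calls read and write, which is what "transparent" means here.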
Virtual Volumes – vVOLs
Subtitle: Enabling per-VM data services on the array. Basically vVOLs shift external storage granularity from a volume-based to a VM-based methodology. Today a storage array has disks that are combined in a storage pool (mostly some type of RAID) and then carved up into LUNs that are presented to the hypervisor. In that LUN we put a lot of VMs with their VMDKs. Sounds familiar, right? One of the features vVOLs will introduce is the Protocol Endpoint (PE). The PE will be similar to that storage pool, in which the array will put vVOLs instead of LUNs. This way we can apply reliability and availability services like replication and snapshotting to the vVOL instead of to the entire LUN/datastore. The important thing to remember is that the PE itself will not service the IO; it will only point the hypervisor to where the IO for that particular vVOL needs to be directed. This will enable some scalability we don’t have today.
LINK: Virtual Volumes Technical Preview – Cormac Hogan, Dec 18, 2012
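The Protocol Endpoint idea as described above can be sketched in a few lines of Python. This models only my reading of the session (the PE resolves where a vVOL lives, and data services apply per vVOL); the `Array` and `ProtocolEndpoint` classes, their methods and the location strings are hypothetical and do not reflect any real VMware or array vendor API.

```python
from dataclasses import dataclass, field


@dataclass
class VVol:
    vm: str
    location: str  # where the array keeps this vVOL (made-up address format)
    snapshots: list = field(default_factory=list)


class Array:
    """Hypothetical array that stores vVOLs instead of LUNs."""

    def __init__(self):
        self.vvols = {}

    def create_vvol(self, vvol_id, vm, location):
        self.vvols[vvol_id] = VVol(vm=vm, location=location)

    def snapshot(self, vvol_id, name):
        # Per-VM data service: only this one vVOL is affected.
        self.vvols[vvol_id].snapshots.append(name)


class ProtocolEndpoint:
    """Only tells the hypervisor where IO should go; no data passes through."""

    def __init__(self, array):
        self._array = array

    def resolve(self, vvol_id):
        return self._array.vvols[vvol_id].location


array = Array()
array.create_vvol("vvol-1", vm="web01", location="pool-a/0x10")
array.create_vvol("vvol-2", vm="db01", location="pool-b/0x20")
array.snapshot("vvol-2", "before-upgrade")
pe = ProtocolEndpoint(array)
print(pe.resolve("vvol-2"))
```

Compare this with a LUN-based datastore, where a snapshot or replication setting on the array hits every VM that happens to live on that LUN. Here the snapshot of db01 leaves web01 untouched.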
My 2 cents: this is the most exciting part for me, as it is about how the hypervisor platform engages with the underlying storage platforms from other vendors. This is where the SDS nay-sayers are proven wrong! It’s not VMware that is selling SDS, it’s VMware that is building the framework to offload any VM data operation to the storage. Guess what you’ll need for this … A STORAGE VENDOR! Early adopters will be on the winning side here. I am thinking, for example, of Tintri, which already works at per-VM granularity but from within the array itself. They will be one of the first to be able to adopt VMware’s vision and technology. Anyone dismissing this will be “slacking”.
Everyone has SDS?
Important disclaimer: this is what I think today, and I might change my mind on the topic as the industry and the buzzword evolve (covered my *ss nicely, right?).
First, something about buzzword bingo. I get marketing, and I get that we need new stories to get new stuff sold. But just blurting out buzzwords for the sake of the buzzword doesn’t make sense. Making up your own definition of a new buzzword is not a good idea either, as this only adds to the confusion for the actual customer, about whom I sincerely hope we all care in the end. Think about how long it took to get even a little bit on the same track with the whole “cloud”. And it still is not quite understandable for everyone. Another famous example in my daily work is Big Data; having/handling large quantities of data does not mean you are talking about “Big Data”. It simply means you have a large quantity of data.
So, on topic: a new buzzword does not mean you can rewrap yesterday’s, or even today’s, technology in tomorrow’s definition. Software Defined Storage as explained above simply does not exist today. Not at VMware, not anywhere else. The fact that you have a storage product that is sold as software and can be installed how you like does not mean you have Software Defined Storage. It simply means you have software that you can install how you like and that serves the purpose of storage. Otherwise Windows Server 2003 with an SMB1.0 share is also SDS, and that is even a very recent example 😀 . I know that some analyst firms (IDC for example) are helping others to call any storage that runs on x86 platforms Software Defined, but I have to disagree with that. x86 celebrates its 35th birthday this year … The difference for me is that Software Defined as it is presented by VMware today is technology and innovation driven, and the other definitions are sales and marketing driven. A new technology does not have to be dictated by a single party (VMware), but at least let’s focus on the right drivers for it.
I know I might step on the toes of a few vendors that are preparing or launching their marketing machines for SDS, but hey, everyone is still entitled to an opinion, right? And please give me feedback in the comments. I love to be proven wrong on a daily basis! It means I am learning and growing.