Your storage in the Cloud

Public Cloud, Private Cloud, Hybrid Cloud, bleh bleh bleh, … get over it, it is all called CLOUD and it will remain that way for a while. What I haven’t talked about yet is CLOUD STORAGE. What is it and how can you and I use it? I’ll try to give a basic overview of what it means and how you could get started using it. As Cloud Storage will become much more important in the future, I see some blog posts coming up later this year on specific implementation possibilities.

WHY?
Why do we care about storage that is not under our own control? $$$ might be a first and very good reason. Failover is a second one and sharing is another. Let’s start with that first reason, as I guess it has been the most important one. For a long time now we have been moving away from local storage and putting all our data on Block Storage devices (some call these SANs but that’s not a proper name for them) and File Storage devices (NAS). The biggest benefits here were high availability (shared storage) and more performance (more spindles = more IO). But as the demand for capacity grows, these devices become a lot more expensive. So the price of just storing data has increased tremendously, although a lot of that data doesn’t need such expensive storage. In come the Cloud Storage Providers.

How can they make storage so cheap? Well, whether you know it or not, those storage devices you buy today are waaaaay too expensive for what they are. A lot of the cost of your array goes to margins: margins for the disk manufacturer, the distribution channel, or just ridiculous maintenance licenses. And the ones with the smallest margin in this whole story are the integrators who take the time to install it decently for you. So how do these cloud providers make it so cheap? They build it themselves, of course. And they cut all unnecessary costs. And it’s volume. Add it all up and you can make your own storage pretty cheap.

WHO?
The most commonly known cloud storage today is Consumer Class: Dropbox / SkyDrive / Google Drive / BitCasa / … which give you a fair amount of free storage. Note: if a service is free, you are the product! There are also a few fairly low-priced alternatives for just that bit more than home usage. I use ADrive, for example, which gives me more capacity than I get from those free providers for a fair price.

If you stretch the idea of these types of Cloud Storage, you could go and create your own Ghetto Cloud Storage. I wouldn’t go there myself but the idea is nice 🙂 From here on, let’s focus on what is considered to be Enterprise Cloud Storage. Who should be in this list (for now)?

One of the things we hear over here in Europe is concern about the Patriot Act. Does the Government have the right to look into private data or not? There are not a lot of non-US providers and let me assure you: if the US administration wants your data, it’s going to get it. A while ago blogger Aidan Finn wrote about it here. What we start to see now is that small local initiatives have taken that concern and turned it into a business case. Today I accidentally found a BeNeLux company (Green Storage Cloud) that does nothing other than store your data within the borders. I’ll leave it up to the reader to pick their preference.


HOW TO USE IT?
One of the easiest ways to use Cloud Enterprise Storage is a Cloud GATEWAY. You have appliances like TwinStrata or an easy-to-use Software solution like CloudBerry. Normally these appliances do not have on-site storage. They are more or less comparable to proxy-servers for IO.
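To make the "proxy-server for IO" idea a bit more concrete, here is a minimal sketch in Python. It is not how TwinStrata or CloudBerry actually work; the class names, the key format and the in-memory store standing in for the cloud provider are all assumptions made up for illustration. The only point is that the gateway keeps nothing itself: every block read or write is translated into an object request and forwarded.

```python
from typing import Dict


class InMemoryObjectStore:
    """Hypothetical stand-in for a cloud provider's object store."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class CloudGateway:
    """Presents fixed-size blocks locally and forwards every IO to the store."""

    def __init__(self, store: InMemoryObjectStore, block_size: int = 4096) -> None:
        self.store = store
        self.block_size = block_size

    def _key(self, block_number: int) -> str:
        # Assumed naming scheme: one object per block.
        return f"block-{block_number:012d}"

    def write_block(self, block_number: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.store.put(self._key(block_number), data)

    def read_block(self, block_number: int) -> bytes:
        try:
            return self.store.get(self._key(block_number))
        except KeyError:
            # A block that was never written reads back as zeros.
            return bytes(self.block_size)


store = InMemoryObjectStore()
gateway = CloudGateway(store, block_size=8)
gateway.write_block(3, b"ABCDEFGH")
print(gateway.read_block(3))
```

A real gateway would add caching, encryption and batching on top of this, but the translation step stays the same.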

Logical example of Cloud Gateways process: TwinStrata CloudArray:

Amazon also has its very own gateway as a virtual appliance for vSphere (OVA format). Read the review here from Chris Evans (@TheStorageArchitect).

A second type of Cloud Storage is StorSimple, recently acquired by Microsoft. This appliance is the middle way between local storage (SAN/NAS/DAS) and the cloud. They use the principle of tiering and just use the cloud as the last tier. Let me show you how this happens in a picture:
Red: Flash / Green: SAS / Blue: offsite
So … this appliance presents 100TB net as shared storage on site. But in the appliance we only have 10TB net of local storage. Those 10TB are deduped storage on 2TB of SAS/NL SAS disks and 0.4TB of flash disks. And the big blue part on the right? Yes, that is the tier 1 public cloud storage vendors. This was really a unique player in the market and the acquisition by Microsoft will definitely disrupt the market here.
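The numbers above are easier to follow when you work them out. The small Python sketch below does just that. It assumes my reading of the figures is right (10TB of deduped data sitting on 2TB + 0.4TB of physical disk) and that whatever does not fit locally lives in the cloud tier; the variable names are mine.

```python
presented_tb = 100.0     # capacity the appliance shows to the servers
local_logical_tb = 10.0  # net local storage, after dedup
sas_tb = 2.0             # physical SAS/NL SAS capacity
flash_tb = 0.4           # physical flash capacity

# Physical disk actually sitting in the appliance.
physical_local_tb = sas_tb + flash_tb

# How much dedup has to deliver to fit 10TB on that disk.
dedup_ratio = local_logical_tb / physical_local_tb

# Share of the presented capacity that can be served locally.
local_fraction = local_logical_tb / presented_tb

# Everything else ends up in the last tier: the cloud.
cloud_tb = presented_tb - local_logical_tb

print(f"physical local disk: {physical_local_tb:.1f} TB")
print(f"implied dedup ratio: {dedup_ratio:.2f}:1")
print(f"served locally: {local_fraction:.0%}, in the cloud: {cloud_tb:.0f} TB")
```

So roughly 2.4TB of spinning and flash disk fronts a 100TB volume, which is exactly why this only works when most of the data is cold.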

The 3rd way ‘out’ is when your application can write directly to Cloud storage through the provider’s HTTP APIs, such as REST. Or you could even upload content to Amazon S3 through a simple web form. If you write your own software there are also Amazon integration developer toolkits from Force.com.
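As a rough idea of what "writing directly through an HTTP API" looks like, here is a sketch in Python using only the standard library. The endpoint `https://storage.example.com`, the bucket and the object key are hypothetical, and the request is only built, not sent. Real providers such as Amazon S3 also require authentication headers or a signed request, which this sketch leaves out on purpose.

```python
import urllib.parse
import urllib.request


def build_upload_request(base_url: str, bucket: str, key: str,
                         data: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT that stores `data` under `key`."""
    # Percent-encode the key so spaces and similar characters survive in the URL.
    url = f"{base_url}/{bucket}/{urllib.parse.quote(key)}"
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


request = build_upload_request(
    "https://storage.example.com",  # hypothetical endpoint
    "my-bucket",                    # hypothetical bucket name
    "reports/2013/q1 summary.txt",
    b"hello cloud",
)
print(request.get_method(), request.full_url)

# Sending it would be: urllib.request.urlopen(request)
# Left out here because the endpoint is made up and no credentials are attached.
```

In practice you would normally use the provider's own SDK, which takes care of signing and retries for you.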

BUILD IT?
Next to building your own storage array with off-the-shelf components and open software you can also build your own Cloud Storage. VMware made an attempt at starting an Enterprise alternative to Dropbox, which was called project “Octopus”. At VMworld 2012, however, Octopus was moved into the “Horizon Suite”. The Horizon Suite is where VMware has put together some features that are tied to End-User-Computing. Simon Seagrave drills down into the suite for you: read here. I asked the Horizon team what this means for the product as such:
  1. Is Octopus, now known as Horizon Data, only available as a feature of Horizon Workspace? – A: Yes 
  2. Does Horizon Workspace only work on mobile platforms (iOS/Android)? – A: No, there is also a desktop client for Windows and Mac.
  3. Is Horizon only an object-based platform (e.g. Dropbox) or are there possibilities to share the data through file protocols (NFS/SMB)? – A: There is no intent today to use the Horizon Data platform as a file service. The opposite does exist “on the radar”: a sort of Enterprise Gateway for existing NFS/SMB stores.

    Bottom line: you really could build your own Enterprise Dropbox based on Horizon Data. I will definitely try this out myself later.

Others that play in the build-it-yourself area are, for example, EMC Syncplicity & Egnyte. These are 2 solutions mentioned by Ray Lucchesi in his latest blog post: “Enterprise File Sync“. I have no knowledge of these products today so I’ll get into them later this year.

PITFALLS:
The most classic pitfall of all time: the SLA (Service Level Agreement). I’ll give you a small excerpt from the Amazon Glacier SLA. Amazon Glacier is the CHEAPEST storage you will find today ($0.01 per GB per month). The SLA, however, is open to discussion, or at least leaves the RTO (Recovery Time Objective) wide open. This is how it sounds: “Retrieving archives from Amazon Glacier requires the initialisation of a job. Jobs typically complete in 3 to 5 hours”. Did you read the same as I did? “Typically” is not a guarantee. What if it took 10 days? … I am not saying that these services are unreliable, just that you have to know what you buy! The information (data) is YOUR responsibility even if you signed and paid for a contract with someone else to store it.

PAY AS YOU USE:
A common factor amongst cloud storage providers is the way it gets invoiced. The price per Gigabyte is very low but that is only the cost of storage. The real cost lies in handling and transferring the data. Uploading or downloading 1 file is a “request” and gets paid for. Very little, but it gets paid for. And the second thing is the transfer. The amount of data transferred per month is also paid for. If you add these two things up it’s actually pretty smart: either you’ll have a lot of small files or you’ll have a very small number of very big files. In both cases the provider gets paid. Once the data sits offsite it is very cheap. This model is IDEAL for storage you hope you’ll never need 🙂
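To see how the three parts of the invoice play against each other, here is a small cost model in Python. Only the $0.01 per GB per month storage price comes from this post (the Glacier figure); the request and transfer prices are numbers I made up for the example, so plug in your provider's real price list before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceList:
    storage_per_gb_month: float  # price for data at rest
    per_1000_requests: float     # price per 1000 upload/download requests
    transfer_per_gb: float       # price per GB transferred


def monthly_cost(prices: PriceList, stored_gb: float,
                 requests: int, transferred_gb: float) -> float:
    """Storage + request fees + transfer fees for one month."""
    storage = stored_gb * prices.storage_per_gb_month
    request_fees = requests / 1000 * prices.per_1000_requests
    transfer = transferred_gb * prices.transfer_per_gb
    return storage + request_fees + transfer


# Storage price from the post; the other two prices are assumed.
prices = PriceList(storage_per_gb_month=0.01,
                   per_1000_requests=0.05,
                   transfer_per_gb=0.10)

# 1TB that just sits there, never touched.
untouched = monthly_cost(prices, stored_gb=1000, requests=0, transferred_gb=0)

# Same 1TB, but lots of small files moving around: requests dominate.
many_small = monthly_cost(prices, stored_gb=1000, requests=2_000_000, transferred_gb=50)

# Same 1TB, a handful of huge files pulled back: transfer dominates.
few_big = monthly_cost(prices, stored_gb=1000, requests=100, transferred_gb=1000)

for label, cost in [("untouched", untouched),
                    ("many small files", many_small),
                    ("few big files", few_big)]:
    print(f"{label}: ${cost:.2f} per month")
```

With these assumed prices the untouched archive costs about a tenth of either active scenario, which is the whole point of the model.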

MY VIEW:
Keep a close eye on Cloud Storage! I know I will. There is a lot of data that I even personally have and don’t want to lose. Why not put my data with 2 different Cloud providers? If I would do it for myself, why wouldn’t a company? Seriously, think about this.


So as you might notice, this was not a deep dive. This was my knowledge and starting points as of today and I’ll get through some/a lot of it this year. Maybe my 2012 was Flash Storage and 2013 will be Cloud Storage.
