If you follow my blog closely you’ll notice that I like to follow up on storage startups. I met a lot of them at a TechFieldDay event earlier this year; others I’ve met at tradeshows or other storage events. One that I truly love but have not blogged about yet is PureStorage. If you meet these people you can’t help but fall in love with them: the organization, the ‘vibes’ and, not to forget, the product!
To be very honest, I applied to PureStorage earlier this year when I was looking for a new challenge. They were not ready at that point to hire EMEA folks. A few weeks later some ($40M) venture capital landed in their mailbox, but I was already on my way to Veeam, where I am now happier than I could have imagined before joining!
But hey, times change. Now they are here, and it’s time for me to give them the credit they deserve. Let me start with the company itself; we’ll get to the technology later in the post. We visited the PureStorage HQ in Silicon Valley for the last session of Storage Tech Field Day 1. After 5 days on the wrong side of the pond I was running on a serious overdose of caffeine just to be able to keep up until this last session. And it was worth it: we had Belgian-style beers out of the coolest Symmetrix ever (video, 1:50), we had margaritas, baked bacon and some crazy sh#t, and smart #ss developers to present (e.g. co-founder John Hayes). No BS presentations, no Gartner numbers or quadrants and certainly no marketing slides. Now, 6 months later, they have hired the 3PAR ninja team! Don’t get me wrong here. These are not HP guys trained to do 3PAR stuff; these were the real 3PAR EMEA guys from before 3PAR got acquired by HP. I have met these guys before and I really must admit that there is no better fit for them, nor for this company. Well done!
Flash well done:
We’ve seen a lot of different ways startups handle flash. The biggest difference between the startups and ‘the others’ is that the startups’ products are designed from the ground up for flash. They were not written for disks and then given flash to make them faster (e.g. VNX, EVA, Equallogic, Lefthand, …). Will those older products run faster than their predecessors? Yes. Are you using that flash to its full potential? Not at all.
So how can you handle flash in a different way? You can use cheap MLC disks, which give you a decent $/volume, but they wear out really fast when you hit them hard. You can use more expensive SLC disks, which run even faster and last longer, and run some deduplication on them. Or … you can be smart and use each for what it does best. PureStorage basically takes the writes in on an SLC disk and offloads them in batches to the MLC disks. If you look at their marketing it says 100% MLC, but that is because the SLCs will be replaced in the near future with, for example, NVRAM. What happens is that 4k blocks come in, are grouped together until you have 512k, and are then offloaded to multiple MLCs. All of this goes extremely fast and is deduped inline.
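To make that write path a bit more tangible, here is a minimal Python sketch of the idea: stage incoming 4k blocks, drop duplicates by fingerprint, and flush a full 512k batch across several MLC devices. This is not PureStorage’s code. The class name, the SHA-256 fingerprint, the four devices and the round-robin striping are all my own assumptions for illustration; only the 4k and 512k sizes come from the description above.

```python
import hashlib

BLOCK_SIZE = 4 * 1024          # incoming write size mentioned above
SEGMENT_SIZE = 512 * 1024      # batch size mentioned above
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 128 blocks per batch


class WriteBuffer:
    """Hypothetical staging buffer: dedupe inline, flush in 512k batches."""

    def __init__(self, num_mlc_devices=4):
        self.staged = []            # unique blocks waiting in the fast staging tier
        self.known_hashes = set()   # fingerprints of every block seen so far
        # Each inner list stands in for one MLC device (assumed count).
        self.mlc_devices = [[] for _ in range(num_mlc_devices)]
        self.flushes = 0

    def write(self, block):
        """Stage one 4k block; return False if it was a duplicate."""
        if len(block) != BLOCK_SIZE:
            raise ValueError("this sketch only accepts 4k blocks")
        digest = hashlib.sha256(block).digest()
        if digest in self.known_hashes:
            return False            # duplicate: nothing new to store
        self.known_hashes.add(digest)
        self.staged.append(block)
        if len(self.staged) == BLOCKS_PER_SEGMENT:
            self._flush()
        return True

    def _flush(self):
        # Spread the full batch round-robin over the MLC devices.
        for i, block in enumerate(self.staged):
            self.mlc_devices[i % len(self.mlc_devices)].append(block)
        self.staged = []
        self.flushes += 1
```

Feeding it 128 distinct blocks triggers exactly one flush, while a repeated block never reaches the MLC devices at all. A real array would also have to map the duplicate’s logical address to the existing copy, which this sketch leaves out.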
One of the things that wears out flash disks is random writes. A flash cell has to be erased before you can rewrite it with another value. PureStorage solved this by redirecting ALL writes to new blocks so that there are NO overwrites. So your system will fill up very fast then? No, because PureStorage does garbage collection in the background, and that is going to be a lot less than it would be otherwise. One advantage is that because the array already handles garbage collection, they need fewer overprovisioned free blocks and can use more active cells in the SSD. Another advantage is that you can, for example, do an un-delete after a human error. If an admin deletes a snapshot, a clone or even a LUN, you can simply undo that action, as long as it has not been cleaned up yet of course.
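Here is a small Python sketch of how redirect-on-write, background garbage collection and un-delete hang together. It is a toy model under my own assumptions (the class, the method names and the ‘trash’ dictionary are all made up for illustration), not how the array is actually implemented.

```python
class RedirectOnWriteStore:
    """Toy model: every write lands on a fresh physical address."""

    def __init__(self):
        self.physical = {}   # physical address -> data
        self.next_addr = 0   # next never-used physical address
        self.volumes = {}    # volume name -> {logical block -> physical address}
        self.trash = {}      # deleted volumes that have not been cleaned yet

    def write(self, volume, lba, data):
        addr = self.next_addr
        self.next_addr += 1
        self.physical[addr] = data
        # Repoint the logical block; the old address simply becomes garbage.
        self.volumes.setdefault(volume, {})[lba] = addr

    def read(self, volume, lba):
        return self.physical[self.volumes[volume][lba]]

    def delete(self, volume):
        # Only the pointers move; the data stays until garbage collection.
        self.trash[volume] = self.volumes.pop(volume)

    def undelete(self, volume):
        """Restore a deleted volume; fails once it has been cleaned."""
        if volume not in self.trash:
            return False
        self.volumes[volume] = self.trash.pop(volume)
        return True

    def collect_garbage(self):
        """Free every physical block no live volume points at."""
        self.trash.clear()
        live = {a for mapping in self.volumes.values() for a in mapping.values()}
        dead = [a for a in self.physical if a not in live]
        for addr in dead:
            del self.physical[addr]
        return len(dead)
```

In this model a deleted LUN can be brought back with `undelete` right up to the next `collect_garbage` call, which matches the ‘as long as it has not been cleaned’ caveat.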
Everything is based on ‘off-the-shelf’ hardware. This does not mean, of course, that it’s a whitebox everyone can build 🙂 It only means that there is no custom hardware like ASICs (e.g. 3PAR), customized SSDs (e.g. Nimbus Data) or high-density disk shelves (e.g. Amplidata). The smart part of this type of architecture is that the technology is in the software, so you can flexibly change the physical architecture. Remember my blog on the new baby 3PAR this week? The reason it took HP so long was the design of a new ASIC.
One of my pet peeves when storage vendors come up with new products is how you move to the next generation. I do not like rip-and-replace technologies (CLARiiON/EVA/…) as they bring too much risk, and thus you keep them alive way too long. One of my favorites here has always been Equallogic, where you can just add a new model to the group and ‘evacuate’ an old model out of the group when it’s time to retire it. Here too Pure has been smart and made Stateless Controllers. In short, this means that the controllers do not hold any data such as hash tables. All the actual data resides in the storage nodes. I once killed an entire EVA by clearing the controllers (I know, stupid, no?). With this type of architecture you can easily fail and replace controllers or move on to another generation. Not all startups are this smart.
To be considered an Enterprise Flash Array, just provisioning LUNs does not do the trick anymore. Thin provisioning, cloning, snapshots, replication, … these are just a few of the features. If you read back to what we said about writes always going to free blocks, you already see at least two features that are easy to enable here: snapshots and cloning. As we are not overwriting data, you can easily point a snapshot to the blocks that are still there. The only thing we need to do is keep garbage collection from cleaning them up.
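As a rough illustration of why snapshots come almost for free when nothing is overwritten, here is a minimal Python sketch. The class and method names are hypothetical and the single-volume model is a simplification of mine; the point is only that a snapshot copies pointers, not data, and that garbage collection has to treat snapshot pointers as live.

```python
class SnapshotStore:
    """Toy single-volume model of pointer-based snapshots."""

    def __init__(self):
        self.physical = []    # append-only; list index = physical address
        self.volume = {}      # logical block -> physical address
        self.snapshots = {}   # snapshot name -> frozen copy of the map

    def write(self, lba, data):
        # Never overwrite: append the data and repoint the logical block.
        self.physical.append(data)
        self.volume[lba] = len(self.physical) - 1

    def snapshot(self, name):
        # Copies the pointer map only; no data blocks are duplicated.
        self.snapshots[name] = dict(self.volume)

    def read(self, lba, snapshot=None):
        mapping = self.volume if snapshot is None else self.snapshots[snapshot]
        return self.physical[mapping[lba]]

    def live_addresses(self):
        """Addresses garbage collection must keep: volume plus all snapshots."""
        live = set(self.volume.values())
        for mapping in self.snapshots.values():
            live.update(mapping.values())
        return live
```

After a snapshot, a new write to the same logical block leaves the old data readable through the snapshot, and the old address stays in `live_addresses()` until the snapshot itself is removed. A clone would work the same way, except that its copied map stays writable.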
So you are running MLC, which makes it cheaper, and you are doing inline dedupe, which makes it smaller. How much does it cost? Well, the starting point for a single controller with a single disk enclosure is around $100k street price (add $20k for dual controllers). You’ll get a 5.5TB disk enclosure but it’s only licensed for 2.7TB raw. Basically PureStorage invests together with you (marketing, haha). They could easily have done the same with half the number of disks, like some other vendors do, but then you would only get half of the performance. With an average reduction you’ll get about 10TB of net storage, so you’ll end up at a price point of $10/GB including maintenance AND ALL FEATURES ENABLED. That last point needs to be taken into consideration when looking at other solutions. Looking at performance, this box should give you around 100k IOPS. This is mooooore than enough for most environments.
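For those who want to check the math, here is the arithmetic as a few lines of Python. The inputs are the numbers quoted above; the reduction ratio and the price per raw GB are not vendor figures, they are simply what falls out of those numbers, and I am assuming decimal units (1 TB = 1000 GB).

```python
PRICE_USD = 100_000       # single-controller street price quoted above
LICENSED_RAW_TB = 2.7     # licensed raw capacity quoted above
EFFECTIVE_TB = 10         # net capacity after "average" reduction, as quoted
GB_PER_TB = 1000          # assumption: decimal terabytes


def usd_per_gb(price_usd, capacity_tb):
    """Price per gigabyte for a given capacity in terabytes."""
    return price_usd / (capacity_tb * GB_PER_TB)


# Derived values, not quoted by the vendor.
reduction_ratio = EFFECTIVE_TB / LICENSED_RAW_TB       # roughly 3.7 to 1
price_per_net_gb = usd_per_gb(PRICE_USD, EFFECTIVE_TB)
price_per_raw_gb = usd_per_gb(PRICE_USD, LICENSED_RAW_TB)
```

So the $10/GB only holds if your data actually reduces by about 3.7 to 1; per licensed raw GB the same box works out to roughly $37.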
This product for that price from this company … where do I sign please? Seriously, if I had the money and a business case where this fits, I would not hesitate. Let me know if you would and why.
* As I was writing my post I recalled this technical deepdive from Nigel Poulton, who was also on that trip to Silicon Valley earlier this year.
* PureStorage website: great business case for PureStorage + VDI – read here