Is using flash a good way to improve performance in a virtual environment? Can a virtualized storage infrastructure be an option?
Flash does some wonderful things. I like flash as cache in the hybrid role, personally. You can expedite a certain number of disks by writing files -- or data sets -- that are being accessed a lot to a flash device temporarily and serving those requests out of the flash layer, instead of out of the disk drive. That's a perfectly acceptable use of flash, and I've seen it used very wisely. In fact, X-IO built that intelligent storage element directly into its product in a very elegant way. I'm not down on flash completely, but I think there's a lot of oversell right now around the idea of tier-zero arrays and the idea of having all-flash arrays.
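The hybrid role described above, where frequently read data is copied to flash and served from there, can be sketched roughly as follows. This is a toy model, not how X-IO or any other vendor implements it: the class name `HybridCache`, the `promote_after` threshold and the least-recently-used eviction rule are all assumptions made for illustration.

```python
from collections import Counter, OrderedDict


class HybridCache:
    """Toy model: blocks read often enough from disk get copied to a small flash tier."""

    def __init__(self, flash_capacity, promote_after=3):
        self.flash_capacity = flash_capacity  # max blocks held in flash (hypothetical unit)
        self.promote_after = promote_after    # disk reads before a block counts as "hot"
        self.access_counts = Counter()        # disk reads seen per block
        self.flash = OrderedDict()            # block_id -> data, least recently used first
        self.flash_hits = 0
        self.disk_reads = 0

    def read(self, block_id, disk):
        # Serve from the flash layer when the block has already been promoted.
        if block_id in self.flash:
            self.flash.move_to_end(block_id)
            self.flash_hits += 1
            return self.flash[block_id]

        # Otherwise go to the (slow) disk and record the access.
        data = disk[block_id]
        self.disk_reads += 1
        self.access_counts[block_id] += 1

        # Promote hot blocks, evicting the least recently used one if flash is full.
        if self.access_counts[block_id] >= self.promote_after:
            self.flash[block_id] = data
            if len(self.flash) > self.flash_capacity:
                self.flash.popitem(last=False)
        return data


disk = {i: f"block-{i}" for i in range(100)}  # stand-in for a disk drive
cache = HybridCache(flash_capacity=2, promote_after=2)
for _ in range(5):
    cache.read(7, disk)
print(cache.disk_reads, cache.flash_hits)
```

With these settings the first two reads of block 7 go to disk and the remaining three are served from the flash tier.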
For the money, I could probably do things a lot faster by virtualizing my storage. And that sounds weird; I'm saying don't virtualize or be cautious about how you virtualize your servers, but I'm not saying be cautious about how you virtualize your storage. Storage is a lot easier to virtualize than the workloads that run on servers. Most storage controllers are actually running server operating systems. A lot of people don't notice that, but if you squint hard when you're booting a VMAX from EMC, you see a copyright logo from Microsoft because they're running Microsoft's Windows Server 2008 R2 operating system on a VMAX controller board. When you look at their replacement for Clariion, the VNX, that box is running Windows 7. So basically, what you've got is an OS environment running on a small circuit board that is a motherboard; it runs a little application that the vendor is charging too much for on the storage array, and that's how storage works.
Now, if I can go above all that, go above the layers of value-added software and just go across the storage hardware itself, everybody is selling a box with Seagate hard drives. There's no difference between brand X and brand Y at the hardware level. So we can virtualize that. We can surface those value-added functions that maybe you want to preserve at that virtualization layer and basically spread that goodness to all the storage rigs, and not only to the ones that have the name X, Y or Z on the side of the box. That drops your storage costs considerably.
The best implementation of a virtualized storage infrastructure I've seen is at DataCore Software in Fort Lauderdale. I use their DataCore SANsymphony R9 product on about 4 petabytes of storage that I have in my lab. So basically, what we do is we virtualize the storage, which is nothing but aggregating it all under the control of the software controller. I have a dual redundant server that runs this controller software so there's failover in case one of the server heads dies, and I carve virtual volumes out of all the massive amount of disk I've got and read and write to them through a layer of memory cache. And then, instead of using flash, I use DRAM, which is much more resilient than flash and doesn't lose half of its performance when you write to it. The first time you're writing to your flash card, you're going to get full speed out of it. The second time, you have to erase the cells that have been written before you write to them again. So, you have a decrease of 50% of the performance of a flash card.
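The idea of aggregating disk from different boxes under one software controller and carving virtual volumes out of the pool can be illustrated with a minimal sketch. It is not DataCore's design or API; the `StoragePool` class, its method names and the simple first-fit allocation are hypothetical, and caching and failover are left out.

```python
class StoragePool:
    """Toy model: capacity from any vendor's box is pooled, volumes are carved from the pool."""

    def __init__(self):
        self.free_gb = {}   # device name -> free capacity in GB
        self.volumes = {}   # volume name -> list of (device name, GB taken)

    def add_device(self, name, capacity_gb):
        # The pool does not care whose name is on the side of the box.
        self.free_gb[name] = capacity_gb

    def total_free_gb(self):
        return sum(self.free_gb.values())

    def carve_volume(self, name, size_gb):
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.total_free_gb():
            raise ValueError("not enough free capacity in the pool")

        extents = []
        remaining = size_gb
        # First-fit: take space from each device in turn until the volume is satisfied.
        for device, free in list(self.free_gb.items()):
            if remaining == 0:
                break
            take = min(free, remaining)
            if take > 0:
                extents.append((device, take))
                self.free_gb[device] -= take
                remaining -= take
        self.volumes[name] = extents
        return extents


pool = StoragePool()
pool.add_device("brand_x_array", 100)
pool.add_device("brand_y_array", 200)
print(pool.carve_volume("vol1", 150))
```

Here a 150 GB volume ends up spanning both arrays, which is the point of the virtualization layer: the volume is defined against the pool rather than against a particular box.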
There have been some other kinds of technologies introduced to try and spoof that a little bit, but the bottom line is that's how flash works. So flash is a bit of an oversell. I have a client right now -- a credit card company that does more than 1 million transactions a second -- and one of the leading flash vendors was just there trying to sell them their wares. The vendor said the maximum number of writes you can do to a cell location on a flash card is 250,000; we would burn through that limit in less than an hour on most of those products. And [since] they cost [about] $10,000 apiece, we'd be spending money like crazy to keep refreshing the flash components in the storage array.
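A back-of-the-envelope version of this endurance arithmetic might look like the sketch below. The write rate and the 250,000-write limit come from the figures quoted above; everything else is an assumption: that each transaction causes one cell write, that writes are spread perfectly evenly across `cell_locations`, and the function and parameter names themselves. Real cards differ in capacity and wear leveling, so treat the output as a way to frame a vendor test, not as a prediction.

```python
def seconds_until_worn_out(writes_per_second, endurance_cycles,
                           cell_locations, write_amplification=1.0):
    """Estimate how long flash lasts if writes are spread evenly over its cell locations.

    All parameters are illustrative assumptions except the values you plug in
    from a vendor's own specifications.
    """
    total_writes_available = endurance_cycles * cell_locations
    effective_writes_per_second = writes_per_second * write_amplification
    return total_writes_available / effective_writes_per_second


# Worst case: every one of 1 million writes per second lands on the same location.
print(seconds_until_worn_out(1_000_000, 250_000, cell_locations=1))
```

The estimate scales linearly with the number of cell locations the writes are spread over, so the figure to get from the vendor is how many locations the workload would actually be spread across.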
So do the math. Go into it with your eyes wide open and tell the vendor you want a test first. The other nice thing about a virtual storage infrastructure is that when you want to move workloads around using vMotion or something, you can move the data that goes with it around too -- that virtual volume can move with the workload, so it's going to save you a lot of money in terms of your basic storage spend, and a lot of money in terms of how many times you need to replicate the same data to take advantage of all those cool things they talked about in the VMware brochure.
This was first published in March 2013