3PAR is thinking about nothing in particular with its latest announcement: targeting and removing zeroes from migrated-in volumes and getting rid of them in thin copies and snapshots. Its idea is to use zero detection facilities in the third generation of the ASIC hardware in its InServ storage array, along with a so-called thin …
...that's what this sounds like to me at least, and for general-purpose storage it will presumably have exactly the same problems as that normally engenders unless the book-keeping is kept up to date. E.g. your 100TB allocations are reduced to 25TB and the 1PB array reckons it can fit 40 of those in... fine, until anything is written to a bunch of them.
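The arithmetic behind that worry can be sketched roughly as follows. The figures are the ones quoted above (1PB taken as 1000TB); the function and variable names are hypothetical and only there for illustration.

```python
# Figures from the comment above; every name here is illustrative only.
ARRAY_TB = 1000      # 1PB array, counted in decimal TB
ALLOCATED_TB = 100   # size each volume was provisioned at
USED_TB = 25         # real space each volume takes once zeroes are removed


def volumes_that_fit(array_tb, per_volume_tb):
    """How many volumes fit if each consumes per_volume_tb of real disk."""
    return array_tb // per_volume_tb


def overcommit_ratio(volume_count, allocated_tb, array_tb):
    """Capacity promised to hosts divided by capacity physically present."""
    return volume_count * allocated_tb / array_tb


thin_count = volumes_that_fit(ARRAY_TB, USED_TB)         # volumes the array "reckons" it can hold
thick_count = volumes_that_fit(ARRAY_TB, ALLOCATED_TB)   # volumes it can hold if all fill up
ratio = overcommit_ratio(thin_count, ALLOCATED_TB, ARRAY_TB)

print(thin_count, thick_count, ratio)
```

With these numbers the array accepts 40 volumes but could only back 10 of them in full, so it has promised four times what it physically holds.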
"It should appeal to IT budget holders averse to paying for nothing"... but you're not necessarily paying for nothing, are you? You still can't safely put any more data into the array because you run the risk of over-allocation. It'll make moving the volumes about much quicker though, and so could be useful for snapshots and for volumes kept for re-export only, but it doesn't support their more general claim that the storage array is more efficient than another.
Is it me...
Or is this possibly going to create the weirdest error messages? Say it's a 10GB disk (I know, I know, just for easier maths) with a 10GB allocated but empty file - normally the drive would be full now. This thing spots and compresses it to <1GB, leaving 9GB free. A.N.other comes along with 5GB of real data, all is fine. Something comes along to use its allocated 10GB space and "uh-oh, disk is full, can't do this". But didn't I allocate the space? Well... yes... but not anymore.
Not to mention, allocating empty space is meant to be useful for files likely to grow, performance-wise. Virtual disks cry out for you to allocate it in one go to allow the VM to perform optimally.
Small, cheap, fast
Pick any two.
Applies to most things in the computer world.
Re: Is it me...
Doesn't work like that - you still allocate the meta-data that says a block is allocated, but you don't use up the capacity for that block. Therefore the thin provisioning software knows that the "virtual" volume has 10GB allocated, but physically it is only using <1GB of real capacity. So when the user tries to write 5GB it fails, because the disk is full at the "virtual" level...
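A minimal sketch of that distinction, assuming a toy model rather than anything 3PAR actually ships: the hypothetical `ThinVolume` class below tracks allocated (metadata) space separately from physical space. As a simplification it charges zeroed space nothing at all, where the comment above says "<1GB", and it does not model overwriting zeroed blocks with real data later.

```python
class ThinVolume:
    """Toy model: allocation metadata and physical usage are tracked separately."""

    def __init__(self, virtual_gb):
        self.virtual_gb = virtual_gb
        self.allocated_gb = 0   # space the metadata marks as allocated
        self.physical_gb = 0    # real capacity consumed on the array

    def write(self, size_gb, all_zero=False):
        # The capacity check happens at the virtual level, not the physical one.
        if self.allocated_gb + size_gb > self.virtual_gb:
            raise OSError("disk is full (virtual level)")
        self.allocated_gb += size_gb
        if not all_zero:
            # Only non-zero data consumes real capacity in this model.
            self.physical_gb += size_gb


vol = ThinVolume(virtual_gb=10)
vol.write(10, all_zero=True)   # the 10GB allocated-but-empty file
print(vol.allocated_gb, vol.physical_gb)

try:
    vol.write(5)               # A.N.other's 5GB of real data
except OSError as err:
    print(err)
```

In this model the second write is refused straight away, so the space set aside for the empty file is never handed to anyone else.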
One of the interesting things I thought about when I first heard about VMware FT was the fact that if you want to run a VM with FT, the VM must be running "thick" disks and they must be zeroed out (VMware calls it "eager zeroing"). I don't know why this is a requirement, but FT won't work (as is) without it. With this new thin persistence stuff you can keep your FT volumes thin, and not incur any I/O penalty when VMware is making them "thick". Now if only there was a way to convert from thin to thick in VMware from a VMFS perspective online (I don't think there is but could be wrong).
To the "Is it me" poster, how's this for an example. MySQL DB table consuming 300GB of space, you delete 250GB of data out of that table; to "reclaim" it from MySQL's perspective you must optimize that table, which re-writes the table out. So when you're done you have written ~350GB of data, but MySQL (and the file system) says there's only 50GB on the disk. SAN says 350GB is in use, file system says 50GB.
It's one of the gotchas with thin provisioning in general: it's dedicate-on-write. At least until this sort of thin persistence stuff gets out.
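The numbers in that example work out as below. This is only a back-of-the-envelope sketch: the function name is hypothetical, and it assumes the rewritten table lands on blocks the array has not seen before and that nothing tells the array the old blocks are free.

```python
def usage_after_optimize(table_gb, deleted_gb):
    """Return (file_system_gb, array_gb) after deleting rows and rewriting the table."""
    live_gb = table_gb - deleted_gb
    # The array never releases the original blocks, and the rewritten
    # copy of the live data is dedicated on write on top of them.
    array_gb = table_gb + live_gb
    # The file system only counts the rewritten table.
    file_system_gb = live_gb
    return file_system_gb, array_gb


fs_gb, san_gb = usage_after_optimize(table_gb=300, deleted_gb=250)
print(fs_gb, san_gb)
```

The 300GB gap between the two views is the space that reclamation of zeroed blocks is meant to hand back.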
It's not magic.
You still need to track what you've got and monitor your systems so you stay ahead of your customers' requirements. Just means that most of the time you only need to buy (and power and cool) half the disk that your Windows boxes think they need.
Just make sure when you buy your array that the thin provisioning they implement allows you to overprovision should you choose to - not all of them do. If not, you're either left with half a rack of disk which thinks it's "full", or a full rack with no thin volumes. Genius.
Software vs ASIC
Chris, re your assertions about "software solutions" and "most mature thin provisioning": I'd question the 'advantage' of having to develop an ASIC to perform a function that is basically memory compares. The Intel SSE hardware, which does direct memory compares at 64-bit word size with almost no impact on other standard CPU processing, is a much more sustainable and cost-effective way of doing zero detection. It took us 20 lines of assembly to do what 3PAR needs an ASIC macro for. Given the speed of Nehalem Xeons, with 20+ GB/s memory bandwidth per die... Similarly, since 3PAR are, as I've said before, the "grand-daddy" of thin provisioning, what have they been doing for the last 5 years? SVC got "Space-efficient" this time last year, and now we can migrate thick to thin, strip out zeros inline... and lots more on the horizon. I really don't think ASIC-based offload hardware is the future of our industry; commodity CPU with offload software is much faster to develop and generally more agile.
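To illustrate the "it's basically a memory compare" point, here is a rough software sketch of zero detection. It is plain Python, not the SSE assembly mentioned above, and the block size and function names are assumptions made up for the example, not figures from 3PAR or SVC.

```python
BLOCK_SIZE = 16 * 1024  # hypothetical block size, chosen only for the example


def is_zero_block(block: bytes) -> bool:
    """True if every byte in the block is zero (a single buffer compare)."""
    return block == bytes(len(block))


def non_zero_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Map offset -> block for each block that holds real data.

    Blocks that are entirely zero are skipped, so they need no backing storage.
    """
    kept = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        if not is_zero_block(block):
            kept[offset] = block
    return kept


# Three blocks on the way in; only the middle one contains data.
incoming = bytes(BLOCK_SIZE) + b"x" * BLOCK_SIZE + bytes(BLOCK_SIZE)
print(sorted(non_zero_blocks(incoming)))
```

Reading an offset that is missing from the map would simply return zeroes, which is what makes it safe to drop those blocks.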
Deduplicate Zeros & Most Mature ?
I'd agree with 'most mature *hardware-based* thin provisioning'.
But if you compare the features described in the article with what NetApp offers in their boxes, it's only a fraction. For example:
- specific space guarantees for volumes, LUNs, files
- volume autogrow
- snapshot (operational backup) autodelete
- deduplicating ANY 4K blocks of data (not just zeroes)
With all these safety measures built in, it's perfectly safe to use Thin Provisioning, provided you do a little monitoring, too.
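For comparison with zero-only detection, block-level deduplication of arbitrary data can be sketched like this. It is a generic illustration, not NetApp's implementation: the function names are hypothetical, and it assumes that two blocks with the same SHA-256 digest can be treated as identical.

```python
import hashlib

BLOCK_SIZE = 4096  # the 4K block size from the list above


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into blocks and keep one copy of each distinct block."""
    store = {}    # digest -> block contents, stored once
    layout = []   # digests in original order, enough to rebuild the data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        layout.append(digest)
    return store, layout


def rebuild(store, layout) -> bytes:
    """Reassemble the original data from the block store and the layout."""
    return b"".join(store[digest] for digest in layout)


data = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE + b"a" * BLOCK_SIZE
store, layout = dedupe(data)
print(len(layout), len(store))
```

An all-zero block is just one more repeated block here, so zero removal falls out as a special case.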
I agree with Barry: software is a lot more flexible, and these days pretty fast...