7 posts • joined Tuesday 20th October 2009 17:20 GMT
The biggest StorageGrid installation I've heard about is 800 PB on 3 continents. I forgot how many sites, but it was quite a few... so apparently some people DO put more than 100 PB in one namespace.
"ONTAP 8.1 could unify the separate 7G and 8.01 strands of ONTAP, and position NetApp to provide more scale-out features to cope with larger data sets."
Now really, Who would want to unify those two??
8.0.x comes in two flavours: 7-Mode (ex-7G) and Cluster-Mode (ex-GX). Same code-base, different environment variables to tell the system in which mode to start.
More Scale-Out? Cluster-Mode already gives you up to 28 PB (on 24 nodes) in a single namespace. I'm teaching NetApp and talked to a student 2 weeks ago, who will be setting up a new 5PB system in the coming weeks...
(8.0.2RC1 already supports 3TB drives which will proportionally increase the above numbers...)
"We might expect more use to be made of flash memory devices, something around data storage efficiency, and clustering improvements."
More flash? The 6280A already supports 8TB PCI-Flash. With 8.0.2 (RC already publicly available) this will increase to 16TB... SSD shelfs are supported for about a year already.
Storage Efficiency? Compression is available as of 8.0.1 (7-Mode). You can even combine it with deduplication where it makes sense.
No, 8.1 is simply the convergence of 8.0.x 7-Mode and Cluster-Mode with added benefits.
Didn't the benchmarks prove otherwise??
"Your cache becomes a bottleneck, because it isn't large enough to handle the competing storage access demands.
Better to build the whole array out of Flash, and (if necessary) use on-controller RAM for cache, methinks, than to use a flash cache with mechanicals hanging off the tail-end. "
The FlashCache on the NetApp system was only a fraction of the dataset used in the benchmark (~4.5% of the dataset, ~1.2% of the exported capacity), yet it had 172% more IOPs, delivering them with ~half the latency (198% faster). The system also had 488% the exported capacity at a fraction of the cost of the all-SSD system...
As opposed to the 'large sequential' workload, where predictive cache algorithms can really shine, I see only one use case for SSDs:
Unpredictably random read workloads, where the whole dataset fits into the SSD-provided space. I know one such application, NetApps with SSD-shelfs were deployed after extensively testing multiple vendor's solutions.
"that their systems are different in kind from mainstream clustered filers such as NetApp's FAS series. NetApp would disagree of course."
I guess NetApp would agree, actually. And sell you an ONTAP8 Cluster-Mode System. 24 nodes, 28PB capacity. If you need more, there's always NetApp's StorageGRID...
Automatic Sub-volume level movement??? Non-sensical on NetApp!
"The hints are strong that Data Motion is going to get automated and, hopefully, there will be sub-volume level movement to provide more efficient control of hot data placement, as NetApp's competitors do."
Now that makes no sense whatsoever on a NetApp. If only part of a volume is 'hot', already the second read-access to it would find it in (Flash-)Cache. Writes are always acknowledged to clients as soon as they're safe in NVMEM, so no speeding up necessary there. They're also automatically in Cache, so the next read would already find them there.
98% of use cases are transparently accelerated by Flash-Cache. You can put up to 8TB in a NetApp HA pair...
The other 2% (I know one case) can actually benefit from SSDs. But then it's a *whole* dataset that needs to be fast *all* the time, at the *first* access (because there will probably be no second any time soon) and you wouldn't want the risk of having to warm up the cache after a controller failover. That dataset would go to SSD as a whole, not 'sub-volume'...
I can see lot's of use-cases for automatic DataMotion, but (sub-volume) 'FAST' isn't one of them.
Disclosure: I'm a NCI, a technical Instructor, teaching people (among other things) how to use NetApp controllers. I like their technology, but I'm not an employee...
Deduplicate Zeros & Most Mature ?
I'd agree with 'most mature *hardware-based* thin provisioning'.
But if you compare the features described in the article with what NetApp offers in their boxes, it's a fraction only. Like
- specific space guarantees for volumes, LUNs, files
- volume autogrow
- snapshot (operational backup) autodelete
- deduplicating ANY 4K blocks of data (not just zeroes)
With all these safety measures built in, it's perfectly safe to use Thin Provisioning, provided you do a little monitoring, too.
I agree with Barry, Software is a lot more flexible, and these days pretty fast...
FCoE is an approved standard since June 3, 2009
You said: There are standardisation efforts with FCoE and data centre Ethernet afoot.
'On Wednesday June 3, 2009, the FC-BB-5 working group of T11 completed the development of the draft standard and unanimously approved it as the final standard. '
See also: http://www.fibrechannel.org/component/content/article/3-news/158-fcoe-standard-alert-fcoe-as-a-standard-is-official-as-stated-by-the-t11-technical-committee