NetApp CEO Tom Georgens thinks 3PAR, Dell, EMC, Compellent and others are wrong - automated tiering of data across different levels of drives is a dying concept. The tiering idea is that data should be on the correct type of drive at any point in its life cycle. Fast access data should be on fast access solid state drives (SSD) …
more or less agree
"Now it's saying that anything beyond automated data movement between SSD and SATA is a short-term fix to a dying problem."
I'm inclined to agree on this one, even if it seems a bit outside mainstream thinking. Why bother with tiering? The software can have access to every performance counter, after all, so it should only be a matter of time before data is placed on the right medium automatically. Who cares about OS memory/swap management these days?
"There is also just a scent of NetApp not being that excited about scale-out NAS, as IBM is with its SONAS product. Unlike previous earnings calls, NetApp did not mention Data ONTAP 8 and its clustering."
On this one, I think NetApp is making a mistake. They'll probably watch the Pillar/Oracle/IBM train pass them by and devour their market share. Scale-out is the way, particularly in these cash-strapped days.
Anon, as the above may be business intelligence to vendors.
In my experience, sooner or later you'll have exhausted your cache and start thrashing the disks,
but then again, it was a CEO speaking, so what do I know about storage operation...
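To put a rough shape on the thrashing point, here is a minimal sketch, assuming a plain LRU cache and a made-up workload that sweeps the same blocks in order. The cache size, working-set sizes and function names are all hypothetical and are not taken from any vendor's implementation.

```python
from collections import OrderedDict

def lru_hit_rate(accesses, cache_blocks):
    """Replay a block-access trace through an LRU cache; return the hit fraction."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0

def cyclic_trace(working_set, passes):
    """Hypothetical workload: sweep the same blocks in order, several times."""
    return [b for _ in range(passes) for b in range(working_set)]

# Assumed sizes, chosen only to show the two regimes.
fits = lru_hit_rate(cyclic_trace(working_set=100, passes=10), cache_blocks=128)
spills = lru_hit_rate(cyclic_trace(working_set=200, passes=10), cache_blocks=128)
print(f"working set fits in cache: {fits:.0%} hits")
print(f"working set exceeds cache: {spills:.0%} hits")
```

Real arrays use smarter replacement policies and real workloads are rarely a pure sweep, so treat this as the worst case: once the working set outgrows the cache, every read in this toy model goes to disk.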
Earth Round, Pope Catholic
Well they would say that, seeing as they don't really 'do' high end.
Hardly an earth-shattering revelation from a storage vendor, is it? "Use cache to reduce required spindles or disk speed." We've all designed solutions doing exactly the same thing. They just happen to have a fancy name for it and a slightly different way of implementing it.
I think the markets for both drive types will still exist for a while though. They're not really *that* different these days; any differentiation is just a choice of the drive manufacturer, who quite likes his SAS / Enterprise SATA margins, thanks very much.
Always an interesting discussion when someone says that a solution 'requires' SCSI or SAS, because you know SATA drives and controllers haven't sped up at all in the last 5 years. "But I need high throughput......"
"Always an interesting discussion when someone says that a solution 'requires' SCSI or SAS, because you know SATA drives and controllers haven't sped up at all in the last 5 years. "But I need high throughput......""
Just made me think, off-hand, what would happen if a hard drive implemented 2 read/write arms (positioning the additional one on the opposite side/corner), instead of just one? Simultaneous data access anyone?
Mine's the one with the patent application in the pocket.
They would say that #2
The reason some vendors might sidestep offering automated tiered storage is that it's hard to do right and their architecture may prevent it. If NetApp can’t offer this as an automated feature they will have to manage this shortcoming. My firm has installed automated tiered storage from Compellent and achieved major cost-savings at every one of more than 80 sites across the UK and there is no sign of any slowdown for this feature; demand for it continues to grow.
To think that storage tiering is dying is like calling the earth flat. It shows a lack of understanding, in this case of the high-end storage market, which is not surprising given the source of this comment. It may die in about 10+ years, but not any sooner. It is just now being implemented in most large IT shops. The storage industry moves at glacial speed, especially when adopting new technologies. SSD adoption has been minimal to date for a variety of reasons and will still be minimal years from now, primarily because of the cost delta relative to HDDs. Tiering, especially performance-based tiering, whether based on read/write latency or IOPS, is a core storage capability that will be implemented broadly by all large IT shops. Given that NetApp has no equivalent automated performance-based tiering capability, it is not surprising that they would bash the tiering concept. Self-serving to say the least.
With respect to the scale-out comments, NetApp has to be really upset that IBM did not choose to build their SONAS around OnTap8. As NetApp's largest NAS OEM, IBM had full access to all the detail around OT8 and surely evaluated it in great detail. This is an embarrassment to NetApp and hints at the end of the NetApp/IBM relationship given the substantial overlap in products. Witness the Dell/EMC relationship to see how this will play out. Given it has taken 7 years for NetApp to bring OT8 out after the Spinnaker acquisition, it's not surprising IBM lost faith in them.
Having No Tears, Clowns Imagine Them Useless
If NTAP practically supports just two storage tiers (FC RAID 6 and SATA RAID 6) and doesn't allow users to use different tiers for production vs. snapshot volumes, NTAP might very well minimize their value.
In the architecture and customer base with which I'm most familiar (I work at 3PAR), I can tell you tiers are must-have technology. Especially in virtual server environments, the ability to support multiple tiers simultaneously and to move non-disruptively between them is essential to the efficiency and agility customers require. If you can do it from a shared, reservationless pool of disks, and tier snapshots independently, all the better. Across the 25 or so tiers we support, various customer applications and environments leverage them all and move between them as needed.
Now I am really confused
I watched the Rap, and then this informative video...
25 tiers huh? That sounds like some marketing fluff if I've ever heard it. Please, enlighten us all as to what your 25 tiers are.
As for virtual environments "requiring" tiering... that's the biggest load of crap I've ever heard. Virtual environments require 15k drives with low latency as the workload looks nearly identical to a database as far as storage is concerned. Tiering it is a great way to piss off end users when they start having applications time-out while your storage array decides that "whups, I guess that block was on the wrong tier".
and... what a shocker, 14 years of marketing experience.
"Geoff has over 14 years of finance and marketing management experience in the enterprise storage and server industries."
NetApp is doing "tiering" and scale out, but dedupe and other features are more important.
Seems netapp is keen on the 8.x roadmap, which is pretty heavy on both scale out and tiering features... I think the difference is that they are doing so without having 20 different levels of drive technology in the mix.
I don't necessarily see huge improvements with write performance for SSD vs. SAS/SATA or FC, as write performance is generally not an issue, which is the primary reason SSD adoption is low. Using PCIE cache controllers addresses more important performance areas, esp. since NetApp's controllers are a little slim on the memory (HINT NTAP, HINT)
As for scale out, I'm guessing most customers aren't running a renderfarm or HPC environment; GX is and has been available to meet these customers' needs for some time. Also, the massively parallel HDFS/GFS type web architectures require a different product strategy and interfaces than NFS/CIFS/FCP/ISCSI type storage devices.
I find it ironic that while some vendors are forcing ZOMG SCALE OUT! and unnecessary tiering sizzle down customers' throats, they are ignoring the pressure in most shops these days to *reduce* the number of boxes and the complexity in their environments. This means fewer disks and fewer controllers (esp. disks).
I'm not saying that scale out doesn't have its place (the Ibrix guys work for NetApp IIRC) but it's worth less to more folks than technologies like deduplication and balancing out IOPS/capacity tradeoffs with PAM.
Don't de-dupe your primary!
NetApp really are flogging a dead horse with the primary de-dupe argument. Firstly, why would you spend all that money on big enough controllers to support the performance requirements of your applications only to then ask them to carry out de-dupe? And it's a big ask; it kills the performance. Oh, and then because your primary workloads are likely to include databases, which you can't dedupe using ASIS, you don't actually see much space saving anyway. With disk capacities getting bigger, a lot of environments end up with more capacity than they need anyway just to satisfy the spindle requirements for their performance needs. So there is spare space already; why turn your very expensive NetApp box into a de-dupe engine that will save you a paltry few % of space for a performance overhead?
Tiering, caching, HSM, etc.
It is all in how you draw it on the whiteboard. This is a silly argument.
Integrating the most efficient bits into the right place in "the stack" will always be a good idea. Everyone will try to do it, or make the case that a single "uber-easy to manage", one way fits all approach is best.
I am not aware of anyone who does not agree that SSD and SATA will squeeze out FC spinning disk. But that will happen over time.
Won't Shed a Tier
To answer Coward’s earlier question about 3PAR tiering, the number of tiers (service levels) stems from the number of unique combinations of RAID type (0,1,5,6), drive type (FC,SATA), degree of RAID efficiency (at least 2 meaningfully different degrees per RAID type, except RAID 0), and degree of RAID isolation (by drive magazine or shelf). Even in highly consolidated environments with many different applications, a given user may use only a handful of tiers. However, one user’s ‘handful’ may be quite different from another’s based on circumstances. A user’s choice of tiers comes from balancing SLA requirements like application availability and performance needs (r/w mix, block size, random/seq) with financial, physical, and time constraints like OpEx/CapEx budgets, schedules, floor/rack space, power/cooling, etc.
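For anyone who wants to check the arithmetic, the combinations listed above can be enumerated in a few lines. This is only a sketch under a stated assumption: it takes exactly 2 efficiency degrees per RAID type (the text says "at least 2") and a single one for RAID 0. The degree labels are placeholders, not product terminology.

```python
from itertools import product

raid_types = ["RAID 0", "RAID 1", "RAID 5", "RAID 6"]
drive_types = ["FC", "SATA"]
isolation_levels = ["magazine", "shelf"]

def efficiency_levels(raid):
    # Assumption: exactly 2 degrees per RAID type, and only 1 for RAID 0.
    # "degree A"/"degree B" are placeholder labels.
    return ["n/a"] if raid == "RAID 0" else ["degree A", "degree B"]

tiers = [
    (raid, eff, drive, iso)
    for raid in raid_types
    for eff in efficiency_levels(raid)
    for drive, iso in product(drive_types, isolation_levels)
]

print(len(tiers))  # 7 RAID/efficiency pairs x 2 drive types x 2 isolation levels = 28
```

Under those assumptions the count comes out at 28, which is in the same neighbourhood as the "25 or so" quoted earlier in the thread; the exact figure depends on which combinations are actually offered.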
With respect to Coward’s observation that “virtual environments require 15k drives with low latency as the workload looks nearly identical to a database as far as storage is concerned,” this is generally true with respect to traditional array technology. However, it doesn’t hold with massively parallel storage architectures. In 3PAR’s case, for example, every controller and every drive of a type participate equally in servicing each volume. Dedicated ASICs within each controller also accelerate RAID 5 and 6 by performing XOR calculations in HW. These features allow users to leverage SATA drives and RAID 5/6 widely in virtual server environments. Here is one such use case as discussed on YouTube by the CTO of Terremark, a leading hosting services provider: http://www.youtube.com/watch?v=bhmyMxnriRo (most relevant portion begins at 5:45).
We're not good at it, so.. "Tiering is dead."
Wow, how obvious can it be that NetApp is giving up on the tiering of storage to go with a simpler and easier to design solution. If you look at Compellent's capabilities and the options presented to customers, it offers a multitude of tiering options so that customers can decide the what and how of tiering.
It's kind of frustrating when a vendor goes about telling me, the customer, what I do and don't need.
Re: Don't de-dupe your primary! && moar FUD
Dedupe isn't for every data set, but it works great for many of them. With vmdks in particular, and other types of big-ass files, you can see up to 75% savings in some cases; add block compression and you can squeeze out some impressive space savings. As for the performance penalty, it's just CPU, which is cheaper than memory. Since it runs as a low-priority background process, it affects performance as much as disk scrubs do (they don't, FWIW).
This means you can actually get more efficiency out of things like SSD, improving overall performance as deduped data sets fit better in memory. As you point out, there is a constant discrepancy between "extra" space due to drive capacity and IOPS requirements. Dedupe + PAM (or just more memory netapp plz kthnx) evens out this tradeoff between the IOPS and capacity requirements.
Geoff, Texas Todd, 3par and Compellent ....
You guys are awesome block heads (Not meant as an offence, just pointing out that you only do block protocols) but you're missing a couple things that are different in the HPC and to a certain extent the "cloud" space.
1) NFS 4.1 means the "clients" can do the tiering. They can frankly do it better than any array can. Heck, you can't get clever with the automounter and NFS v3 to achieve similar results. This magic block tiering strategy is a head scratcher for these environments.
2) I wouldn't describe any FC block based system as massively parallel. Topping out at 8 or 16 controllers is not massively anything. I would argue that both compellent and 3par products are impressive, but not *MASSIVE*. NTAP, ibrix, and Isilon are larger, but aren't *MASSIVE* either. Well, maybe Ibrix but I digress.
3) Most virtual environments, even Netapp ones, keep virtual machines on different "tiers" of WAFL (a higher level abstraction than "raid group" or "disk"), and cache on cache architectures are much more effective for this than having the array decide which blocks are hot for random read workloads. Since Netapp arrays don't really have much in terms of write performance issues, this auto tiering doesn't do much for this corner case that might exist in a block environment.
For Netapp tiering just get FC or SAS and configure those as caching volumes. (you can do this in the same controller, or an outboard cache)
Next, for a VMDK farm of 8,000 VMs that live on SATA, simply pre-populate the cache tier by letting it warm up naturally or, if you're in a hurry, run ls -la across the directory tree.
Let PAM handle the rest.
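If you'd rather script the "ls -la across the directory tree" step, a minimal sketch could look like the following. It assumes the datastore is mounted on the host running the script; the function name and the path in the usage note are hypothetical. Note that, like ls -la, it only reads metadata (an lstat per entry), not file contents.

```python
import os

def touch_metadata(root):
    """Walk root and lstat every entry, roughly what a recursive `ls -la` does."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                count += 1
            except OSError:
                pass  # entry vanished or is unreadable; skip it
    return count
```

Usage would be something like touch_metadata("/mnt/vmdk_datastore"), where the mount point is made up for illustration. Whether a metadata walk is enough to warm a given caching volume is the claim made above, not something this sketch verifies.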
2/3rds of the above technologies have been around since 2002 or prior, adding PAM just makes it go faster and adds an additional "tier". Granted, this assumes you're running vmware over NFS, but it makes more sense to me than having 25 different "tiers" of raid types and RPMs
My last comment... I swear.
/goes back to kicking his HDFS cluster