Store data in EMC Centera and lose it - that's the claim of NetApp blogger Val Bercovici, and he cites a Symantec support document entitled 'Archiving items in Enterprise Vault may result in an extremely rare data loss situation' to prove it. Bercovici, who works in the office of the CTO (Chief Technology Officer) at NetApp, …
A bad NetApp move
Absolutely; NetApp was irresponsible in their crowing about Symantec's bug -- they're a Symantec partner too. The only anti-Centera nugget of truth that can be extracted from the whole affair is the point that Centera uses a proprietary and baroque API, and so applications written to it are likely to be less fully tested and more prone to error. Even that is rather tenuous. Still, it is better to use standardized interfaces like NFS and CIFS or, in the future, XAM.
Does the Reg use Centera? 2nd attempt at comment :)
First of all - thanks for bringing MUCH needed attention to this whole issue. I also commend your attempts at objectivity.
However, as they say, the plot THICKENS! :-)
As you may be aware, the original title of the Symantec KB Article is:
"Archiving items in Enterprise Vault to an EMC Centera may result in data loss."
I'll leave it as an exercise to the reader regarding what "behind the scenes" activity inspired the change once I highlighted it on my blog :)
Regardless, the simple facts remain that this newly revised KB article still shows one, and only one archiving platform vulnerable to data loss - EMC Centera.
There is no Symantec (or any other popular archiving ISV for that matter) KB article warning of potential risks with archiving to NetApp SnapLock. Will there ever be? Who knows, but I like NetApp's odds due to one key difference - SIMPLICITY.
Call it what you will, but the EMC Centera API is a huge and complex beast to work with. NetApp's (optional) SnapLock API is the model of simplicity by comparison and is often unnecessary since nearly all archiving vendors support direct filesystem (based on standards) access anyway.
It is a gross oversimplification to correctly label both Data ONTAP and EMC CentraStar as complex pieces of software - yet then conclude that their resulting levels of data integrity are similar. Especially when an empirical Google search will yield many examples of data loss with one and none with the other.
I personally find it fascinating that some companies (such as Procedo) have built entire practices (i.e. PAMM) around migrating data away from EMC Centera onto safer platforms.
I guess where there's smoke...
Office of the CTO, NetApp
Consider this FUD attack "Exposed"
Val said: "There is no Symantec (or any other popular archiving ISV for that matter) KB article warning of potential risks with archiving to NetApp SnapLock. Will there ever be? Who knows, but I like NetApp's odds due to one key difference - SIMPLICITY."
Continuing the pattern of not doing any research and hoping no one will notice I see.
Let's look at Symantec KBase article 316205.
"Data loss may occur when two or more Vault Store Partitions are located on the same physical location..."
"The following scenarios do not perform a check and will allow the same location to be created:
When a NetApp is used, this check is not performed.
Note: This does NOT affect an EMC Centera device in terms of data loss."
You've been on the wrong side of this since the very first moment, Val. I told you as much in the first comment on your post, where after paragraphs of FUD you still couldn't identify the problem with Enterprise Vault not archiving the data to Centera, even though it was all in the advisory if you understood what you were reading. I explained it in what? Two sentences?
What I lack in word count I make up for with accuracy and hands-on experience.
But one must ask: is this what goes on in the NetApp CTO Office? They trot out a partner's tech advisories in public, read them incorrectly, and when the partner publicly says they're wrong, they claim a Google search is proof enough, since in the absence of verifiable fact they have nothing else to support the opinion they're pushing?
Regardless of that, none of the verifiable facts (verifiable by a third party you dragged into this) support your initial or current opinions, Val, leaving us with this ongoing FUD campaign of yours.
The industry equivalent of Quixote tilting at windmills.
There is no data integrity issue with Centera, you made it up. Now having caused a public incident with a partner you appear to have moved on to embarrassing your employer.
Backup, Recovery and Archiving Subject Matter Expert. EMC Corporation.
Technical facts now available
Detailed technical facts about the Enterprise Vault data loss bug are now available over at The Storage Anarchist's blog.
It should be noted these are verified facts and not opinions of a competitor with a FUD agenda.
Straight from the EMC Playbook
Hi Marc / Barry,
Classic maneuver, right from the EMC playbook. Personalize the discussion and attack the whistleblower to distract from the facts.
Thanks for playing along:
Val, you're an idiot
You picked the right company to fight with, but completely the wrong topic to fight about. Not sure how you'll recover from this one.
Flames, because you crashed and burned.
The Exposure Continues
Hello Coward and other commenters,
Please do keep the comments coming! My goal is to add exposure to the key topic of compliance archive data integrity, not to win tete-a-tete battles over 3rd party knowledgebase semantics.
Transparency on this topic is very important to me, and I've decided putting up with online abuse is a small price to pay for the increased customer trust this exercise will result in once disturbing veils of secrecy around EMC Centera data integrity are finally removed.
Wrong and Right
Chris, you're on the wrong side of this issue, but for technically innocent reasons. Val has exposed a malicious attack scenario (involving user-generated MD5 collisions) which archiving developers like me had never accounted for back in 2003/2004 when we developed our initial integration with the Centera API.
Bottom line, all of the early archiving implementations on Centera up to version 1.2 (including KVS, IXOS, iLumin, EDUCOMM, et al) are vulnerable to this data loss scenario because EMC configured collision detection OFF as the default in order to enable the popular Single Instance Storage (SIS) feature.
That means this troublesome Symantec KB article has relevance on the Centera side of the equation, not just the EV side.
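For readers who want to see the shape of the scenario described above, here is a minimal sketch of a content-addressed store with single-instance behaviour. Everything in it is an assumption made for illustration: `ToyContentStore`, `weak_address` and `find_colliding_pair` are hypothetical names, none of it is the Centera API, and the deliberately truncated hash only stands in for a crafted MD5 collision so that a colliding pair is easy to find. It shows the general mechanism being claimed, not how any real product behaves.

```python
import hashlib


def weak_address(data: bytes) -> str:
    """Toy content address: first 2 hex chars of MD5.

    Truncated on purpose so collisions are trivial to find; a stand-in
    for a crafted full-MD5 collision (assumption, demo only).
    """
    return hashlib.md5(data).hexdigest()[:2]


class ToyContentStore:
    """Hypothetical content-addressed store; NOT the Centera API."""

    def __init__(self, address_fn=weak_address, collision_detection=False):
        self.address_fn = address_fn
        self.collision_detection = collision_detection
        self.blobs = {}

    def write(self, data: bytes) -> str:
        addr = self.address_fn(data)
        existing = self.blobs.get(addr)
        if existing is None:
            self.blobs[addr] = data
        elif self.collision_detection and existing != data:
            # Detection ON: compare the bytes, refuse a false duplicate.
            raise ValueError(f"address collision at {addr}")
        # Detection OFF: same address is assumed to mean same content,
        # so the new bytes are dropped as a "duplicate".
        return addr

    def read(self, addr: str) -> bytes:
        return self.blobs[addr]


def find_colliding_pair(address_fn):
    """Brute-force two different messages that share one address."""
    seen = {}
    i = 0
    while True:
        data = f"message-{i}".encode()
        addr = address_fn(data)
        if addr in seen:
            return seen[addr], data
        seen[addr] = data
        i += 1


first, second = find_colliding_pair(weak_address)

# Detection OFF: the second write "succeeds" but its bytes are gone.
lenient = ToyContentStore(collision_detection=False)
addr = lenient.write(first)
lenient.write(second)
second_was_lost = lenient.read(addr) != second

# Detection ON: the same write is rejected instead of silently dropped.
strict = ToyContentStore(collision_detection=True)
strict.write(first)
try:
    strict.write(second)
    collision_reported = False
except ValueError:
    collision_reported = True
```

In this toy model the caller of the lenient store gets back a valid-looking address for the second item and has no signal that its content was never stored, which is the kind of silent loss being argued about. The strict store pays for a byte comparison on every duplicate address but surfaces the problem at write time.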
See my latest update on why and how:
The plot thickens indeed!
Vinanti (or should I call you FemmeFatale?) - thanks for chiming in here (and on my blog) with relevant objective technical detail!
This is precisely the kind of background info that explains my position against EMC's opaque stance regarding this issue. True to form, EMC's bloggers are now busy shutting down comments on their related blogs just as EMC's PR people did years ago when this Centera silent data corruption issue was first exposed - then covered up by the IT media.
Unfortunately, it's the innocent EMC Centera customers and archive software partners (like Symantec) that now have to live with this Archive Russian Roulette scenario. They'll never know what data went missing forever until they try to retrieve it.
For all those who used the default EMC Centera configurations of collision detection OFF with SIS, I strongly recommend following the "Next Steps" listed on my blog -