HP used its Discover event in Vienna to both broaden and deepen its core storage portfolio, strengthening its file and deduplication offerings to compete better with EMC and NetApp. The headlines are: High-availability B6200 StoreOnce Backup System deduplicating backup to disk system for enterprises; Data Protector backup …
I've been testing one of the smaller StoreOnce boxes for the last few months and it's a really excellent piece of hardware/software. They do well to hype the dedupe ratios. Obviously it depends on which backup package you're using, the specific data and the configuration, but I have seen it dedupe data at about 2:1 from a backup package's dedupe disk store. Pretty impressive, and very nice to see the hardware go up from SME to proper enterprise size. It also has pretty good replication, which is essential for enterprise.
NB: I don't work for HP or any reseller.
This represents a real step forward for HP in storage, after years of lagging EMC. A few observations:
1) From an external perspective, this is still a disjointed portfolio strategy (3PAR, Ibrix, StoreOnce, Sepaton etc).
2) Doesn't seem very SMB friendly with starting prices of $250K and $30K+
3) Officially, NetApp has been left on the side of the road on dedupe. Dedupe from Avamar+Data Domain has been a 50%+ y/y growth business, with BRS approaching or exceeding $2B at EMC. I did notice a NetApp partnership with Arkeia Software today, so maybe they plan to get into dedupe after all.
1. You say disjointed, but in fairness most vendors have a separate block storage offering, file storage offering and backup appliance. Yes, EMC and NetApp have 'Unified' offerings (to a greater or lesser extent) but they always come with compromises. There are Ibrix gateway products, for example, that can sit in front of 3PAR arrays to provide file access. And there is convergence in terms of the underlying hardware: P4000, X9000, X5000 and StoreOnce all share fundamentally the same hardware. It's not a huge leap to get to a stage where a device looking something like a B6200 could be carved up, and re-carved up, to be an iSCSI SAN, a filer, a backup device, etc, so you have a 'lump of storage' that you give whatever combination of 'personalities' fits your requirements at that time.
2. The B6200 starts at $250k but there are existing models for the StoreOnce family that remain in place that start from a couple of $k. The B6200 is a new product sitting at the top of the existing family.
3. NetApp's de-dupe has never been that elegant. If you really had the crown jewels, would you give them away? And despite getting it for free, most customers still aren't using it.
"I did notice a NetApp partnership with Arkeia Software today, so maybe they plan to get into dedupe after all."
Huh? That's odd, NetApp pretty much brought dedupe to the world's attention.
But I agree with you about the disjointedness of HP Storage.
Whose storage portfolio isn't disjointed to some extent? Everyone has a range of products to fit a range of requirements. If anything, HP's is less disjointed, as at least the vast majority of it shares a common hardware platform. Even NetApp, who for all their flaws did have a simple portfolio, have complicated things through their acquisition of LSI. And if NetApp were really the trailblazers in terms of de-dupe, why were they so keen to get their hands on Data Domain? NetApp has single instancing ... which you get with Windows Storage Server as well. It's really not that clever!
I could never describe NetApp's dedupe as 'elegant' because it has too many limitations. I tried it on a 3TB VMware backup area and it was fine. Grew the area to 4TB and SIS just said 'Too big, I'm sulking'.
end of life for the mechanical drive
Ending the life of the mechanical hard drive, once the technology and the money are available, would be one of the biggest achievements.
It is the Achilles heel of computing, being the slowest part of the equation.
With the economy floundering, most companies are unfortunately not investing much in R&D.
The sustained I/O of solid-state memory is unparalleled in performance and speed. It will eventually revolutionize IT like a new awakening, even though right now it feels as if mechanical drives will never meet their end. Like dial-up Internet service, it is time for them to be eliminated.
NetApp, HP, etc. miss the root of the problem
NetApp SIS is good for backup, where all the duplicate data resides, as is dedupe technology in general. However, if you are honest, you need to ask the question: where does all this data that I need to back up come from? Right, it is primary storage, and there it is mostly unstructured data, the fastest-growing segment of all. SIS and other dedupe technologies cannot effectively handle unstructured data; dedupe ratios on unstructured data sets are poor. Why?
a) unstructured data represents pre-compressed content
b) unstructured data is similar, but not identical
c) unstructured data is highly active, moves a lot
The root of the problem is the unstructured data which accumulates on primary storage. You need to tackle this problem as early as file creation, and you will see that these primary storage reductions result in lower backup requirements as well...
I wonder why companies like HP and NetApp do not realize that dedupe is just treating the symptoms, in an effort to sell ever more expensive, bigger machines to customers, instead of offering a true, working solution...
Chris Schmid, COO balesio AG
"true, working solution" ...
... tends to be what the customer has already.
They're just short of storage. Hence trying to get more storage for less-more money ...
I'm in full agreement with you that there's shortsightedness in this, and that an a priori approach to application/workload design which structures data and avoids "copy&paste-referencing/subclassing" can easily bring down storage/bandwidth needs by orders of magnitude.
Unfortunately, many software stacks are "working" but are old and rigid; retrofitting a profound architectural change such as this into existing software is, not always but very often, either so daunting or so expensive as to be prohibitive.
Structured data, in that sense, does not necessarily use less storage or dedupe better. XML is a curse, really: copy and paste an XML file into another, shifting it around by a few bytes in the process, and the dedup potential is gone. The usually identical console logs from a server bootup are preceded by unique timestamps/hostnames and, again, the dedup potential evaporates. Just as examples.
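To make the "shifted by a few bytes" point concrete, here is a minimal sketch that assumes a dedupe engine hashing fixed-size 4 KiB blocks. That is an assumption for illustration only, not a description of how StoreOnce, NetApp or any other product works, and the names `block_hashes` and `shared_blocks` are made up for this example. The stand-in "file" is just deterministic, non-repeating bytes.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(stored: bytes, incoming: bytes) -> int:
    """Count blocks of `incoming` that a store already holding `stored` could skip."""
    known = set(block_hashes(stored))
    return sum(1 for h in block_hashes(incoming) if h in known)


# 64 KiB of deterministic, non-repeating stand-in data (a hypothetical "XML file").
original = b"".join(
    hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(2048)
)

exact_copy = bytes(original)
shifted_copy = b"  " + original  # same content, pushed along by two bytes

print(shared_blocks(original, exact_copy))    # 16: every block is already stored
print(shared_blocks(original, shifted_copy))  # 0: block boundaries no longer line up
```

Variable-length (content-defined) chunking exists precisely to soften this effect, so how badly a given box suffers depends on how it chunks; the sketch only shows the worst case for fixed blocks.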
These problems notwithstanding, storage that compresses and/or deduplicates (if only the twenty copies of the renamed CEO powerpoint memo which got stored into the DMS by twenty different departments) provides savings, and therefore has its place.
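The renamed-memo case can be sketched as a toy file-level single-instance store. Everything here is hypothetical (the `SingleInstanceStore` class, the file names, the 9,000-byte memo); it only illustrates the idea of keying content by its hash, not any vendor's implementation.

```python
import hashlib


class SingleInstanceStore:
    """Toy file-level store: identical content is kept once, whatever it is named."""

    def __init__(self):
        self.blobs = {}  # content digest -> file content
        self.names = {}  # file name -> content digest

    def put(self, name: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # keep the bytes only if new
        self.names[name] = digest

    def get(self, name: str) -> bytes:
        return self.blobs[self.names[name]]

    def logical_bytes(self) -> int:
        """Bytes the users believe they have stored."""
        return sum(len(self.blobs[d]) for d in self.names.values())

    def physical_bytes(self) -> int:
        """Bytes actually kept after single instancing."""
        return sum(len(b) for b in self.blobs.values())


memo = b"CEO memo " * 1000  # hypothetical 9,000-byte presentation
store = SingleInstanceStore()
for dept in range(20):
    store.put(f"dept{dept:02d}/memo_final_v{dept}.ppt", memo)

# One department edits its copy: the whole-file hash no longer matches.
store.put("dept00/memo_final_v0.ppt", memo + b"!")

print(store.logical_bytes(), store.physical_bytes())  # 180001 18001
```

Twenty renamed copies cost the space of one, but a single changed byte makes the edited file a completely new object at this granularity, which is why whole-file single instancing only catches exact duplicates.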
These savings are not as great as the ones realizable from a "context switch", but they are very tangible and achievable at significantly less risk. It's like treating a cold with lots of camomile tea instead of a $1000/dose, not-yet-FDA-certified breakthrough antiviral medication with as-yet-unknown side effects. Treat symptoms, not cause. One of those cases of "good enough"?
I never thought of data in this aspect but I think you hit the nail on the head.
It is putting a band-aid on a bigger problem.
I think data manipulation is the hardest of all tasks for a system admin.