Re: Does it make sense to use shared storage with Hadoop?
As Hadoop gains traction, we've noticed that data growth in HDFS often outpaces growth in compute requirements. Scaling out compute and disk as a single unit can then leave CPU cycles stranded, and thus wasted. Many organizations are now looking for ways to scale compute and storage independently, often with very dense compute infrastructures, without sacrificing the simplicity, performance, or cost profile of local disk. Shared storage can also shorten recovery times after a compute failure and reduce replication overhead.
With Ethernet SAN, we can provide independent scaling and data protection without sacrificing simplicity or performance. We use commodity hardware, a parallel scale-out design, and a simple interface to deliver the simplicity and performance of direct-attached storage (DAS) along with the benefits of shared disk. As Hadoop adoption in the enterprise increases and data ingest volumes grow, I expect the push for this kind of "virtual DAS" model to grow as well.
-John Gilmartin, Coraid