Does it make sense to use shared storage with Hadoop?

This topic was created by Federica Monsone.

COMMENTS

This topic is closed for new posts.
  1. Federica Monsone

    Does it make sense to use shared storage with Hadoop?

    Question for anyone deploying Hadoop:

    The typical HDFS deployment uses local drives inside the compute nodes. However, as more data is stored on the cluster, are there any advantages to shared storage? When, if ever, does it make sense to use shared storage with Hadoop?

    1. @jgilmart

      Re: Does it make sense to use shared storage with Hadoop?

      As Hadoop gains traction, we've noticed that data growth in HDFS often outpaces the growth in compute requirements. Scaling out compute and disk as a single unit can then leave CPU cycles stranded (and thus wasted). Many organizations are now looking for ways to scale compute and storage independently, often with very dense compute infrastructures, without sacrificing the simplicity, performance, or cost profile of local disk. Shared storage can also shorten recovery times after compute failures and reduce replication overhead.
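
      The stranded-compute argument above can be made concrete with a little arithmetic. The sketch below is only illustrative: the dataset size, per-node capacity, and core counts are hypothetical numbers, not figures from any real cluster, and `coupled_cluster` is a made-up helper name. The replication factor of 3 is the HDFS default.

```python
import math

HDFS_REPLICATION = 3  # HDFS default replication factor


def coupled_cluster(logical_tb, cores_needed, tb_per_node, cores_per_node):
    """Size a cluster where every node adds disk and CPU together.

    Returns (nodes_required, stranded_cores). All inputs are hypothetical.
    """
    raw_tb = logical_tb * HDFS_REPLICATION
    nodes_for_storage = math.ceil(raw_tb / tb_per_node)
    nodes_for_compute = math.ceil(cores_needed / cores_per_node)
    # With coupled scaling, the larger requirement dictates the node count.
    nodes = max(nodes_for_storage, nodes_for_compute)
    stranded_cores = nodes * cores_per_node - cores_needed
    return nodes, stranded_cores


# Hypothetical workload: 500 TB of data, but only 400 cores of real demand.
nodes, stranded = coupled_cluster(
    logical_tb=500, cores_needed=400, tb_per_node=24, cores_per_node=16
)
print(f"nodes required: {nodes}, cores bought but not needed: {stranded}")
```

      Under these assumed numbers, storage alone drives the node count, so most of the purchased cores sit idle. Scaling storage separately would let the compute tier be sized to the 400 cores actually needed.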

      With Ethernet SAN, we can provide independent scaling and data protection without sacrificing simplicity or performance. We use commodity hardware, a parallel scale-out design, and a simple interface to deliver the simplicity and performance of DAS along with the benefits of shared disk. As Hadoop is adopted more widely in the enterprise and data ingest volumes grow, I expect the push for this kind of "virtual DAS" model to grow.

      -John Gilmartin, Coraid
