World+dog agrees that Hadoop is a very fine tool with which to tackle MapReduce chores, but the software has a couple of constraints, especially its reliance on the Hadoop Distributed File System (HDFS). There's nothing wrong with HDFS, but its integration with Hadoop means the software needs a dedicated cluster of computers on …
"We abstracted out an HDFS layer but underneath that it is actually talking to Lustre."
Err, Hadoop has a specific class/interface, FileSystem, designed to let anyone implement a filesystem underneath, with local being a key one. All you have to do is implement it and then pass tests like FileSystemContractBaseTest to convince yourself you got it right.
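To make the "implement the interface, then pass the contract tests" idea concrete, here is a minimal sketch of the pattern in Python. It is not Hadoop's API: `MiniFileSystem`, `InMemoryFileSystem` and `run_contract_checks` are hypothetical names invented for illustration, and the checks are a tiny stand-in for what a real contract suite would cover.

```python
from abc import ABC, abstractmethod


class MiniFileSystem(ABC):
    """Hypothetical pluggable filesystem interface (not Hadoop's FileSystem)."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def exists(self, path: str) -> bool: ...

    @abstractmethod
    def delete(self, path: str) -> bool: ...


class InMemoryFileSystem(MiniFileSystem):
    """One possible backend; a Lustre- or local-disk-backed one would slot in the same way."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        self._files[path] = bytes(data)

    def read(self, path):
        if path not in self._files:
            raise FileNotFoundError(path)
        return self._files[path]

    def exists(self, path):
        return path in self._files

    def delete(self, path):
        # True only if something was actually removed.
        return self._files.pop(path, None) is not None


def run_contract_checks(fs: MiniFileSystem) -> list:
    """Run backend-agnostic checks; return the names of any that failed."""
    failures = []
    fs.write("/a", b"hello")
    if fs.read("/a") != b"hello":
        failures.append("read_after_write")
    fs.write("/a", b"bye")
    if fs.read("/a") != b"bye":
        failures.append("overwrite")
    if not fs.delete("/a") or fs.exists("/a"):
        failures.append("delete")
    if fs.delete("/a"):
        failures.append("delete_missing_returns_false")
    try:
        fs.read("/a")
        failures.append("read_missing_raises")
    except FileNotFoundError:
        pass
    return failures


if __name__ == "__main__":
    print(run_contract_checks(InMemoryFileSystem()))
```

The point of the design is that the checks only talk to the abstract interface, so the same suite can be pointed at any backend a vendor swaps in; an empty list of failures means that backend behaves as callers expect.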
While Intel make it sound like they did some heavy engineering ("we abstracted out an HDFS layer"), what they probably mean is they took the ASF-supported LocalFileSystem class and tweaked it to get locality information out of Lustre, then ran some tests (how many?) to show it worked. Hearing them talk about the tests would be interesting. Put that question to them (or to any other "we swapped HDFS for -" vendor), as only EMC/Pivotal have owned up to testing on a 1000+ node cluster.