Commodity servers running big CPUs with fat cores are not necessarily the best at running Hadoop. Just ask the bunch of customers who have bought Atom-smasher micro servers from SeaMicro to crunch their big-data workloads. SeaMicro has been peddling its SM10000-64 micro server, based on Intel's dual-core, 64-bit Atom N570 …
the first thing I ask when I see huge workloads is...
... stuff the hardware, are the algorithms any good? I've been amazed at the brainless approaches to solving problems taken by companies throughout my working life.
I would make a reasonable bet it could be done on less hardware. I'm prepared to back that up too.
You post anonymously so that no one can take you up on it...
if eHarmony are reading this and want to take me up then
we can get in touch one way or another. Not that tough.
to see ARM architecture enter this market.
It would require a major re-architecture by ARM; their cores are very good, but even MIPS cores outrun them.
Naturally, where power is limited ARM win, since they provide more speed when there is little power available.
I have little real-world Hadoop experience but...
...isn't this just telling us that a solution designed for a problem is better than a general purpose cloud?
My limited experience with Hadoop shows that it is very sensitive to RAM and I/O bottlenecks (both disk and network).
The off-the-shelf servers may not be as well matched in RAM (16 threads for 8GB of RAM versus 4 threads for 4GB of RAM), disk (I'm unsure how the SM10000-64 presents disk to the compute boards: is it one large pool, or multiple smaller pools?) and network I/O (my Hadoop tests easily hit 1Gbps).
i.e. how would the Xeon servers have fared if there were fewer of them and they used more RAM (say 2GB/thread), more local disk and bonded 1Gbps network adapters to reduce the bottlenecks?
Management of the SeaMicro box will probably be easier.
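For anyone who wants to sanity-check the RAM figures quoted above, here is a minimal Python sketch of the arithmetic. It uses only the thread and RAM counts from this comment. The function name `ram_per_thread_gb` and the dictionary labels are hypothetical, chosen for illustration, and nothing here is measured data from the benchmark.

```python
def ram_per_thread_gb(ram_gb: float, threads: int) -> float:
    """Return GB of RAM available per hardware thread."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    return ram_gb / threads


# The two configurations quoted in the comment: (RAM in GB, hardware threads).
configs = {
    "16 threads / 8GB": (8, 16),
    "4 threads / 4GB": (4, 4),
}

ratios = {
    name: ram_per_thread_gb(ram, threads)
    for name, (ram, threads) in configs.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} GB per thread")

# RAM a 16-thread server would need to reach the suggested 2GB/thread.
ram_needed_gb = 2 * 16
print(f"16 threads at 2GB/thread: {ram_needed_gb} GB")
```

On these numbers the 16-thread configuration has half the RAM per thread of the 4-thread one, which is the mismatch the question about 2GB/thread is getting at.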
Yeah, but can it run Crysis?
Some memes are too good to die.
The author does not know jack about Hadoop.
Sorry, not to rain on SeaMicro's parade, but the benchmark isn't necessarily an apples-to-apples comparison.
I don't know specifics about the job, but your cluster design is going to be based on what sort of job you want to run.
The article is a fail because the author doesn't know jack, and the benchmark is a bit stacked in SeaMicro's favor.