It's the Google equivalent of the everlasting gobstopper. And for some reason, the Mountain View Chocolate Factory has encouraged a knockoff industry among its Slugworthian rivals. Considering the code of secrecy that typically envelops Google's internal operations, you have to wonder why the company helped foster the birth and …
Gimme a break...
Why the heck you twist everything and throw it back at Google in a bad way, I won't ever understand...
Good article, but ever since I saw your interview on 'Cranky Geeks' and how you hate/envy/wish/cry about Google, I have had second thoughts about you being an impartial "reporter"... I guess you prefer the way Microsoft/Apple work. Never saw you complaining that the Windows code base isn't open sourced or anything like that...
Nice article though. (for real, not joking :P)
Open the box, Google
Google should really be open-sourced throughout. They are more of an infrastructure company. Yes, someone could copy the entire stack tomorrow and load it onto a load of servers, but why should they care?
Maybe the infrastructure should be open so that anyone can run a Google server (with obvious rules about data tampering). That would not take away from the brand, and you could have your own little Google on your ADSL modem, where you set how much bandwidth it can use. They can still keep their services. Maybe anyone could open their own search engine and share the advertising revenue through the AdWords system. Just some ideas - maybe Google is listening :-)
More than a little better performance wise only
Google had benchmarks in the original papers that Hadoop MapReduce and HBase still cannot touch. Who knows if they have tuned and improved on the old figures? Almost certainly. But that is not the point. If they had released their version it might have challenged Hadoop, but since it's all written in C++ and highly optimised to run their particular jobs, I doubt it. Hadoop is in Java - a language even poor programmers like myself can handle. I have written MapReduces in Hadoop in minutes, it's so easy. Would I be able to do the same on the Google code base? I doubt it, since I haven't touched C for over a decade and didn't like it then ;-)
So overall a pretty poor article, as it's fairly obvious why they are not scared by Hadoop: they have a dominant position, partly due to the technology advantage they had (as in past tense). Everyone fighting over what's left is not going to dent their lead one iota while they are free to look for the next new technology.
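For anyone wondering what "writing a MapReduce in minutes" looks like, here is a minimal single-process sketch of the map/shuffle/reduce idea as a word count. It is plain Python, not Hadoop's Java API and certainly not Google's code; the names `map_fn`, `reduce_fn` and `run_mapreduce` are made up for illustration, and a real framework would spread these phases across many machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_fn(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce phase: sum all the counts collected for one word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Toy driver (hypothetical): map every record, group by key, then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            # Shuffle phase: values with the same key end up in the same group.
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    lines = ["open the box google", "google is listening"]
    print(run_mapreduce(lines, map_fn, reduce_fn))
```

The only job-specific parts are the two small functions; the grouping in the middle is what the framework does for you, which is probably why it feels so quick to write.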
What's that in Oompa-Loompa years?
"Trying to cleanly excise them would be a software engineering challenge that would take millions of man hours. There would be no clean way to cut it out."
Well, just one million man-hours is roughly 41,700 24-hour days, or about 208 man-years (at 200 billable days per year after holidays and overheads).
Let's hope that he's fond of wild hyperbole, because if this estimate is even within an order of magnitude of the real effort, Google's codebase is the mother of all spaghetti bowls.
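The man-hour conversion above appears to treat a day as 24 hours. A quick sketch of the arithmetic, with an 8-hour working day added for comparison (the 8-hour figure and the function name `man_years` are assumptions for illustration, not anything from the article):

```python
def man_years(hours: float, hours_per_day: float, days_per_year: float = 200):
    """Convert man-hours into (days, man-years) at the given day length."""
    days = hours / hours_per_day
    return days, days / days_per_year


if __name__ == "__main__":
    for hours_per_day in (24, 8):
        days, years = man_years(1_000_000, hours_per_day)
        print(f"{hours_per_day}h days: {days:,.0f} days, {years:,.0f} man-years")
```

With 8-hour days the same million hours comes out at three times as many man-years, so the spaghetti-bowl point only gets stronger.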
(I would imagine the truth is a combination of both, sadly)
So, students straight out of college/uni didn't have experience of multi-terabyte datasets? Duh!
Google had to send them on a course to learn about them. Isn't this traditionally known as training? Although it tends to happen on the job rather than prior to it, as appears to happen at Google. I wonder if they got paid for their time?
What to open source?
Google are hardly likely to open source their search algorithms. But they have as much interest in keeping proprietary secrets about how massively parallel processing is done as Microsoft have in how a transistor works. The advantage of open sourcing stuff that you have to do, but which is a cost centre as opposed to your cash cow, is sharing the cost with others who have to buy into the same technology area.