Facebook has revealed a query engine for data warehouses that blows the doors off Hive, and plans to publish it as open source this year. The "Presto" technology is a query execution engine built for Facebook's vast data warehouse. It was announced on Thursday at a data analytics conference hosted at Facebook's HQ in Menlo Park …
Yah, <strike>NSA</strike>Facebook, tell us another
<strike>NSA</strike>Facebook, tell us another - we know where you got the data-mining algorithm from.
Technologically very cool
But please don't try to make "FB scale" some sort of metric.
That's one immense database
And I wouldn't like to be managing it on the day something goes a bit pear shaped
Re: That's one immense database
Could not read disk. Abort, Retry, Fail?
To the tape library!
Maybe they can offer up some of this superscale search to its users?
I know an FB friend posted something about a poet named Robert a while back. Can I find the post in question? Can I buggery.
Agreed, I'd love to have a way to
victim girlfriend for her routine lifestyle and vulnerabilities things we'd enjoy together.
The urban camo trenchcoat with the chloroform in the pocket, ta.
re: "post searching"
That's what first got me into apps development. You can easily make an "SQL-like" query to your user-id to pull out (say) all posts. Then a simple 'grep' on the data, assuming you don't need to automate it.
Posts, links, videos, photos, comments, status updates, etc. all have their own table you can query. It's such a basic function I am surprised it isn't wrapped into a user-search feature as standard.
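For anyone wanting the 'grep' half of that, here is a minimal sketch in Python. It assumes you have already pulled your posts out by whatever query you use and hold them as a list of dicts; the `id` and `message` field names and the sample rows are hypothetical stand-ins, not a real Facebook schema.

```python
import re


def grep_posts(posts, pattern):
    """Return the posts whose 'message' text matches the regex, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    # Some rows may have no message text at all, so fall back to an empty string.
    return [p for p in posts if rx.search(p.get("message") or "")]


# Hypothetical sample standing in for the rows a query would return.
posts = [
    {"id": "1", "message": "Reading Robert Frost again tonight"},
    {"id": "2", "message": "Cat photo, obviously"},
    {"id": "3", "message": None},
]

matches = grep_posts(posts, r"\bpoet\b|\brobert\b")
for p in matches:
    print(p["id"], p["message"])
```

A regex rather than a plain substring test keeps it close to what 'grep' does, and the word boundaries stop "Robert" matching inside longer words.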
It's not about the cat photos...
It is about who looked at whose cat photos.
You know, like who has made a call from a phone number to another phone number that is associated with another person. For instance, imagine that the US government secretly gathered the records from phone companies about all the phone calls that were made. Sure, that's far-fetched, but hypothetically....
So the famous Opera connection...
...was just Facebook buying the "Presto" NAME?
I would be interested to know what kind of global queries Presto can answer, and what their relevance is and to whom.
It seems to me most of the interesting queries will be on very small datasets (e.g. a user's dataset or a user's group dataset), except perhaps for advertising data mining. Even there, cat videos and photos won't be taken into account (barring some major new photo/video search or indexing technology), which should take care of most of the 250PB.
Back to the Future?
Has somebody been time-tripping back to the eighties to get the “news” angle on this announcement? The only difference is Exadata instead of Teradata (and that is news about disk capacity, not software innovation).
It’s good that Facebook is going to open-source its SQL engine, because the competition will drive down the cost of real SQL engines, and after a few upgrades it might be good enough for “Tesco Scale” (one RFID tag that tracks a fresh lump of meat back to the abattoir is worth a million photos of spotty teenagers).
“Facebook scale” is more like landfill garbage dump scale... breathtaking in scale, but few would cry if it just disappeared.
So much data, so little worth
What a titanic waste of resources.