Yah, <strike>NSA</strike>Facebook, tell us another
Yah, <strike>NSA</strike>Facebook, tell us another - we know where you got the data-mining algorithm from.
Facebook has revealed a query engine for data warehouses that blows the doors off Hive, and plans to publish it as open source this year. The "Presto" technology is a query execution engine built for Facebook's vast data warehouse. It was announced on Thursday at a data analytics conference hosted at Facebook's HQ in Menlo …
That's what first got me into app development. You can easily run an "SQL-like" query against your user ID to pull out (say) all posts, then do a simple 'grep' on the data, assuming you don't need to automate it.
Posts/links/videos/photos/comments/status updates etc. all have their own table you can query. It's such a basic function I am surprised it isn't wrapped into a user-search feature as standard.
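For anyone who hasn't tried it, the query-then-grep idea looks roughly like this. It is only a sketch: it uses a local in-memory SQLite table as a stand-in, and the table name `post` and the columns `owner_id` and `message` are made up for illustration, not Facebook's actual schema or API.

```python
import re
import sqlite3


def posts_for_user(conn, user_id):
    """Pull every post message belonging to one user (hypothetical schema)."""
    rows = conn.execute(
        "SELECT message FROM post WHERE owner_id = ? ORDER BY id",
        (user_id,),
    )
    return [row[0] for row in rows]


def grep(lines, pattern):
    """Keep only the lines matching a case-insensitive regex, like `grep -i`."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if rx.search(line)]


def demo():
    # Build a tiny stand-in table so the example runs on its own.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post (id INTEGER PRIMARY KEY, owner_id INTEGER, message TEXT)"
    )
    conn.executemany(
        "INSERT INTO post (owner_id, message) VALUES (?, ?)",
        [
            (1, "My cat learned a new trick"),
            (1, "Off to the conference"),
            (2, "Cat photos, anyone?"),
        ],
    )
    # Query by user ID first, then filter the results by keyword.
    return grep(posts_for_user(conn, 1), "cat")


if __name__ == "__main__":
    print(demo())
```

The same split applies whatever the backend is: one cheap query keyed on the user ID, then a text filter over the handful of rows that come back.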
It is about who looked at whose cat photos.
You know, like who has made a call from a phone number to another phone number that is associated with another person. For instance, imagine that the US government secretly gathered the records from phone companies about all the phone calls that were made. Sure, that's far-fetched, but hypothetically....
I would be interested to know what kind of global queries Presto can answer, and what their relevance is and to whom.
It seems to me most of the interesting queries will run on very small datasets (e.g. a single user's data or a user group's data), except perhaps for advertising data mining. Even there, cat videos and photos won't be taken into account (at least not without some major new photo/video search or indexing technology), and those should account for most of the 250PB.
Has somebody been time-tripping back to the eighties to get the “news” angle on this announcement? The only difference is Exadata instead of Teradata (and that is news about disk capacity, not software innovation).
It’s good that Facebook is going to open-source its SQL engine, because the competition will drive down the cost of real SQL engines, and after a few upgrades it might be good enough for “Tesco scale” (one RFID tag that tracks a fresh lump of meat back to the abattoir is worth a million photos of spotty teenagers).
“Facebook scale” is more like landfill garbage-dump scale: breathtaking in size, but few would cry if it all just disappeared.