Hadoop and NoSQL are the technologies of choice among the web cognoscenti, but one developer and technical author says they are being adopted too enthusiastically by some companies when good ol' SQL approaches could work just as well. Ever since a team at Yahoo! did their turn at being Prometheus and brought Google-magic down to …
Good perspective.. but...
Cannot agree with you more. At the same time, I cannot disagree with you more either. I am a SQL fanatic and love RDBMS. But I have to agree that NoSQL brought in a plethora of database "types" like key-value stores, graph databases, document databases, and XML databases. It was a paradigm shift from modeling on data persistence to modeling on data usage. NoSQL databases have very little to do with size of storage, more to do with applications, and even more to do with talent within the organization, openness to a better programming paradigm and, of course, performance requirements.
Having said that, I totally agree that going to Hadoop for a few terabytes of data is overkill. More than the hardware, the problem is expecting your existing data analysts to start thinking in MapReduce, and ending up losing the in-house talent in the process.
And, finally, if you are definitely going big data, I would ask the data analysts, data scientists and DBAs who will code on the platform, eventually maintain it, and derive business value from it to spend some time on HPCC Systems and the ECL programming language before making a decision. I tried it and I love it, and I find it a good entry to big data with the least change to your mental make-up. I feel it augments your SQL skills instead of killing them.
Very insightful article, Jack. One other open source technology to mention is HPCC Systems from LexisNexis, a data-intensive supercomputing platform for processing and solving big data analytical problems. Their open source Machine Learning Library and Matrix processing algorithms assist data scientists and developers with business intelligence and predictive analytics. Its integration with Hadoop, R and Pentaho extends its capabilities further, providing a complete solution for data ingestion, processing and delivery. In fact, a webhdfs implementation (the web-based API provided by Hadoop) was recently released.
More at http://hpccsystems.com/h2h
Still life in the old dog yet
We've a large data mart running on SQL Server 2012 and I have to say those new xVelocity ColumnStore indexes are blisteringly fast.
Took 3-minute reports and turned them into sub-second responses.
Still got a few people who ask why we've not gone NoSQL yet. These are, of course, the same people who don't have a clue...