* Posts by akekre

1 publicly visible post • joined 21 Feb 2017

You're doing Hadoop and Spark wrong and they will probably fail

akekre

Big data needs to be operationalized

The Hadoop ecosystem, especially MapReduce and now Spark, has always had an Achilles heel: it has consistently ignored operational aspects (https://goo.gl/QGpRWe). Both TCO (total cost of ownership) and TTV (time to value) have suffered. Too many failures, compounded by the propensity to try to do everything for everybody; the SQL issues are a consequence of that.

Demanding high competency is effectively admitting failure. It has been over 10 years, and big data is still neither operationalized nor productized. The cloud helps somewhat, but the software has to hold up on its own. Big data is hard as it is; the last thing it needs is over-promising and under-delivering.

At datatorrent.com, we are very operationally focused and have baked a lot of these ideas into Apex. Do try it out, especially for ingestion and ETL.