The blessing and the curse of Big Data

Companies more familiar with technology are more likely to use the reporting and analytics features of their software. This isn't something new, and it didn't start with computers. Computers make reporting and analytics easier, but every business needs hard data if it is to grow. Back in the day, "reporting and analytics" …

  1. F0ul

    Loved the article - but ..

    The article really highlights the real problem with enterprise IT systems: people and their lack of knowledge.

    They know what sort of tablet they want, and what sort of software dashboard they want, but have no interest in how the system is developed, or even what its capabilities are out of the box - they just want it to work when they use it. The problem is, nobody is willing to carry the can for ensuring that everyone else knows - especially beyond their department.

    The outcome is that nobody is aware of how the data fit together. Each process is regulated, as is the hardware, as is the software, but software configuration is not - so it slowly strangles the company.

    Too much of a good thing, is a bad thing!

  2. sarnath

    Simply loved the article for giving a perspective on history.

    However I feel that software system integration is different from big data.

    I think web services solve this problem, enabling, say, a shopping website to talk to your bank and so on. If those can communicate, so can disparate business systems.

    But I don't see how they fit under big data..

    All said, this is a brilliant article.

    I created an account with The Register (normally I don't) only for writing this comment.

    Keep up the good work.

    1. Just Enough

      Poor Example

      Have to agree. The example used is a basic systems integration story that could have been written 20 years ago. That ain't big data.

      But I can't find fault with the overall message of the article.

  3. bigbob

    Erm, this is "data", not "big data". Yes, I know fewer people will click on this article if you rename it "what's the point of systems integration", but maybe the simple truth about El Reg hurts.

  4. Trevor_Pott

    "Data" versus "Big Data"

    The difference between "data" and "Big Data" is not cut and dried. Most agree it's a matter of scale, but where the line begins and ends varies wildly. This is, I think, a key point.

    What seems like merely "data" to one organization that is good at systems integration can seem like "Big Data" to another. And a Big Data dataset, once tamed and understood, can become "data" within a single refresh cycle. (Especially if you can bring GPUs and NVMe SSDs to bear on the problem!)

    The issues that plague Big Data are identical to the issues that plague traditional systems integration and "data": you need to know what you want from your data before you go forth and create systems to achieve it. Merely collecting all data points under the sun is worthless. You need a goal in mind. You do not simply store bits and bytes on Hadoop and *poof*, your company is magically saving money.

    Information captured without purpose provides no benefit. That holds regardless of the size or scale of the data in question, and the purpose of that data is just as often automation as analytics. Indeed, anyone who thinks Big Data stops at providing the raw resource for human-readable analytics has failed to learn from history! Once we've managed to turn large quantities of unstructured data into something a human can understand from an analytics standpoint, we can then start acting on that information in an automated fashion.

    Big Data inevitably becomes "just data". No matter the size of the dataset, it inevitably drives automation.

    Now, I'm happy to argue the point with anyone willing to put forth an exacting definition of the difference between "data" and "Big Data" that doesn't rely on the underlying technologies used. (Just because you use Hadoop doesn't mean it's Big Data, etc.) And that definition should be one you're willing to put your real names to, and one most practitioners in the field would get behind. Oh, and make the definition one that will forever separate Big Data from data...even as the march of technology moves on and terabyte or even petabyte datasets become commonplace and easy to plow through.

    Lacking such a concrete definition I'm going back to my original one: the difference between "data" and "Big Data" is in the eye of the beholder, and the questions about how to use both categories of data to benefit a business are usually the same.

    1. bigbob

      Re: "Data" versus "Big Data"

      I disagree - the spirit of big data is not really about scale, despite what the name implies, and despite the big-corporate-influenced bilge on the Wikipedia page and at boring-but-rebranded systems conferences. A huge organization's accounts over a century might take a few terabytes but it ain't Big Data. We've had enormous environmental and weather data for decades and that wasn't called "Big Data", and should not be part of this new category because it simply brings no innovation or anything interesting for anyone outside the Met Office to discuss.

      For me, this is the Big Data that has captured everyone's imagination:

      * collecting pervasive data. It's not about the mission-critical data that has been traditionally collected for decades, such as stock, sales and employees. It's about collecting data on every customer click, every employee footstep, the temperature in every room in the country, etc.

      * far-reaching linking of datasets, e.g. traffic accidents linked with personality data, or google search terms scanned to discover flu outbreaks

      * the analysis is likely to involve recent strides made in machine learning, more than 3 dimensions, natural language processing, etc.

      Cynically slapping the words 'Big Data' onto an article about 1980s SI is an insult to the really exciting work going on in Big Data in the past few years.

      1. Trevor_Pott

        Re: "Data" versus "Big Data"

        I accept that the narrow definition you've espoused is what you, personally, consider Big Data. That said, there are thousands of fairly influential people in our industry who disagree with you.

        You are essentially arguing semantics about a marketing term that was long ago coopted. It's like trying to say "cloud" means "X and exactly X". That's bunkum. Cloud - like Big Data, Software Defined Storage or any number of other marketing terms - means essentially nothing. Like it or not, "Big Data" has become a catch-all term that encompasses everything from analytics to automation to novel data mining.

        I recognize that you have an emotional attachment to a specific definition of Big Data which has, as you put it, "captured your imagination". But I must humbly submit that what you are talking about isn't Big Data. It is data science.

        Data science is a discipline related to but not limited to some aspects of Big Data. Similarly, not all things which fall under the moniker Big Data are relevant to data science. The buzzwords have evolved. The marketing people took over Big Data ages ago.

        And no, you can't fight it. You can't be a definition hipster. You can't single-handedly change what everyone is going to mean when they use a term. The tide of marketing in tech is simply too powerful. It will defeat your preferred definition of Big Data as surely as it defeated me with cloud.

        So use a new term. Until that one gets coopted. Then choose a new term. And another. And another, and another, and another.

        Welcome to the terminology rat race. Life sucks and then some fish eat you.

    2. Just Enough

      Re: "Data" versus "Big Data"

      "Information captured without purpose provides no benefit."

      Not true. There are plenty of historical examples of information being captured for no real purpose at the time, but which later proved to be very useful.

      Case in point: a pensioner who recorded for years every time he cut his lawn. A pointless collection of data that now has some value. http://news.bbc.co.uk/1/hi/scotland/4210978.stm

  5. Robert D Bank

    WTF would I know...

    'Big' data seems to be about the aggregation of vast and disparate data sources to facilitate improved prediction of behaviours and calculation of probabilities by revealing otherwise unrecognisable relationships and patterns.

    The scale and scope of data has increased exponentially in the last two decades. Not just data being currently produced but also historical data being digitised and brought into the equation. Software and hardware advances now make processing it a more practical proposition.

    One issue is the accuracy of the data, which may vary across different sources, although you'd hope and expect this would be 'cleaned' and verified in the datasets over time. But if an inaccurate dataset is used, for example, in real-time decision engines for credit scoring or anti-fraud purposes, the consequences could be devastating.
