It sounds like big data proponents need to attend an introduction to statistics course.
Statistics are slippery beasts and need to be handled with extreme care, otherwise you risk getting completely the wrong answer.
Big Data, the shiny happy story goes, will let governments direct resources into programs that really do make a difference to the problems society faces, resulting in better services, less waste and grins on every face. But Australia's Bureau of Statistics (ABS) has just published a Research Paper A Statistical Framework for …
Sometimes data is needed first in order to understand what is possible. Why limit ourselves to the imagination of the chief architect?
Once the data is available it can be viewed from multiple statistical viewpoints for various stakeholders, who, from experience, will only "see the light" once it's in front of them.
Correlation may not equal causation, but quite a few times it does, and with some peer review the assumptions can be vetted.
No data will give you no information.
Once you do get data, for example room temperatures, you can start asking questions like "what is the average temperature of this room?", "why do I get a spike at 6 am every week day?" and most importantly "what happens if I change this?".
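To make the room-temperature example concrete, here is a minimal sketch of the first two questions. Everything in it is assumed for illustration: the readings are synthetic (a flat 20.0 °C with a 3.0 °C bump at 06:00 on weekdays), and the function names and the 1.0 °C threshold are made up. It is not a real sensor feed or anyone's actual method.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def make_readings(days=14):
    """Hypothetical hourly readings: 20.0 C baseline, +3.0 C at 06:00 on weekdays."""
    start = datetime(2024, 1, 1)  # a Monday
    readings = []
    for h in range(days * 24):
        ts = start + timedelta(hours=h)
        temp = 20.0
        if ts.hour == 6 and ts.weekday() < 5:  # Monday=0 .. Friday=4
            temp += 3.0
        readings.append((ts, temp))
    return readings


def average_temperature(readings):
    """'What is the average temperature of this room?'"""
    return mean(temp for _, temp in readings)


def weekday_mean_by_hour(readings):
    """Mean temperature per hour of day, weekdays only."""
    buckets = defaultdict(list)
    for ts, temp in readings:
        if ts.weekday() < 5:
            buckets[ts.hour].append(temp)
    return {hour: mean(temps) for hour, temps in buckets.items()}


def spike_hours(readings, threshold=1.0):
    """Hours whose weekday mean exceeds the overall mean by more than `threshold` (assumed cut-off)."""
    overall = average_temperature(readings)
    by_hour = weekday_mean_by_hour(readings)
    return sorted(h for h, m in by_hour.items() if m - overall > threshold)


readings = make_readings()
print(round(average_temperature(readings), 2))
print(spike_hours(readings))
```

This only tells you that there is a spike and when. The "why" and the "what happens if I change this?" still need someone to go and look at the heating timer, which is rather the point.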
The question is not how accurate big data/data science could be (because surely the use of it will include errors); it is about the improvement over existing decision making processes. Because let's face facts, most politicians who make decisions have no evidence base for making them (which is kind of OK as they also don't have the analytical understanding to use the evidence base if it existed).
So do you want to try and answer 'business questions' - which exist aplenty in Government - with data, big or otherwise, or do you want to answer them with opinion, voodoo, or whatever other technique happens to spring to mind?
The client wants to just dump everything into a cluster and for insights and improvements to the government to come out the end, magically.
This document is going to be really useful to help educate the client on the cost effectiveness of data insights, but also to form a practical guide for the project & training handover so it actually can try to deliver the magic. I've already begun to incorporate the idea into the overall project structure and added a few extra bits.
I'll know in a couple of months if it pays off and there are any awards to be won.
Anon for fear of big data failure.
Bloody unlikely. The tricky bit is ensuring the politicians ask unbiased questions.
Take a look at the unemployment figures and methods of definition from the 1980s to present. Each political party has narrowed the definition of what is an unemployed person in the UK. In 1980 the definition was anyone not in full time employ, a housewife or a school aged child. Now the definition has so many exclusions that you could be unemployed but classed as a job seeking employee of the DSS.
The problem with big data is it shows what happens naturally, and the people who want to use big data do not want those outcomes. They want other, artificial outcomes, and they think that by distorting the patterns shown by big data they achieve the outcomes they want: "UNLIMITED POWER".
All they create, of course, is distortion and chaos. Now if that is their goal, I suppose fine, but to achieve chaos you certainly do not need big data; in fact chaos thrives on no data just as readily as on corrupted big data.
Having been to three GovHack events now where the ABS data has been almost entirely inaccessible, I think the basic problem is that the ABS think they own the data and letting other people get their hands on it gives them the screaming heebie-jeebies.
Their "data portal" allows you to query to your heart's content, but then delivers a "CSV file" with 3 rows of consolidated header fields, with commas used as a thousands separator in all the number values. Utterly impossible to use for anything outside of entry-level spreadsheet stuff.
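For anyone stuck with a file like that, here is a rough sketch of one way to beat it into shape using only the Python standard library. The sample below is made up to match the description (three stacked header rows, comma thousands separators); it is not real ABS output, and it assumes the numeric cells are quoted, since otherwise the extra commas could not be told apart from field separators. The function names are hypothetical.

```python
import csv
import io

# Made-up sample in the shape described above; the figures are not real.
SAMPLE = """\
Region,Population,Population,Households
,Persons,Persons,Dwellings
,2011,2016,2016
"Region A","1,234,567","1,300,000","450,000"
"Region B","98,765","101,010","40,400"
"""


def merge_headers(header_rows):
    """Collapse stacked header rows into one name per column, e.g. Population_Persons_2011."""
    merged = []
    for parts in zip(*header_rows):
        merged.append("_".join(p.strip() for p in parts if p.strip()))
    return merged


def parse_number(value):
    """Strip thousands separators; leave the cell alone if it is not an integer."""
    cleaned = value.replace(",", "").strip()
    try:
        return int(cleaned)
    except ValueError:
        return value


def load_records(text, n_header_rows=3):
    """Return a list of dicts keyed by the merged header names."""
    rows = list(csv.reader(io.StringIO(text)))
    headers = merge_headers(rows[:n_header_rows])
    return [
        dict(zip(headers, (parse_number(cell) for cell in row)))
        for row in rows[n_header_rows:]
        if row
    ]


records = load_records(SAMPLE)
for record in records:
    print(record)
```

It works, but nobody consuming a "data portal" should have to write this; a single header row and plain numbers would make the whole thing unnecessary.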
I would understand, except this is 3 f****g years they've had to get it right. Other departments, with ostensibly less experience about their data, have managed to get this right in one year after having the problem explained to them in short sentences.
Doesn't matter what the bureau of statistics says. The CIO needs big data so he/she can be part of the conversation at the next conference where they can discuss their visions.
As we all know, Information Technology is not just about business processes and communication anymore, and these days IT needs to "partner with the business" to add value. While CIOs at high tech companies or election campaigns have some demonstrable success with 'big data', the CIO at your local cornershop wants to emulate that success.
This requires a high level of 'agility' to jump onto the technological bandwagon - and quickly jump off and distance oneself if it turns out to be a dud.
Deriving information and knowledge (or business intelligence) from data has always been done. Big Data will help you to drive innovation and find solutions to tomorrow's problems... Genius Sales.
On the upside, as storage capacity grows, big data will just become data again. And as CPU power increases you won't even need Watson anymore. Every idiot will be able to gain some sort of insight from any data source he/she can get his/her hands on.
In my personal opinion "Big Data" is just a stupid ambiguous term, so IT vendors and customers can sit at the same table, have a conversation, but talk about completely different things without even knowing it.