How do you mistakenly spend three years collecting personal data from the world's open WiFi networks? We're not quite sure. We can only hope that when Google asks an outsider to scrutinize the WiFi-snooping habits of its Street View cars, the results are released to the public.
"The engineering team at Google works hard to earn your trust - and we are acutely aware that we failed badly here. We are profoundly sorry for this error and are determined to learn all the lessons we can from our mistake"
Are Google turning into Facebook? Their apologies for this sort of thing are starting to sound similar.
Can someone explain to me, as a person with a 40G hard one with only 6G used and the opportunity to store a whole shedload of data on it, how 600GB of data is in some way 'not a lot'.
"Certainly, the SSIDs and MAC addresses were eventually uploaded to the company's servers, and if the payload data was uploaded as well, this would increase the likelihood that someone knew it was there."
FFS!!!1109876 Or maybe I am overreacting.... but the cars were GPS enabled and taking snapshots of people's houses as they hosed up this lot of data.
"Hi Tim, yes would love to meet you for some serious rubber sex. My place. Tuesday afternoon OK? Love James".
GoogleCameraClick- 58 Spong Road, BlartelyVille.
That's about 130 bytes uncompressed...
New Alien World Order
Privacy? A thing in the Past which is now long gone in the Present. Get over it and move on, please, for the Future is better in Betas with no dirty hidden secrets.
Please, do tell amfM ...
"the Future is better in Betas with no dirty hidden secrets."
How's the article on AI coming? Will we get full disclosure? Or are there dirty hidden secrets? ;-)
Who ... cares?
The real question is "who cares?" This is a totally over-hyped story. Even though it sounds like a lot of data, it was collected while the vehicles were moving. Even sitting at a stoplight the most it would be was a few minutes connected to each hotspot. So that means it's howevermany GBs of tiny little fragments of data, not a single stream from one source. Maybe enough to figure out 1 or 2 people's email addresses, but certainly nothing else. Google knows more about you from their ads showing up on every web site than they could ever get from this data.
Oh, by the way, it was only on *public* wifi networks. Even the weak WEP encryption would have thwarted this "attack". Anyone on these networks has no right to get upset about it.
Re: Who ... cares?
Google's business is advertising and marketing.
If a "market researcher" came to your door, said that they had taken a few photos of your house and noted its location and then asked if you would mind if they snooped on any WLAN APs they could see, would you mind?
And even if your WLAN data is encrypted (WEP/WPA/WPA2), unless you use an EAP-based authentication scheme, decrypting the information based on rainbow tables would be trivial for a company with the compute resources of Google.
Still, it's Google so everything is OK.
... in your world, not only would the insurance company fail to pay out because the burglary victim had inadvertently left his window open, but the burglar would walk free from court as well.
The first is quite reasonable. The second is dangerously stupid.
It's principle and precedent, stupid
Erm... whilst I would agree that the likelihood of the data collected being useful may appear to be small, it is the principle and the precedent that are important here.
Taking a small step back, what gives Google the right to drive around my neighbourhood and record the approximate location, MAC address and other information it can extract from my router? Or my neighbour's router? Or everyone's in my town? Or indeed the whole country?
Moving back to the data issue - I didn't give Google permission to intercept my internet traffic. Neither has anybody else who had data collected in this trawl. If nothing is done then Google might decide they have implied consent to continue collecting this type of data. How useful might it be to fit a data collection black box to every Google car, so that it automatically does this data grab as it is driven round on completely unrelated business?
Just because my data is out there, does not mean I want it collected and analysed. To do so in this country would probably contravene 'wiretap' related laws anyway (see: PHORM). The accidental collection of the odd bit of data here and there IS no big deal. The systematic collection of data across several countries certainly is a big deal.
"Can someone explain to me, as a person with a 40G hard one with only 6G used and the opportunity to store a whole shedload of data on it, how 600GB of data is in some way 'not a lot'."
Are you serious? Because your personal computer has 6GB of data on it, you can't possibly imagine that Google has more than that? Just think of the sheer number of miles those cars cover- it all adds up.
I almost didn't want to read this article. Not because I'm not interested in the topic- I am- but because El Reg has developed such a tiring hatred for Google ("Chocolate Factory" stopped being funny about twelve months ago) that I knew once a story of a *legitimate* beef with Google came up, the report would be written in a deeply smug "told you so" kind of way. I was not wrong.
you have missed the point
600G of video or even hi res photos from 30 countries is not a lot at all.
600G of compressed text is a monumental amount of data. It doesn't take much to capture packets, GPS location. Is it only 600G *after* uploading to googleHQ - i.e. after the chaff has been filtered out and only the juicy bits remain?
I for one welcome our new GPS enabled, data snooping, photo grabbing overlords. Oh, wait a min, that's already here in the UK.
Here's the truth
Google now have enough SSIDs with geo-location information in their database, that when you go to one of their websites in a browser that supports geo-location (which will be almost all of them soon), the list of nearby Wi-Fi networks your machine sends along to them will allow them to calculate exactly where you are, and add YOU to the database too!
Five minutes later, this information will be available to buy/licence/use by other companies. In case you hadn't noticed, Google are a business, which puts them primarily in the business of making money. This kind of data is worth a fortune! Imagine being able to serve someone an advert for a shop, quite literally, around their own corner rather than just in their country or city.
It's all out of the bottle now, and even if Google are seen to be deleting the raw data, don't think that means their databases no longer have the end result. As this article says, think about what they're actually saying, not what they expect you to infer.
Err... I don't think you realise that is pretty much how it does work and is supposed to work. I'm not sure what you are getting at.
If you go to a website, either Google or any other, that supports geolocation services and you are using one of the modern browsers then it will ask you if it can query your location. Your location will be determined by the Wi-Fi networks around you (in most cases, but could use GPS, IP address, mobile cells) from one of the geolocating service providers that have collected that information (Google, Skyhook Wireless etc).
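As a rough illustration of the lookup described above, here is a minimal sketch in Python. It assumes a small survey table mapping access point MAC addresses (BSSIDs) to coordinates and estimates a position as a signal-weighted average of the access points a device can see. The table contents, the `estimate_position` name and the weighting scheme are all assumptions made for illustration; the actual methods used by Google, Skyhook or any other provider are not described in this thread.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical survey data: access point BSSID -> (latitude, longitude).
KNOWN_APS: Dict[str, Tuple[float, float]] = {
    "00:11:22:33:44:55": (51.5000, -0.1000),
    "66:77:88:99:aa:bb": (51.5010, -0.1020),
}


def estimate_position(
    scan: Iterable[Tuple[str, float]],
    known: Dict[str, Tuple[float, float]] = KNOWN_APS,
) -> Optional[Tuple[float, float]]:
    """Estimate (lat, lon) from a scan of (bssid, rssi_dbm) pairs.

    Each known access point is weighted by its received power converted
    from dBm to a linear scale, so stronger (presumably closer) access
    points pull the estimate towards themselves. Unknown access points
    are ignored. Returns None if nothing in the scan is in the table.
    """
    total_weight = 0.0
    lat_sum = 0.0
    lon_sum = 0.0
    for bssid, rssi_dbm in scan:
        position = known.get(bssid.lower())
        if position is None:
            continue
        weight = 10 ** (rssi_dbm / 10.0)  # dBm -> milliwatts
        lat_sum += weight * position[0]
        lon_sum += weight * position[1]
        total_weight += weight
    if total_weight == 0.0:
        return None
    return lat_sum / total_weight, lon_sum / total_weight


if __name__ == "__main__":
    nearby = [("00:11:22:33:44:55", -50.0), ("66:77:88:99:AA:BB", -70.0)]
    print(estimate_position(nearby))
```

A weighted centroid is only one simple way to combine several sightings; it is used here because it is easy to follow, not because any provider is known to use it.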
This could then be used for showing your location on a map, local search results, local weather forecasts or local advertising etc.
I have noticed that my location shown on my phone on Google maps is now accurate to about 30m without GPS whereas before it was accurate to about 3Km. It is far more useful as I can get directions without GPS or, for example, when I look up cinema listings it automatically gives me results for my nearest cinemas. If you use a smartphone you probably get geolocated a lot.
With a browser you need to give your specific permission for each site - with a phone you just need to load the application.
As for the advertising - I don't think the information becomes available 5 minutes later, you would have left the page by then. They hold a database of adverts that want to target specific locations and they are shown immediately according to their secret algorithm. There isn't a Google monitor employee who rings up an advertiser, after you log on and give permission to access your location, who says "Eh, I've got a good one for you, this chap is in Royston Vasey, just down the road from you, do you want me to pop up an ad - cost you a fiver".
It could also be argued that a local restaurant advertising that they are doing a two-for-one deal would be much more useful than knowing you are the 999,999th visitor to a website and have won a prize.
If you are really that scared about your location, have you ever bought anything off the web? If so then you have probably entered your full address, telephone number, e-mail and credit card details. If you don't trust anyone then this data is far more accurate, valuable and useful to both advertisers and bad 'uns than a geolocation. That data could well be available for sale 5 minutes later.
I care about my privacy more than most people and I don't think anybody should be blasé about it, but unfortunately you do have to give up a little to use a lot of online services. A lot depends on trust. In the great scheme of things (despite using some Google services) the only spam I received on my main account was after a club membership list accidentally got e-mailed and the only stolen identity was after using a debit card in a petrol station which created a cloned card. I haven't yet had problems from an online company.
Apples vs Bananas
I agree with the mechanics, however:
"If you are really that scared about your location, have you ever bought anything off the web? If so then you have probably entered your full address, telephone number, e-mail and credit card details."
Two very different things and very different risks. I spend a lot of time travelling and connect to the internet from a variety of hotspots around the country. Being able to geolocate my travels *is* something that would concern me.
Amazon knowing my home address is not.
Also, I notice that when I use google it frequently pops up a request asking me to allow it to geolocate me, with an option to remember my choice. I have tried to get it to remember my no, but it ignores me. I wonder if it would be so forgetful if I said yes?
I assume that google is using some form of GEO-IP look up which doesn't worry me too much (for example, on a Virgin Train, google thinks I am in Germany....) but if they are correlating the data with their own geolocation of wifi points it is much more troublesome.
To date, I have never needed my smartphone to tell me what restaurants or cinemas are within 30m of my location - looking around normally solves that.
I didn't say you needed to find things within 30m of where you are, you must have mis-read my post. You seem to think I said that it will only tell you where things are within 30m of yourself? If you want to get directions to somewhere then you need to first have a pretty good idea of where you are. If your location is within 3Km then directions are pretty worthless.
Similarly if you get dropped off by bus in a foreign country and need to know how to get to your hotel, or get to the local subway station. Then knowing a fairly accurate idea of your location is quite handy.
The way coding works it would be easy to not notice.
I've been programming since before Microsoft,Apple or Google existed.
It totally makes sense to me how the data was errantly collected and not
noticed for years.
First , consider that Google was attempting to create a database of geolocation
information that would allow one to determine their position based upon
nearby WiFi signals.
So what they were aiming to stick in their database was the device identifiers
broadcast by every WiFi device in every packet they send out, along with the longitude
and latitude where the signals were found with the street view GPS.
In this fashion , a cellphone , iPad or other WiFi enabled device could take those same
device id's all devices harvest and use them to ask such questions as
"where is the nearest coffee shop, rail station, hotel etc."
Since the device id's are in the headers of packets , and devices take packets as a whole
those packets come along with random bits of data.
Those of you who have experienced the frustration involved in the limited range of
your WiFi router , know such signals do not go very far.
The data we would be talking about Google having had collected would be limited
to the amount that their street view cars would see while driving by at speed and
being within range of your WiFi router.
Most programmers try and re-use code from past projects , or leverage
code from various pre-packaged libraries of code when creating new applications.
None of us likes to re-invent the wheel, when we can copy-n-paste one
from our tool box.
The pre-existing code for processing WiFi packets would as a matter of necessity take
a packet as a whole. The programmer would then record the few random packets
returned by the code libraries as a whole onto the hard drive in the street view car
along with the GPS location where the packet was recorded.
Later back at the office, similar packet processing code would read the whole packet
and pluck out the device identifiers from the header and store those with the geolocation
data into a database to be used in answering those questions of
"where is the nearest coffee shop, rail station, hotel etc."
The random 512 or such bytes of junk data inside the random packets would have been ignored
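The header-only processing described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the code Google ran: it assumes each captured frame starts with the basic three-address 802.11 MAC header (24 bytes) with any capture-tool prefix such as radiotap already stripped, and the `header_only_record` name and record layout are invented for the example.

```python
import struct

# Basic three-address 802.11 MAC header, little-endian:
# frame control (2), duration (2), addr1 (6), addr2 (6), addr3 (6),
# sequence control (2). Shorter control frames are not handled here.
HEADER_FORMAT = "<HH6s6s6sH"
HEADER_LEN = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address string."""
    return ":".join(f"{byte:02x}" for byte in raw)


def header_only_record(frame: bytes, lat: float, lon: float) -> dict:
    """Keep the addresses from the header plus the GPS fix, nothing else.

    Everything after the first HEADER_LEN bytes (the payload) is never
    read, so it cannot end up in the returned record.
    """
    if len(frame) < HEADER_LEN:
        raise ValueError("frame is shorter than a basic 802.11 header")
    _frame_control, _duration, addr1, addr2, addr3, _seq_control = struct.unpack(
        HEADER_FORMAT, frame[:HEADER_LEN]
    )
    return {
        "addr1": format_mac(addr1),  # receiver address
        "addr2": format_mac(addr2),  # transmitter address
        "addr3": format_mac(addr3),  # meaning depends on frame type
        "lat": lat,
        "lon": lon,
    }
```

The distinction the comment draws is between writing `frame` to disk as a whole and writing only a record like the one returned here; the first keeps whatever payload bytes happened to follow the header, the second does not.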
To hear this news item told, one would have you believe that the Google car sat outside
your house for hours siphoning off all your secrets, when in reality it would be more like
spinning the dial on your radio and hearing random utterances as the needle sped past the stations.
There are firms out there that would benefit greatly by suckering government and citizens into
believing that there was more to this, but if you think about it , the data those cars would have errantly collected in the 1/2 second to 2 seconds they were in range as they drove by
your WiFi router at road speeds could not have possibly amounted to anything truly interesting
except to those who like to make something out of nothing.
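The drive-by time can be estimated with simple geometry: if a router's usable range is treated as a circle, a car on a straight road crosses a chord of that circle. The sketch below computes the time spent inside it. The function name and every number used with it are illustrative assumptions, not measurements of the Street View cars, and real coverage is far less regular than a circle.

```python
import math


def seconds_in_range(radio_range_m: float, speed_m_s: float, offset_m: float = 0.0) -> float:
    """Time a car on a straight road spends within range of a router.

    radio_range_m: assumed radius of usable reception around the router.
    speed_m_s:     assumed constant speed of the car.
    offset_m:      perpendicular distance from the router to the road.
    """
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    if offset_m >= radio_range_m:
        return 0.0  # the road never enters the coverage circle
    chord_m = 2.0 * math.sqrt(radio_range_m ** 2 - offset_m ** 2)
    return chord_m / speed_m_s


if __name__ == "__main__":
    # Assumed figures: 30 m range, 15 m/s (about 34 mph), router 18 m from the road.
    print(f"{seconds_in_range(30.0, 15.0, 18.0):.1f} s in range")
```

The result is very sensitive to the assumed range and to how far the router sits from the road, so it supports "seconds, not hours" rather than any precise figure.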
Of course one might in realizing that every WiFi enabled device coming in range of
your home is gathering the same information as a part of its normal operation, consider
taking the 1-2 minutes to turn on the encryption features in your WiFi routers
as unlike Google Street View cars , there are indeed others who would simply
park outside your house to snoop , or attach their computer to your WiFi to get FREE internet service with you left to foot the bill.
"I've been programming since before Microsoft,Apple or Google existed."
That's why you think AOL-esque icons are a good idea, and have no idea how ElReg comment formatting works. Small suggestion: Try to catch what you claim to have a doctorate in before posting further ... Ta.
That's all you can come up with?
So your entire rebuttal to my comments is to the damaged
formatting that occurred during the registration process and
my selection of an icon from the provided list?
Well, I guess if you have no real position , then whatever
you type in lieu of same must be at least appreciated for its
entertainment value. That being the case , where do
I queue up for the refund ? :)
Nah. I was a printer in a prior life.
I absolutely HATE badly formatted documents, and generally refuse to read them (unless there is a buck in it for me). Not your fault, mind. It's a CR/CR-LF/LF thingie that (for whatever reason) ElReg can't seem to wrap its BeakyHead around.
"where do I queue up for the refund ? :)"
The cheque'll be in your voicemail, as always. Don't go away ... you seem to have a sense of humo(u)r. That's rare around here.
N.B. I don't speak for, or type(o) for, ElReg.
too many newlines make baby jesus cry.
If you're going to post something unreadable...
... don't bleat like a schoolgirl when the only responses to it are that it's unreadable.
You're the internet equivalent of Billy Connolly's guy who "has 17 A-levels and a degree in brain surgery from NASA, but she'll always remember him as the guy who farted the first time he walked in the door".
I too understand how it could have been errantly collected, but that doesn't make it right.
I don't think the point is whether they collected too much data / delete it / keep it / ask a third party to intervene etc etc. Whatever man. That bit is irrelevant.
The important thing is that they deliberately set out to collect *any* data in the first place.
AFAIAC even if they just collected the headers and didn't 'accidentally' collect extra bits, it's wrong.
What would happen to me if I drove round the block sniffing packets, collecting a bit of this and a bit of that? I'd get nicked, that's what would happen.
They shouldn't be able to get away with digital kerb crawling, just because they are a company.
Leave the return key alone okay?
We're told a lot of things. Doesn't make it reality, no matter how many times it's repeated.
"We're also told - time and again - that Google is a company run by engineers."
Yep, we are told that ... by google marketing. But who believes it, outside of google marketing and other marketing companies world-wide which use google data to push their tat? google is a pure marketing company, not an engineering firm. Anyone who believes otherwise is deluding themselves. Including google's engineering staff.
Bottom line: That data was collected and stored intentionally.
Re: Here's the truth
"This kind of data is worth a fortune! Imagine being able to serve someone an advert for a shop, quite literally, around their own corner rather than just in their country or city."
Has it never occurred to you that people might like to receive ads that are related to their current location? While, obviously, I'd prefer to receive none at all, I do live in the real world and know that web sites have to be paid for. It might as well give me links that are known to be more accurate.
Something of benefit to Google and something of benefit to the end user aren't mutually exclusive, you know.
"Has it never occurred to you that people might like to receive ads that are related to their current location?"
Has it never occurred to you that if I want that information, I'll ask for it? And that if I don't want it, I also don't want people harvesting and processing information about me and my family's network configuration and browsing habits?
No. Thought not.
I'm quite happy to use Google for search. But they can prise my data from my cold, dead hands...
Then let them ask me
for my data and my consent. If it's a useful service I'll give it to them.
"We never make misteaks."
Nope... G never makes misteaks. Oar dew thay?
This wasn't an accident
It's quite obvious what they would use this data for -- to pin down as many Google Account holders as possible to geographical locations. By sniffing Wi-Fi public data they could extract the public IP address associated with each network. It would then be a matter of two join operations to link physical addresses with GMail addresses. Why bother? To target local advertising and to be generally creepy.
They don't need WiFi data to do that
Maybe it's different in the UK, but in the US, IP addresses from the major ISPs can be located down to a pretty small area - a couple of blocks in a suburban area. Google doesn't need WiFi data to pin GMail addresses to physical locations, IP addresses are good enough.
Viva code reuse
Code reuse. Does ya every time, all the time.
Knowing how software is written on this particular occasion I am inclined to believe them. Someone simply lifted a piece of code to catalogue MACs and active endpoints and ignored or did not notice the fact that it was doing so by packet sniffing.
The question here is slightly different.
Even if payload is discarded that still leaves Google in possession of a very large dataset of MACs being used on wireless networks and their locations. MACs are personally identifying data. They identify my laptop as uniquely as a serial number. Further to this, Google is in possession of enough data to correlate the MAC to a person for a large number of cases.
This is well beyond 1984 and clearly in the realm of Brave New World by now. What's next?
@ Anton Ivanov
"This is well beyond 1984 and clearly in the realm of Brave New World by now. What's next?"
Google sponsors Paedobear?
See? That didn't hurt. I'm really your friend.
Expect more harmless accidents to come from Google. It's desensitizing the public to privacy intrusions and abuses of personal data, and it has been going on for years now.
Microsoft was called evil because they fought dirty to take control of your software. Google is all that and they want your data too.
Why they did it
Because they could. Google have grown massively without any apparent detailed strategy, and I think it is because they have a simple strategy. That is to assume that the whole world is theirs to do with as they will. With this approach they get the maximum out of their efforts because they set no bounds on themselves and are only constrained when the world pushes back. Which isn't often for some reason.
Pirate logo because the above is true pirate philosophy.
The World is not Enough Oyster
The True Pirates Konquer Better Betas and Greater Vistas, the Perl that is in the World their Oyster is what is shucked from you, yours, and Mind.
I don't want anyone to have a database of Macs, SSIDs and locations. Never mind the payload.
Only a tiny fraction of WiFi airpoints are public hotspots. It's an invasion of privacy.
I don't want people to be able to type in my car registration plate or Social Security number and get my geo lat long either.
Oh ffs, who the feck cares?
I read on the Kismet project page a few weeks ago that Google uses Kismet in their cars...
Now that sounds strange...
So let me see if I got this right....Google have suddenly 'discovered' all this data which they 'accidentally' captured and they have moved it all to four discs for someone to delete for them.
Anyone else see the holes in their story such as the missing audit trail to show that this is all the data and the only copy of the data? Do they really expect everyone to just say 'what nice chaps for admitting their error'? Any risk that 'learning lessons' will include managers responsible for this group getting the boot? No, thought not.
I may not be paranoid but even I begin to get the feeling that Google (amongst others) are taking the p*** with privacy but then again if the CPS take no interest when large scale snooping is reported to them then why should companies see any risk in doing this.
Google is a company run by engineers
So was Peenemünde.....
What grows makes no noise.*
"We're told a lot of things. Doesn't make it reality, no matter how many times it's repeated." .... jake Posted Tuesday 18th May 2010 05:17 GMT.
That should be carved in stone above the every political portal and most definitely above every entrance into the Mad Houses of Parliament, jake, for in there can one find every hue of that particular delusion and its championing Prime Fool and Sub Prime Ministerial Tool.
"We're also told - time and again - that Google is a company run by engineers." ..... Some in a world all of their own would be thinking that it is run by a wing of the Pentagon and NSA, whilst others would counter with "Don't they wish it were so".
The Simply Complex Truth though may be just that IT is run by IntelAIgents remotely Hosting and Hosted in Cloud Layers ...... which is AI full enough present disclosure of Base Alien MetaData Ventures for any Competent Crusading Virtual Jihadi Venting a Bent for ReModeling the World with and within its NEUKlearer HyperRadioProActive IT Plumes and Strata.
* A German Proverb and Germane ........ in Vorsprung durch AITechnik Circles/Orders/Camps/Global Communications HQ/CPUs.
People here (mostly the paranoid idiot ones) need to realize that Google handles data through programs, meaning algorithms. There's no super-evil master looking at the data, just code extracting, filtering and matching.
I can quite easily understand that somebody grabbed the code from somebody else in the company to quickly capture and store wifi info, not realizing the code did more. And when the combined data is 600GB, I wouldn't think much of it either, certainly not if my usual datasets are petabytes...
So yes, they made a mistake, but a programming bug, not snooping on purpose. After all, they are well aware that snooping on data like that would end them up in jail. So the developer who is responsible needs some good kicking.
You mean, like...
... people who drive without realising they were drunk, and kill someone, should just get a good kicking (as opposed to the others, who get jail time)?
Oh yes, when the Dutch gathered data on the religion of its population in the 1930s, there was no super-evil master looking at the data then either. That came later, and by f**k didn't they find it handy that all that data had been collected and was available.
If you get the chance, visit google, and you'll see. It is a hi-tech company run by engineers. If you look at what they do, you'll see that their products drive their success, not their marketing. Or how many ads do you see that are from google itself?
But of course, it is much easier to do some paranoid blind google-bashing from your lazy chair like most of the readers here (and El reg included)...
Funny you should mention that.
I don't see any ads from Google itself. In fact, I don't see any ads from anybody at all. Firefox and a few plugins deal with that.
"If you get the chance, visit google, and you'll see."
Been there, done that, seen it for what it is. Refused the job offer.
"It is a hi-tech company run by engineers."
No. It is not. It is a marketing company run by shortsighted gits, with no interest in anything but this calendar quarter's bottom line.
"If you look at what they do, you'll see that their products drive their success, not their marketing."
Their product IS marketing. They sell eyeballs. YOUR eyeballs. Do you get a cut? Or are you being used without permission and/or pay? There is a word for that.
"Or how many ads do you see that are from google itself?"
I don't see (or hear) ads. From anybody.
"But of course, it is much easier to do some paranoid blind google-bashing from your lazy chair like most of the readers here (and El reg included)..."
The irony is almost tactile ... Please, do continue :-)
were they collecting "WiFi network data like SSID information and MAC addresses"?
The Final Straw
Right, that's it! That's the final straw.
I'm deleting my Facebook account RIGHT NOW!
Didn't Apple buy a company that basically did *exactly* this when it developed its CoreLocation technology? I thought CoreLocation was able to use Triangulation, GPS, and WiFi hotspot MACs to try and work out where it was.
I fail to see why this venture by Google is bad mmm-kay whereas the system bought by Apple is somehow OK?
I'm not legitimising either company, but I personally do not feel that MAC+Location is something that is private. I broadcast it. I've chosen to do this. If I don't want someone looking at my TV then I have to close my curtains and the same is true of my wireless network. If I don't like it, then I'll turn it off and plug in instead.
I can't have it both ways.