Attorneys on Monday accused Google of intentionally divulging millions of users' search queries to third parties in violation of federal law and its own terms of service. The complaint, filed in federal court in San Jose, California, challenges Google's longstanding practice of including search terms in HTTP referrer headers, …
Blame HTTP and your browser, not Google
Anyone want to point out that it's not Google that sends the search terms to the site you visit, but your browser (assuming it's not Chrome, of course)? Sure, Google could prevent the information from being shared, but the same issue would still exist for every other website out there.
If they are really worried, I'm sure it wouldn't be too hard to extend/modify a browser to never send referrer headers.
I thought the HTTP referrer field is just the URL of the site you came from? As such, the problem arises because your search query is included in the URL of Google's results page.
There is no technical reason why Google need to use a different URL for different search results.
But yes, many other web sites do the same (including El Reg) and disabling HTTP referrer information is a browser setting.
Firefox/RefControl add-on, FTW
I just installed the RefControl add-on earlier this week, after seeing a related article on El Reg about advertisers and other scum scraping my referrer headers.
You're correct, it's not hard: in FF at least you just need to set the config entry "network.http.sendRefererHeader" to 0, no add-ons required (although I assume these give greater flexibility). Can't say for other browsers. It will break a very few sites (usually secure ones), so I have a shortcut to a 2nd FF profile that allows sending the headers when needed.
Re: Sort of...
>> "There is no technical reason why Google need to use a different URL for different search results."
But there is: The HTTP protocol defines the "GET" method for queries that are cacheable and re-usable, and "POST" for submitting information to change server state. To support these features, the "GET" method includes variables in the query URL, and the "POST" method submits them in the payload body.
Therefore, it is natural that all web searches are conducted using the "GET" method, so that users may bookmark them or permanently store them and re-use them.
This was the original vision of HTTP.
But that's the technical rationale. That said, I agree that the "REFERER" [sic] header is abused and should be either rethought or better defined and controlled.
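A minimal sketch of the GET/POST distinction described above, in Python. The endpoint search.example is made up for illustration, and the requests are only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only for illustration; nothing is sent over the network.
SEARCH_ENDPOINT = "http://search.example/search"


def build_get(terms: str) -> Request:
    """GET: the query travels in the URL, so the URL alone identifies the search."""
    return Request(SEARCH_ENDPOINT + "?" + urlencode({"q": terms}), method="GET")


def build_post(terms: str) -> Request:
    """POST: the query travels in the body, so the URL says nothing about it."""
    body = urlencode({"q": terms}).encode("ascii")
    return Request(SEARCH_ENDPOINT, data=body, method="POST")


get_req = build_get("green teapots")
post_req = build_post("green teapots")

print(get_req.full_url)   # http://search.example/search?q=green+teapots
print(post_req.full_url)  # http://search.example/search
print(post_req.data)      # b'q=green+teapots'
```

Only the GET form gives a URL that can be bookmarked or pasted into an email, which is also why it is the form that ends up in the referrer.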
Re: Sort of...
This is standard browser behaviour, and practically every website with a single external link is effectively doing the same thing. It has nothing to do with Google specifically, a fact I'm certain their well-paid legal team is going to open with.
Re: Re: Sort of...
Well yes, but there's nothing to stop Google taking the original GET request with the q parameter, for example:
And responding to it with a 302 redirect to another URL, sanitised of its query terms:
Where resultset would be a horribly long ID to a look up table for the original query terms. The search result page is then atomic, cache-able, replay-able and bookmark-able without including the q parameter. If they did the same with the URLs they use for that dodgy onmousedown trick (or better still, stopped doing that altogether) then problem solved. Except...
Of course, if they started doing that, the anti-referer [sic] people would be happy but the web would be full of "Google is monitoring us with ID numbers" stories within days and the SEO world would be headless chickens overnight. Much as I don't like their sneaky practices, Google can't really win with this one.
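For what it's worth, the redirect idea above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the in-memory table, the /results path and the search.example host are invented, and only the "resultset" name comes from the comment itself. It is not a description of anything Google actually does.

```python
import secrets
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Hypothetical in-memory lookup table; a real service would need persistent storage.
_resultsets: Dict[str, str] = {}


def redirect_for(search_url: str) -> Tuple[int, str]:
    """Return (status, Location) for a search URL, hiding the q parameter behind an ID."""
    parts = urlsplit(search_url)
    terms = parse_qs(parts.query).get("q", [""])[0]
    resultset = secrets.token_urlsafe(16)  # opaque ID, not derived from the terms
    _resultsets[resultset] = terms
    location = f"{parts.scheme}://{parts.netloc}/results?resultset={resultset}"
    return 302, location


def terms_for(resultset: str) -> Optional[str]:
    """Look the original query terms back up when the results page is requested."""
    return _resultsets.get(resultset)


status, location = redirect_for("http://search.example/search?q=budgies")
print(status, location)
```

The referrer seen by the destination site would then carry only the ID, which is exactly the "monitoring us with ID numbers" trade-off mentioned above.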
@DZ-Jay, @Jeff 11
I accept that having search terms is good for _inbound_ links to Google (i.e., so people can post Google search links), but there is no need for the actual results page to have the query in the URL. For bookmarking purposes, a "permalink"-style search link could be included on the page.
In other words:
http://www.google.co.uk/search?q=stuff would still produce a search, but the results would be shown on http://www.google.co.uk/results (for example) with a permalink at the top of the search results linking to http://www.google.co.uk/search?q=stuff
As for the POST and GET stuff, the issue at hand is not the search terms being sent to Google, but being sent to web sites that you subsequently click through to. My scenario above would still use the POST/GET mechanism for submitting the actual search.
Is there a search engine that *doesn't* do this?
I like to know what search terms people have used to reach my website. That's not private information, by any definition. It's also the way the web has worked since the days of webcrawler and altavista.
Just what are they alleging? That HTTP referer entries are somehow illegal? Can't see that going far.
Same search on Google
Google puts everything but the kitchen sink into the URL.
And sometimes that's handy
Now provide me with a handy ixquick link to a search for "green teapots" or whatever ...
Now, someone who gets it! The problem is not the "GET" method requiring variables in the URL; there is a very good reason for this.
The problem is Google including this in the referrer header without the users' consent or knowledge.
The original intention of the referrer header was for sites to know how they were sharing their traffic. There is real value in this, but it is currently being abused and exploited by advertising networks. There is now very little reason to include the information automatically on every hyperlink click, and in fact, there are reasons for actually preventing this at times.
Ultimately, the user should have control, and like cookies, browsers should have reasonable default settings to protect the users' privacy.
What the website likes to know is not the point
When I walk into a shop, does the shop owner have a right to know how I got there? Do they even have a right to know why I chose to enter their shop? Of course not.
Naturally they may ask, and I may tell them. But sometimes I may have good reason to withhold that information. I may not want them to know beforehand what I intend to buy, or what I have previously considered, as they may use that information to "optimize" their sales pitch. Optimizing to their advantage, not mine. This is even more likely if the shop owner has this information already without me even being aware of it.
It's quite clear that what is happening here is that the search engines are divulging information to the end site that the user may reasonably believe to have been a matter between them and the search engine only. Again, if I walk into a shop, I'd be rather peeved to find my taxi driver having a whispered conversation with the owner about where I came from and how I got there. It may not be "private information", but that doesn't stop it looking like collusion against my best interests.
here you go...
Last time I checked, the referrer was generated by the browser, not by the site that presented the link that I clicked on.
But never let the facts get in the way of a lawsuit.
thanks for the article el reg
now is this a practice strictly limited to google, or do other search engines pass this info along in their headers as well?
search engines don't pass this information at all.
When you click on the link in the Google Results page (or any other web page, for that matter) your browser connects to the 3rd party web server and sends a request. The request "packet" contains a whole bunch of information, including the address of the page that contained the link that you clicked on.
Browsers have worked that way since long before Google was even invented, and Google doesn't have anything to do with it.
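To make that concrete, here is a rough Python sketch of the request a browser builds on click-through. The function name and the shop.example destination are made up, and the request is only constructed, not sent.

```python
from urllib.request import Request


def click_through(results_page_url: str, destination_url: str) -> Request:
    """Build the request a browser would make when a link on results_page_url is followed."""
    req = Request(destination_url)
    # The header name is spelled "Referer" in HTTP (a historical misspelling).
    # Its value is simply the address of the page that contained the link.
    req.add_header("Referer", results_page_url)
    return req


req = click_through("http://www.google.com/search?q=budgies", "http://shop.example/")
print(req.full_url)               # http://shop.example/
print(req.get_header("Referer"))  # http://www.google.com/search?q=budgies
```

The search terms reach the third-party site only because they happen to be part of the results page's address.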
all do and none do
Google could make the results page a POST form rather than GET, and pass the extra info that it needs (search terms, how many pages displayed, ...) in a hidden field. However this would cause small problems for users such as unexpected browser popups about refreshing POST forms, etc.
It would also prevent users from bookmarking a search result, or sharing a link to it; two very valuable features.
Let me say that I am against many things that Google is doing, and they are certainly abusing their position and expertise to exploit these features for financial gain; but the problem of privacy and specifically of the "referrer" header is deeper than merely switching protocol methods.
If you don't like it, ask Microsoft, Apple or Mozilla for an easier way to turn it off. It's not Google's fault. There are only two ways they could avoid it. Use POST instead of GET, which would make it impossible to bookmark a search result page, or do some kind of funky redirect.
The latter might be workable, but it would add unnecessary complexity and compatibility problems to what should be the simplest thing on the web, a link. In my experience, this technique is only used by particularly secretive web sites (pirates, hackers, perverts, etc). It's not something I would expect to find on a mainstream search engine.
Perhaps if Google added it as an optional feature some people would be interested, but to suggest they are negligent for not providing this is just plain silly.
I disagree only ...
... with the word "freaking".
Google is a free service. Surely if you don't like their privacy practices you would just not use a search engine.
it is everyone's fault
the browser does indeed pass on the referrer URL in HTTP_REFERER
this stops people doing things like leeching comics from the sunday paper pages
google/yahoo/bing/whoever does _not_need_to_ set their search result pages to include the search query ... but it does, which is the point, really - and if you can, do check what it really does pass as the referrer, and it's not the URL at the top of the page, it's a modified HTTP_REFERER containing a suspiciously unchanging string of letters and numbers at the end for each machine the query is sent from.
web developer toolbar on FF has the option to disable referrer, works too
people are generally surprised if not infuriated to learn what they enter into one place ends up in another without due warning - sharing with one entity does not automatically bring to mind sharing with the world - and suggesting that's just how it is/always was is pretty naive.
Google can and should fix this
When you click on a search result, the hyperlink first takes you to:
When that redirect page loads, it then takes you to your actual destination. There's no valid reason for that page, whose only purpose is to make sure the URL is safe before sending you on your way, to have information like your search query (the "q" param).
@ Randy Hudson
That doesn't happen so often, maybe once or twice a week. I think it is Google doing follow-through to see which links you choose from the results offered, to determine if a site should move up or down the ranking.
As I write this, at 7h44 on Wednesday morning, a Google.com search returns pages with direct links.
URL changed via onmousedown
Google has been using an onmousedown handler on the <A> tag to change the HREF from a direct link to the via-google one for a few years now.
Try this: press-and-hold on a link, drag your mouse off the link, release the mouse, and then hover over the link again. The URL preview on the status bar should now be the obfuscated one.
However, the search term doesn't appear in cleartext for me.
Yes, I see if you click-hold the true link location shows up. Thanks for the tip.
Makes me wonder, then, why sometimes it isn't hidden?
I wonder how long it'll be until there's a Firefox add-on to pick apart Google links and go directly? Not that I'm *that* bothered, I don't search for weird stuff. Well, 8 bit tech might be weird to some, but you know what I mean. ;-)
if you're that bothered..
..just grab the result manually and paste it into the address bar :P
that's what I do.
The only impact that I can see is that the left CTRL key on my laptop is starting to wear out...
... block Google Analytics and DoubleClick using NoScript and AdBlock Plus, then use RefControl to ensure *you* determine what information gets passed on.
What utter cack
1. Yes, Google puts everything into the URL. This makes it trivially easy to code up your own search links (trust me, if it used POST data, this is somewhat harder, if not damn near impossible to do as a simple link).
2. It isn't just Google. Try http://www.bing.com/search?q=budgies&go=&form=QBLH&filt=all&qs=n&sk= [ditto Altavista and Lycos]
3. None of this is relevant. Passing parameters in the URL is a specified part of how HTTP works. There's the ?blah=meh form and there's the POST form. It's in the RFCs...
4. Google's string is http://www.google.com/search?hl=en&source=hp&biw=1024&bih=440&q=budgies&aq=f&aqi=g10&aql=&oq=&gs_rfai= and, asides from a bunch of empty parameters, the only privacy-busting thing is it has sniffed my screen size. Only 440? That's 160 short, must be the browser window visible area. Well, if it helps them sort out the icky YouTube layout, then fine... I'll give you some extra info. It's an eeePC 901. And here's a freebie for you. My browser's UserAgent is custom. ;-)
5. It is your browser generating the referrer text. You can turn this off (and if you can't, change browser!).
6. I do referrer sniffing on my blog to see what weird and wonderful search terms bring people to my site. Not so much of the wonderful, but there's "weird" by the bucketload. Loads of people are looking for VistaFont_JPN [it's the 4th/5th result, why are people coming to me!?]. Recent favourites - "bakable plastic cookies" and "cute jap chick with fukin enormus big tits" [sic!]. Cute Japanese girls do feature on my blog, but they're all dressed. Call me weird, but I'm less impressed by (half-)naked girls, there's no mystique when it's all on show. As for bakable plastic cookies, WTF? Is this a business opportunity I don't know about? Or are people finally being honest about those ready-made dough lumps you put in the oven for ten minutes that are supposed to turn into delicious cookies but usually end up looking like baked cat sick...
http://www.google.com/search?... is not the URL that gets sent as the referring page. Google sends you to an intermediate page, and it's that page which gets sent. That page doesn't need any information other than the URL to which you should be forwarded.
I'm suing my apartment's window makers...
I mean, everyone can look inside! And there is technology on the market that prevents this!
Or maybe I should just buy some curtains... or those fancy glossy windows...
Mine's the one with no pockets... after suing my coats company because they made pockets in it and I forgot my cell phone the other day...
Here we go again...
This is the electronic version of seeing which direction your customers have come from when they enter your shop.
It's the browser that does it, and it has nothing to do with Google. Some browsers can be stopped from doing it, if it concerns you. Just google "block referer header" to find out how to do it with Firefox and Chrome. (Use that spelling.)
By the way, Yahoo does it too. A Yahoo search URL will have ?p=search+string in it.
Oh, and Bing? Yes, you'll see exactly the same thing there, too.
Google's one is ?q=search+string.
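For anyone curious what the receiving site does with this, a hedged Python sketch of pulling the search terms out of a referrer. The parameter names (q for Google and Bing, p for Yahoo) are the ones quoted in the comments above; the loose host matching and the function name are assumptions for illustration only.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Query parameter per engine, as quoted in the thread: Google and Bing use q, Yahoo uses p.
# Matching on a fragment of the host name is a deliberately loose assumption.
SEARCH_PARAMS = {"google.": "q", "bing.": "q", "yahoo.": "p"}


def search_terms(referer: str) -> Optional[str]:
    """Return the search terms carried in a referrer URL, or None if there are none."""
    parts = urlsplit(referer)
    host = parts.hostname or ""
    for marker, param in SEARCH_PARAMS.items():
        if marker in host:
            values = parse_qs(parts.query).get(param)
            return values[0] if values else None
    return None


print(search_terms("http://www.google.com/search?hl=en&source=hp&q=budgies&aq=f"))
print(search_terms("http://search.yahoo.com/search?p=search+string"))
print(search_terms("http://example.org/some/page"))
```

If the browser is told not to send the header at all, the site simply has nothing to parse.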
Privacy lunatics are going too far
Since when did everything we do have to be 100% private anyway? I've typed in 'sandwich' and google has shown me a link to boots, when I click to go to boots they know I searched for sandwich - big ******* deal.
It's as stupid as the numerous facebook articles - oh I created an account on a public website with my name on it so people can search for me by name, I'm now upset that people on the internet I didn't want to can now see my name. Retards.
Not really retards
I agree with your comments on the hype, but you missed the single most important point about this (and the Facebook debacle). Namely that the company in question 'says' that they won't let anybody see your personal stuff (in varying degrees) - which they fail to do. If the privacy sections of Goo and FB said 'we share with everybody, regardless' then I would fully agree with you.
no referrer data sent.
And the .co.uk version?
Job only half done.
So when you're googling for YouPorn...
... and then click on YouPorn, then YouPorn gets to know that you were googling for YouPorn. Well, of course, that's a big privacy violation.
Courts and the internet just don't match...
I don't agree with everything Google does
But this sue happy crap has to stop.
The "best" solution might be, perhaps, if instead of passing the full URI in the HTTP Referrer the browser simply posted the URI of the referring document minus the query string. The problem there of course is that with Mod Rewrite and RESTful web services it's a little hard to tell which part of the URI is the document and which is the query.
Don't see how this is a Google (or any other search engine) problem though - they're just correctly following the HTTP specs.
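A rough Python sketch of that "minus the query string" idea, assuming the browser (or an add-on) rewrites the value before sending it. The function name and the search.example URL are invented for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(referer: str) -> str:
    """Drop the query string and fragment from a referrer URL, keeping scheme, host and path."""
    parts = urlsplit(referer)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# A classic ?q= search URL loses its terms...
print(strip_query("http://www.google.co.uk/search?q=stuff"))
# ...but a rewritten or RESTful URL keeps them, because they live in the path.
print(strip_query("http://search.example/search/stuff"))
```

The second call shows the catch mentioned above: once the terms are part of the path, there is no generic way to tell document from query.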
BEST SOLUTION: install BeeFREE
It'll remove the HTTP referrer automatically, when needed, i.e. when you're moving from one server to another (server.com -> another.com). And it's going to remove Google tracking links too.
from a slightly different perspective...
As someone who pays Google for ads I'd be a bit put out if I DIDN'T know what search terms the user had used which then led them to click on my ad. Otherwise I have no way of knowing if I'm getting stiffed by Google since, like every other advertiser out there, I pay to have my ads appear when particular search terms are used. (As it is I still lack a lot of information to know if I'm really getting stiffed by Google anyway, but something is better than nothing.)
The lawsuit stinks of frivolous and opportunistic.
And, as others have pointed out, if you feel that passing this information on is an invasion of your privacy, tell your browser not to do it.
To all morons
If you don't like passing on referrer information, turn it off. The only reason that websites have this information is because *you're* giving it to them. If you don't have a browser that allows you to turn referrers off, get one that can. There's lots to choose from.
As a webmaster I find it most useful to see how users arrived at my sites.
Additionally, referrer checking is useful to spot image leeches.
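As an illustration of the image-leech check, a minimal Python sketch. The host names and the function name are hypothetical, and treating a missing header as "not a leech" is a design assumption, not a rule.

```python
from typing import FrozenSet, Optional
from urllib.parse import urlsplit


def is_hotlink(referer: Optional[str], own_hosts: FrozenSet[str]) -> bool:
    """True when an image request carries a Referer from a host that is not ours."""
    if not referer:
        # No header: a direct visit, or a visitor who has turned referrers off.
        return False
    host = urlsplit(referer).hostname
    return host is not None and host not in own_hosts


MY_HOSTS = frozenset({"myblog.example", "www.myblog.example"})

print(is_hotlink("http://other.example/forum/thread", MY_HOSTS))  # True
print(is_hotlink("http://myblog.example/post/1", MY_HOSTS))       # False
print(is_hotlink(None, MY_HOSTS))                                 # False
```

Note that the check only works for visitors whose browsers still send the header, which is the flip side of the advice to turn it off.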