Perhaps
Posted Tuesday 9th December 2008 21:04 GMT
the user agents are really there, it's just a bit too cloudy to see them properly?
<getting coat>
Posted Tuesday 9th December 2008 21:04 GMT
What they want to see?
Posted Tuesday 9th December 2008 21:04 GMT
They are using Windoze but feel ashamed of that (or afraid of being fired?), that's why...
Posted Tuesday 9th December 2008 21:04 GMT
it sounds more menacing and scary to say they are cloaking
total non-story
Posted Tuesday 9th December 2008 21:04 GMT
There might be a non-sinister explanation. A lot of the work Google does is related to quality, so it could be anyone in the anti-spam team, AdWords quality, fraud detection, search quality, etc. At http://www.justlanded.com we have seen Google servers pretend to be different O/Ss and user agents from the same IP at the same time. Yahoo do the same stuff, and from time to time we have to unblock their IPs when they use things like Wget to pull down thousands of pages (typical email harvesting or scraper behaviour).
People will see black helicopters everywhere... that's not to say they won't be launching a distro or some other MicrosoftWhack though...
Posted Tuesday 9th December 2008 21:04 GMT
I wonder if it's just someone that is "paranoid". You know, flash off, no java, no javascript, clear the cookies and browser history all the time. Some people at google might just be "Ha! I'm going to clear the user-agent string too!" Or testing anonymization tools. Something like that. I just don't see google hiding a new OS by clearing it all out entirely. Maybe it's even just an internal test build of Chrome that had a bug making it leave the user-agent blank 8-).
Posted Tuesday 9th December 2008 21:17 GMT
The most likely reason for blank user agents: the Googlers have decided that they want to encourage websites to be standards compliant instead of detecting the browser type and building a page for that one. This sounds pretty consistent with a company that has just released a minority browser platform.
Posted Tuesday 9th December 2008 21:49 GMT
As I recall, blanking or replacing the user-agent string is a standard feature of Squid (and presumably other proxy servers as well). OpenBSD's pf firewall has a "modulate state" option, which does something similar on a TCP/IP level (randomising all the parameters, making it hard to identify the OS generating the traffic).
If I wanted to hide my secret OS/browser, I'd have it report itself as something like a Subversion build of Firefox/Gecko running on WinXP - looking normal in logs, while having an obvious explanation for any odd behaviour server admins might notice (it's a work-in-progress version of an open source browser, of course it's not acting in exactly the same way as the last released version!).
Blanking the user-agent, on the other hand, would make sense in two ways: first, as a paranoid sysadmin wanting as little information getting out as possible (so you blank user-agent and probably have a firewall randomising parameters too) - second, to help catch sites which are running spider-traps which serve up pages of link-spam to anything other than IE. (In fact, the comment about these 'appearing to be real people not spider activity' could be exactly the point: comparing the pages seen by real people - and their proxy - to the pages served up to Googlebot.)
Or option 3: they don't want the world knowing that for all the hype about using 'Goobuntu' and having their own web browser, 90% of their staff are still using IE 7 on XP!
Posted Tuesday 9th December 2008 22:52 GMT
There is no way that Google don't proxy outbound traffic, and knowing them, they are not buying stuff off the shelf when they can make it themselves. It's eminently possible that they have their own proxy that doesn't put in a user agent.
Where's the 'slow news day' icon?
Posted Wednesday 10th December 2008 01:42 GMT
and come up with a "windoze" compatible op and put MS to the sword?
Posted Wednesday 10th December 2008 01:42 GMT
Some cheaters send one set of content if they see Googlebot in the UA string, and another if they see any other agent. So it makes a lot of sense to me if google double-checks the googlebot results by fetching the same URL again with a different UA string and seeing if they get the same results returned.
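A minimal sketch of that double-check, assuming nothing about how Google actually does it: fetch the same URL under several User-Agent strings and compare hashes of the response bodies. The names `fetch_with_agent`, `fingerprints` and `looks_cloaked` are made up for illustration, and the fetcher is passed in as a parameter so the comparison can be tried without touching the network.

```python
import hashlib
from typing import Callable, Dict, Iterable
from urllib.request import Request, urlopen


def fetch_with_agent(url: str, user_agent: str) -> bytes:
    """Fetch url while sending the given User-Agent value (may be empty)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read()


def fingerprints(
    url: str,
    agents: Iterable[str],
    fetch: Callable[[str, str], bytes] = fetch_with_agent,
) -> Dict[str, str]:
    """Map each User-Agent string to a SHA-256 digest of the body it received."""
    return {agent: hashlib.sha256(fetch(url, agent)).hexdigest() for agent in agents}


def looks_cloaked(
    url: str,
    agents: Iterable[str],
    fetch: Callable[[str, str], bytes] = fetch_with_agent,
) -> bool:
    """True if at least two User-Agent strings received different bodies."""
    return len(set(fingerprints(url, agents, fetch).values())) > 1
```

An exact byte comparison is crude: pages with rotating ads or timestamps differ between any two fetches, so a real check would presumably need a fuzzier comparison. That part is left out here.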
Posted Wednesday 10th December 2008 11:45 GMT
I've seen quite a few of these from Google, and reached the conclusion it was requests from mobile users passing through Google, where the pages are optimised for display on a phone. The name of the system escapes me this early in the morning, but there was a lot of discussion of it some time ago on webmasterworld etc.
Posted Wednesday 10th December 2008 11:45 GMT
"The most likely reason for blank user agents: the Googlers have decided that they want to encourage websites to be standards compliant instead of detecting the browser type and building a page for that one."
My thoughts entirely. Perhaps the world would be a better place if *everyone* did that and the broken sites discovered that they weren't getting customers anymore.
Posted Wednesday 10th December 2008 16:02 GMT
I see Google non-spider visitors come and look at some of my sites. I sort of expect people at Google to use the web themselves. And yes, the headers are fairly stripped and some user agents are blank, but we can all do that if we like.
The reason is the same as for the no_exist-google87048704 requests: they are seeing what the site returns to a browser with no user agent. Or it is just that some of them don't wish to send a user agent; there is no law requiring it :)
Posted Wednesday 10th December 2008 21:03 GMT
"A browser user agent not only identifies the browser a machine is using, but also its operating system."
Not necessarily. RFC 2616 only specifies that "User agents SHOULD include this field with requests. The field can contain multiple product tokens and comments identifying the agent and any subproducts which form a significant part of the user agent." (§14.43) Operating system is not a significant part of the browser.
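To make the product-token/comment distinction concrete, here is a rough sketch that splits a User-Agent value into its product tokens and its parenthesised comments. The function name `split_user_agent` and the sample string are invented for illustration, and this is a simplification of the RFC grammar, not a full parser (it ignores quoted pairs, for instance).

```python
from typing import List, Tuple


def split_user_agent(value: str) -> Tuple[List[str], List[str]]:
    """Split a User-Agent value into (product tokens, comments)."""
    products: List[str] = []
    comments: List[str] = []
    buffer: List[str] = []
    depth = 0  # current nesting level of parentheses

    def flush(target: List[str]) -> None:
        text = "".join(buffer).strip()
        buffer.clear()
        if text:
            target.append(text)

    for ch in value:
        if ch == "(":
            if depth == 0:
                flush(products)    # a comment starts; finish any pending token
            else:
                buffer.append(ch)  # nested parenthesis stays inside the comment
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                flush(comments)    # outermost comment closed
            else:
                buffer.append(ch)
        elif ch.isspace() and depth == 0:
            flush(products)        # whitespace separates product tokens
        else:
            buffer.append(ch)
    # Leftover text is a token, or an unterminated comment if depth > 0.
    flush(comments if depth > 0 else products)
    return products, comments


# Hypothetical value, not taken from any real browser.
products, comments = split_user_agent(
    "ExampleBrowser/1.0 (ExampleOS 2.0; en) libexample/0.3"
)
```

In browsers that do report an operating system, it conventionally sits in the comment part, which is free-form text rather than a required product token.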
As several people have mentioned, Google may be testing the behavior of sites when they are not given a known User-Agent header. Sending an empty string instead of forging it (e.g. "fake user-agent") could be an attempt to make it less noticeable in logs. They're clearly up to something, and trying not to leak any information about it.
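For anyone who wants to look for these in their own logs, a small sketch that pulls blank-agent requests out of combined-format access log lines. The regular expression assumes the common Apache-style "combined" layout, the sample lines are invented (documentation-range IP addresses), and `blank_agent_requests` is a made-up name. How a missing versus an empty header gets recorded depends on the server, so both `-` and an empty string are treated as blank here.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed layout: host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def blank_agent_requests(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (host, request line) for entries whose user agent is blank."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the assumed layout are skipped silently.
        if match and match.group("agent") in ("", "-"):
            yield match.group("host"), match.group("request")


# Invented sample entries for illustration only.
SAMPLE = [
    '192.0.2.10 - - [09/Dec/2008:21:04:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "-"',
    '192.0.2.11 - - [09/Dec/2008:21:05:00 +0000] "GET /about HTTP/1.1" 200 2048 "-" "ExampleBrowser/1.0"',
    '192.0.2.12 - - [09/Dec/2008:21:06:00 +0000] "GET /faq HTTP/1.1" 200 1024 "-" ""',
]

blank_hits = list(blank_agent_requests(SAMPLE))
```

Matching the hosts in `blank_hits` against known address ranges would be a separate step and is not attempted here.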
Posted Saturday 13th December 2008 13:05 GMT
So, mysterious blank user agents
Coinciding with some human ranking component to google search results
Makes for a nice little no-useragent widget of googoompa-loompa desktops that has them clicking happily all day, voting for search results, and improving their chocolate... I mean search results.
Hmmmm...