Ah...
So *that's* what the big red button labeled 'DO NOT TOUCH - EVER!' does!
My bad.
You can all relax now. The near-unprecedented outage that seemingly affected all of Google's services for a brief time on Friday is over. The event began at approximately 4:37pm Pacific Time and lasted between one and five minutes, according to the Google Apps Dashboard. All of the Google Apps services reported being back …
"Since switching to duck duck go (as of last week) this one actually passed me by totally.
It's a bit weird not having predictive search results appearing anymore, but I'm sure I'll get used to that (Again)."
Don't forget Startpage/Ixquick, the only search engine with a European Privacy Seal.
I use DuckDuckGo and Startpage regularly. Startpage for over 5 years and DuckDuckGo ever since I saw their only (?) billboard in SF about a year ago.
Before those I used Scroogle, a back door into Google using an old API that didn't include the more sophisticated tracking.
In one place I worked, our machines would have most of their problems on Thursdays. Different machines, different architectures, different models of storage devices... we couldn't figure it out. This was a raised floor, halon protected computer room with a combination lock on the door.
So, with nothing else to try, one Wednesday I prepared to spend the night in the computer room. Sure enough, about 2:00 AM the cleaning crew came in with a big buffer machine, preparing to run it over the raised tiles.
I chased them out and next day confronted the facilities manager about (a) giving the cleaning crew the combination to a secure room, and (b) letting them bang a floor buffing machine against our disk arrays.
He looked at me like a guy who'd seen his first kangaroo. He couldn't fathom why I wouldn't want the floors polished in the computer room. I finally gave up, got some tools, took the lock apart, and changed the combination.
As I write this, I now realize that I did not pass on the combination when I left the company. Oops.
Oh, Apple fanboi alert!
Compared to Apple? I presume? Absolutely.
Imagine an Apple search engine, where content is filtered beyond your control/knowledge and you can only experience the Internet as Apple thinks you should experience it. Total information freedom nightmare. No thanks!
You Apple sheeple can keep your "think different". Go buy another overpriced "ooh, shiny" iCrap tablet that Samsung will outperform in every possible way for a whole lot less.
Moron.
Someone flipped the switch to 'Magic', see the Jargon file entry.
Not necessarily. I worked at a place with a big red Push button that was not protected, was right beside the exit and, more importantly, right beside some equipment that I occasionally had to lean over to work on. The second time I tripped the power off, my boss warned me that one more time and I would be fired. The third time it went off, I was at my desk; I jumped up and screamed "NOT MY FAULT". The big red button was shortly thereafter covered by a flip-up plastic case.
Methinks more likely a failure of our heroic churnalism soviet. According to the attributed source: "Google.com was down for a few minutes between 23:52 and 23:57 BST on 16th August 2013." which fits perfectly with the lifted graph.
I suppose there's scope for some disparity as the fault propagated across Google's infrastructure but some reference to the obvious contradiction in the article is surely warranted.
In lieu of the Reg headstone icon which seems to have been removed for our protection -->
The artificial singularity that powers the Googleplex creates a non-negligible effect on spacetime around Mountain View. The graph shows the time that their servers perceive. Since the singularity slows down time, it took an extra 15 minutes after the event began in the outside world before Google's servers registered it. The stalwart team of boffins at Vulture Central merely corrected Google Coordinated Universal Time to regular Pacific Daylight Time.
The US Army and Air Force use ferrets for cable runs at Site R and Cheyenne Mountain. Some Airman at Cheyenne Mountain AFS came up with the idea after watching his pet ferret drag a loose CAT5 cable through a cardboard tube while trying to think up a way to do cable runs easier than how they'd been doing them previously.
I believe FEMA uses them at Mount Weather too; I'm not 100% sure on that facility, but it would make sense. Tearing out walls in bunkers under mountains isn't cheap or easy, and the military, as well as DHS, tend to prefer cheap and easy, especially in places like Raven Rock and Mount Weather, which by their nature have to be up and ready 99.9% of the time, just in case.
So anyway NSA using hamsters may be closer to something "fo' reals" than you might think.
No hamster icon, but Paris is about as intelligent as a small rodent, and much less intelligent than the Mustelidae (ferrets, weasels, etc).
Excellent point about DNS... you "think" it was affected... did you experience any DNS disruption directly? Or have you seen any data supporting this?
Just out of curiosity, why choose Google for DNS rather than OpenDNS, Cisco or whatever? Doesn't the Googleplex know enough of your business?
@AC 03:57
> Just out of curiosity, why choose Google for DNS rather than OpenDNS, Cisco or whatever?
Are you saying that Cisco offer a DNS service? If so, could you please post the IP address, as searching - unsurprisingly - brings up lots of links on how to configure a router to be a DNS server?
@AC 5:42
171.70.168.183
171.69.2.133
128.107.241.185
64.102.255.44
Perhaps ironically under the circumstances I had to Google it too. I've settled on OpenDNS myself, not least because they were the only service I saw competently and promptly address that phishing/poisoning débâcle a few years ago. The redirection for unresolvable queries is a bit naff though. Still, gifthorses...
Found a pretty comprehensive list here: http://wikileaks.org/wiki/Alternative_DNS
Anyone any idea what might have prompted the downvote?
OpenDNS is not really a good thing to use for a server that needs to know whether a hostname is valid or not. OpenDNS will reply with a fake address that points to their servers for invalid hostnames. That's fine if you want a special "this hostname doesn't exist" web page, but for a mail server, not knowing that a hostname is invalid is a waste of system resources... NXDOMAIN is the better response.
Google DNS is fast, though using resolver.qwest.net is faster at the moment.
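A quick way to check whether your resolver does that sort of NXDOMAIN hijacking is to look up a hostname that cannot possibly exist and see whether you still get an answer. A minimal Python sketch, with the resolver injected as a callable so the check can be exercised offline (the bogus hostname in the comment is made up for illustration):

```python
import socket

def resolver_hijacks_nxdomain(hostname, resolve=socket.gethostbyname):
    """Return True if `resolve` hands back an address for a hostname
    that should not exist, i.e. it swallows NXDOMAIN."""
    try:
        resolve(hostname)  # a well-behaved resolver raises here
    except socket.gaierror:
        return False       # proper NXDOMAIN-style failure
    return True            # got an answer for a bogus name: hijacked

# Live use (needs network); pick a name virtually guaranteed not to exist:
#   resolver_hijacks_nxdomain("no-such-host-qzx17.example.invalid")
```

Injecting the resolver also makes it easy to test mail-server logic against both behaviours without touching the network.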
> did you experience any DNS disruption directly?
I did. Across a wide range of websites. El Reg being one.
And strange things as well. El Reg loaded, but only the bare bones HTML. Looked like 1996.
I was getting 301 Moved Permanently messages on about half the websites I visited this morning, or automatic "Moved to Here" on-the-fly links.
But it was completely inconsistent.
Whatever happened, and I think it was something external to Google, was very significant.
Everything seems back to normal, but something major sure as hell happened.
The sites not rendering properly would probably have been due to JavaScript errors because they were dependent on Google-hosted JS.
As for the odd redirects, could that have been because the sites were hosted in some way on Google services, Blogspot, etc?
The other thing I should add here is that a user on another site was getting 60% packet loss pinging Google during the outage over IPv4, but had perfect connectivity over IPv6.
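For anyone wanting to quantify that kind of packet loss themselves, here's a minimal Python sketch that parses the summary line of `ping -c` output; the sample output in the comment is made up, and real use would run `ping` via `subprocess` as noted:

```python
import re

def packet_loss_percent(ping_output):
    """Extract the packet-loss percentage from a ping summary line,
    e.g. '10 packets transmitted, 4 received, 60% packet loss, ...'."""
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))

# Live use (needs network):
#   import subprocess
#   out = subprocess.run(["ping", "-c", "10", "google.com"],
#                        capture_output=True, text=True).stdout
#   print(packet_loss_percent(out))
```

Running the same loop against an IPv4 and an IPv6 target would make the v4/v6 disparity above easy to confirm.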
Reminds me of a cartoonish jigsaw puzzle that's an old fave of mine. It's called "Computers: The Inside Story" and features a minicomputer (bear with me - the puzzle IINM dates back some 30 years or so). Most of the joke was all the funny things that went on "inside" the minicomputer, but up top was the computer's responses to an unstated question. It isn't long before you realize the query was, "Why did the chicken cross the road?"
I agree with Cliff...
I use the ubiquitous 8.8.8.8 and 8.8.4.4 (?) and if DNS fails then pretty much all web/ftp/rdp/name-your-service-here fails unless you are using direct IP addresses or local hosts file entries.
As big as the Chocolate Factory is (and it IS big), I think the DNS service is central to whether people perceive the internet as working or not.
Still 2 minutes... bad Google, bad Google!
I believe that is 4.4.4.4. I have seen this all over the place as a DNS forwarder, or in individual workstation settings at sites where Active Directory refuses to work properly. ISTR that used to be owned by MCI/UUnet (or some other equally obscure provider) and is now Level 3.
Google turned it on again? I thought Genesis did that!
"The Reg contacted the folks in Mountain View to see if they can account for the outage, but a spokesperson only directed us to the aforementioned dashboard. We'll fill you in with any further information as it emerges."
Don't hold your breath, though.
To help prevent inadvertent collection of personal data, as reported in earlier stories; the NSA and Google are proud to unveil the new Fast Updatable Collection Keystone Utilities (FUCKU) system which went online at 21:53 Friday, August 16th.
Fully integrating the new system required a brief restart of ~~the Internet~~ our systems. We do not anticipate further interruptions of service.
I agree. This length of time is probably about the time required for a skilled spook to install new hardware at Google.
If you have a choice between buying a product or service from the USA and somewhere else of comparable quality, choose somewhere else. Hitting the entire USA in the wallet is the only way to stop this crap.
Yeah, I was wondering the same thing. If you're providing for a huge amount of traffic there is still going to be a large amount of capacity that is semi-redundant, to account for load sharing and redundancy in the case of an outage. While Google might own large amounts of its own fibre, there are large tracts of fibre that it still doesn't own, which would show across the public networks.
Would love to know the amounts of public and private bandwidth they have and saturate :D
What I'm impressed by is that everything seems to have run perfectly once Google came back to life.
What do engineers/admins of huge systems like this think? I would have expected load balancers etc. to have gone out of whack after receiving normal traffic, zero traffic, then 50% above normal, in the space of 5 minutes. That strikes me as the perfect recipe for the cascade failure we've heard so much about of late.
If you enjoy providing personal data to a multi-billion dollar international advertising machine, follow your bliss. Who am I to argue, Muppet?
I *know* what this network is for, I helped build it. And the gootards aren't allowed on my portion of it.
(Apologies for the paraphrase, Russ ... http://www.eyrie.org/~eagle/writing/rant.html )
"Any luck on getting the publication ban on your work on the Manhattan Project lifted yet, Jake?"
I'm more interested in the story of how he stole fire from the gods to give to us. Is there anything this man hasn't done? More importantly, why does he insist on telling us about it all the time? Still, I suppose a 13 year old in rural Cornwall has to find something to do ...
Steve wasn't a close, personal friend. He was a good neighbor.
I didn't build google. I did mentor their founders. Sadly, I failed. The twats.
I did help work out how to transfer the existing NCP ARPANet to the existing TCP/IP network, though ... and my system has been up, running & available since Flag Day.
I'm only in my mid 50s, the Manhattan Project was before my time.
Funny enough it had perfect timing for me
I was watching a video where a guy was talking about Nintendo and showing his TV while using a Wii U, and he said "Now before Nintendo gets my video pulled I have to give out a dis..." - and the video died.
I was like, wow, that's odd; hit refresh, got "server not found", and thought, damn, that was some DMCA takedown.
It will be interesting to see how Google explain this event.
It is difficult to think up reasons for the outage that don't put dents in Google's claims of being reliable enough to trust one's entire business to. After all, if you've trusted your entire business to Google's cloud (Docs, mail, everything) then when Google are down there's nothing you can do; you're not working. There's not even a phone number you can call.
At least if you have your own IT you can go and harry the IT guys.
Companies are very bad at risk management. It always seems that they refuse to consider highly unlikely scenarios that have devastating consequences. For instance, how many outfits are there that have all their IT in a cloud and have an effective Plan B up their sleeve just in case? Companies like Google are highly unlikely to go offline completely for a long stretch, but if all your IT is Googlised and they do vanish for a few days, your business is guaranteed to be in deep trouble.
So what exactly would a good Plan B be? There's no easy way to start using another cloud because there is no way to do a bulk export of everything (docs, calendars, contacts, sheets and mail, etc) that you can bulk import into another cloud. In fact such a thing would be the very last thing that Google, Microsoft, etc. would want to give you. I know that you can get at the data piecemeal, but file by file and user by user exports and imports is no way to perform disaster recovery.
Synchronising a cloud with your own IT is more like it, but surely the whole point of a cloud is to avoid having your own IT. Such synchronisation is available only because the cloud providers offer it as a way to get going with a cloud; I don't expect that it will be something that will work reliably and well forever.
And if you're going to have your own IT then what exactly is the cloud for anyway? Backup?
To me, and presumably anyone else that cares about coping with the ultimate what-if problems, clouds just don't meet the requirements. However, with the likes of Microsoft, Apple and Google trying very hard to push their customers onto their respective clouds, and a large fraction of those customers being happy (or stupid) enough to go along with that, what choice will there be for those that want to do things on their own IT?
Clouds also bring big national risks. Say Google got to the position where 50% of American companies were wholly dependent on Google's cloud for their docs, sheets, contacts databases, etc. That would mean that 50% of the US economy is just one single hack attack away from difficulty and possibly disaster. Is that a healthy position for a national economy to be in? Isn't that a huge big juicy target for a belligerent foe, be they an individual or nation state? After all, Google's networks have been penetrated before (they blamed the Chinese as it happens); why not again?
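On the Plan B point above, even a piecemeal export is only useful if you can verify it came down intact. Here's a minimal Python sketch that checks a local mirror against a manifest of expected checksums; the manifest format (relative path → SHA-256 hex digest) is made up for illustration, not any provider's export format:

```python
import hashlib
from pathlib import Path

def verify_mirror(root, manifest):
    """Compare files under `root` against a {relative_path: sha256_hex}
    manifest. Returns the sorted list of paths that are missing or
    whose contents differ from the expected digest."""
    bad = []
    for rel_path, expected in manifest.items():
        path = Path(root) / rel_path
        if not path.is_file():
            bad.append(rel_path)      # file never made it down
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest != expected:
            bad.append(rel_path)      # file is present but corrupt/stale
    return sorted(bad)
```

Run regularly, a check like this at least tells you whether your "backup" would actually survive the cloud vanishing for a few days.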
"Companies are very bad at risk management. It always seems that they refuse to consider highly unlikely scenarios that have devastating consequences. "
I agree, but I think it's more a fixation on planning for the last disaster, not the next. In a way similar to airline safety: all the checks are there to prevent the last hijacking/bombing, not the next.
"Companies are very bad at risk management. It always seems that they refuse to consider highly unlikely scenarios that have devastating consequences. "
I used to work for a large British company.
One of their Manchester offices was damaged by an IRA bomb in the '90s.
The staff were relocated and the servers replaced; but whilst the backups had been completed diligently and kept safe in the firesafe, no one was allowed access to the site to retrieve them for many weeks, by which time they were virtually useless....
I guess no one had thought of off site backups....
During my PhD, back in the late '80s/early '90s (so before anything net other than email and Usenet), my thesis was stored on 3.5" 'floppies' (1.44MB eventually). I had three sets:
A daily working set (didn't always have my own computer with a hard drive)
A travelling backup set that was updated daily and lived in my backpack (in a plastic disc box)
A home set that came in once a week to be updated.
The lab postdoc told of a guy, back before computers were available for such tasks, who gave his handwritten thesis manuscript to a typist to type up, as was common practice. She put it on the back of her moped and set off across town. When she got there only a few pages were left. This was my motivation for backing stuff up. As well as an incident during my honours year (we were the first year to use computers to produce our theses). I took a 400k disc out of a computer, put it in my lab coat pocket and demonstrated a physiology lab. When I went back it would not work. Fortunately I had a backup, but I lost a morning's work.
Floppies bit me too; ended up having to get the bus home, copy the files onto another disk, then bus back into town and just made the hand-in!
After that I got into the habit of emailing my NTHell world account and hoping Eudora would pull it down before I busted my mailbox limit (or the dial up connection dropped) :(
For my final year I got into the habit of emailing my final year project to myself every time I was about to shut my laptop down - came in rather handy when I deleted a completed section and didn't notice for a week, and when Office decided it was going to corrupt the document because I'd had the audacity to edit it in both Office XP and Office 2003.
One copy on my laptop - a more often than daily copy in my Google Mail account* - and then Eudora pulling those to my desktop at home, and back to my laptop as I went along :)
* handily activated just in time for my final year to start.
@Tony Green:
My thoughts exactly, and I knew such a comment would get wisearse replies about using Google and understanding timezones...
Way to miss the point, guys.
No offence to our West Coast friends, but the time should at least also have been displayed in .co.UK time!
This post has been deleted by its author
El Reg have a very clear, years-old policy that all articles are published based on the conventions of the country in which it was written. In this case, it's clearly stated it's the San Francisco office issuing this article, so PST, and US English.
It's similar for their Australian office.
They don't have the personnel to convert every single article to make it sound like it was written in London - especially not at 1am GMT on a Saturday morning!
I thought my net went down, until I noticed IRC was still chugging along as normal. Then I thought it was Virgin Media DNS as, by the time I entered Google's DNS in my system instead, it appeared to work, so I blamed Virgin Media, my bad :)
Only found out it was google today by reading this.
Murphy was an Optimist.
I've seen lots of presentations by Google about how they design for reliability and test failure; It will be interesting to see the Major Incident report for this and what lessons can be learned.
Invent a fool-proof system, someone will hire a better fool.
According to http://www.google.com/appsstatus#hl=en&v=status&ts=1376739247223 most services were down for 11 minutes.
But did Google Search or their numerous domains https://en.wikipedia.org/wiki/List_of_Google_domains also go dark?
I also find it strange that 40% of world wide internet traffic would be affected by that.
> I also find it strange that 40% of world wide internet traffic would be affected by that.
There are plenty of people who use google as their address bar. Instead of going to facebook.com they just type facebook (or more likely 'f' and it gets autocompleted to facebook) and then click on the first result in the corresponding Google search. For them Google down = Internet down.
I am guilty of doing this with some sites myself. If you don't remember the URL, the default behavior is to do a search for the site. From then on, the first autocomplete result in the address bar will be the search for the site instead of the site URL, so it's pretty much a self-perpetuating behavior. With search results being displayed as fast as they usually are, there are no incentives to modify such behavior.
Bookmarks/favorites? Never heard of them....
But that is only HTTP and HTTPS traffic, which accounts for only a small fraction of the overall internet traffic.
According to the latest surveys, in N America BitTorrent tops upstream traffic at 33%, while Netflix tops downstream traffic at 40%.
Firstly, it wasn't all internet traffic, it was only pageviews.
Consider the number of websites with a dependency on some form of Google service - e.g. Google's analytics, tag services, syndication/ads, APIs, and all their other JavaScript - and that browser rendering may stall when such a component is unavailable.
So people were sat waiting for pages to load.
Sounds like one hell of a multiple site failover, assuming they've managed to completely automate it (that's the hard bit, BTW). It can be done (and maybe was done deliberately, as part of a drill). I hope this wasn't caused by a single point of failure; there simply shouldn't be one.
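For a sense of what the automated part involves, here's a minimal Python sketch of the core decision: pick the first healthy site from an ordered failover list. The site names are made up, and the health probe is injected as a callable (in real life it would be a health-check RPC with a timeout):

```python
def pick_site(sites, is_healthy):
    """Return the first site in priority order that passes its health
    probe, or None if every site is down (the nightmare scenario)."""
    for site in sites:
        try:
            if is_healthy(site):
                return site
        except Exception:
            continue  # a probe that blows up counts as unhealthy
    return None
```

The hard bit the comment alludes to isn't this loop; it's making the probes fast and trustworthy enough that the failover can be triggered without a human in the loop.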
Or how about the fact that (whatever the cause of it going down, which of course it shouldn't have) the whole of the Google infrastructure (i.e. something dealing with 40% of the world's traffic) came back after only a few minutes' downtime? I would have thought that was a pretty impressive feat.
I mean ever.
But I can well believe the 40% of traffic - Youtube alone is probably most of that.
On top of that, a lot of applications poke Google.com to determine if they've got Internet access or not - because they are quite simply the world-wide server farm that's least likely to have gone offline.
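That kind of reachability poke can be as simple as a TCP connect with a short timeout. A minimal Python sketch; the default target (a public DNS server on port 53) is just a common convention, and the connect function is injected so the logic can be exercised offline:

```python
import socket

def internet_up(host="8.8.8.8", port=53, timeout=2.0,
                connect=socket.create_connection):
    """Best-effort reachability check: can we open a TCP connection
    to a well-known, rarely-down endpoint within `timeout` seconds?"""
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        return False   # unreachable, refused, or timed out
    conn.close()
    return True
```

Which is exactly why so many applications pointed at Google for this: the check is only as good as the assumption that the target never goes down.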
Along with every other commentard, I would really, really like to know what happened - and how they fixed it so fast. Most of the other "cloudy" services don't appear to have even realised they're down in the time it took Google to bring it back up.
I was trying to use a couple sites that use Google's doubleclick ad network and the pages would not load. At all. I know Google rarely goes down but it seems incredibly stupid to me that Google's ad network holds its customers completely hostage and prevents them from loading at all if it can't be found.
More likely bad site design; I know I would never build anything that relied on outside connections to load...
And for that purpose part of my testing is usually: kill the internet on the test server, then try the site, see how it works, and ensure I have no external dependencies remaining... (sure, SOME things I offload onto cloud storage, such as media etc., since their content delivery networks are better than a single server)
Any relation to Microsoft Live/Outlook being down all morning already? I'm not a mobile user, but my provider is in Indonesia, though.
Outlook
There's a problem with Outlook right now.
Details
Problem: A small percentage of mobile users may experience intermittent issues while syncing emails. (August 18, 1:36 AM)
To be honest, I didn't notice. I'm one of those old-fashioned people who still stick to AltaVista. I have no G*-whatsoever account - no Gmail, no YouTube, no Gdocs - no nothing. I'd rather be offline than using Chrome. Still, my world hasn't come to an end yet. Go figure.