When a DNS outage isn't an outrage

A little over a decade ago I registered my very first personal domain name. This domain was not registered for a client or an employer. This was a domain name all my own. When I picked my DNS provider I picked one who was affiliated with the local technology magazine, and I picked them because they were Canadian. It was the …

COMMENTS

This topic is closed for new posts.
  1. TonyHoyle

    Interesting advert

    Can you mark them in future with 'advertising feature' so I can avoid them?

    1. Ben Tasker

      Given that it relates to the (relatively) recent DDoS of EasyDNS I'm not sure I'd call it an advert. Besides which, it's actually useful to find out which providers are any cop especially when there's a good chance a customer might ask you.

      Trevor gets a lot of stick in comments from time to time, but being accused of advertising is a new one on me!

    2. vagabondo
      Facepalm

      Re: Interesting advert

      But not for the SysAdmin.

      Standard practice is to use multiple nameservers, on different networks, and at different locations. Then the major worry is not nameserver downtime, but hijacking.
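
      The "different networks" part of that practice can be sanity-checked with a rough sketch like the one below. It assumes you already have the nameserver addresses to hand; the hostnames and addresses are made-up documentation values, and treating "different /24" as "different network" is only a crude proxy that says nothing about physical location or upstream provider.

```python
import ipaddress


def networks_covered(ns_addresses, prefix=24):
    """Return the set of IPv4 networks (at the given prefix length)
    that the nameserver addresses fall into."""
    return {
        ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
        for addr in ns_addresses.values()
    }


def is_diverse(ns_addresses, prefix=24, minimum=2):
    """True if the nameservers span at least `minimum` distinct networks."""
    return len(networks_covered(ns_addresses, prefix)) >= minimum


# Hypothetical delegation: two servers share a /24, the third does not.
example = {
    "ns1.example.net": "192.0.2.10",
    "ns2.example.net": "192.0.2.20",
    "ns3.example.org": "198.51.100.5",
}
print(is_diverse(example))
```

      A check like this only catches the obvious case of every nameserver sitting in one address block; it cannot confirm that the servers are in different locations.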

      1. Lee Dowling Silver badge

        We need to coin a new word.

        Reg-vertisement?

        Sorry, but there was no need to name the DNS provider, especially after they were named only the other week too. Naming them did not add anything to the article, and a lot of articles on here are posted with words like "big name company" or "well-known DNS provider", etc.

        I don't read the adverts, Reg, whether you put them in the articles or on the side. The more effort I have to go to in order to avoid them, the less I'll visit.

        1. Anonymous Coward

          Advert or Recommendation?

          We see enough articles reporting on failures and outages. Why does a company that might have got it right have to remain anonymous?

          I'd be the first to shout about a lazily copied press release --- or just vote it down and turn away, but I have no doubt that Trevor is reporting his own experience.

          I was worried about all this "cloud" stuff, then I remembered network diagrams from before the WWW was invented, let alone *The* Cloud with, err, clouds in the middle. Then I felt embarrassed.

          1. Trevor_Pott Gold badge

            @Thad

            Own experience. I'm a customer; I pay them money. That's an odd form of advertising where you pay a company so you can write about them! That said, it could be the "next thing" in Apple journalism, so I should tread lightly...

            1. Anonymous Coward

              Not so fast!

              How many times do you see people who have paid to wear clothes with the name of the maker writ large on them? I always say that they would have to pay me if they want me to wear their name in the street!

              But, I have no doubt in the integrity of your article. I'm sure you have better things to do than to write advertising copy.

              Credit where credit is due, I say: It's good to see a company that has given good service getting that credit.

              1. Trevor_Pott Gold badge

                @Thad

                This isn't my first article where I decided "these people done good, let's tell the world about it." I hope it won't be my last. Kittens!

        2. Anonymous Coward

          This is called an advertorial

          It looks like an editorial, but is an ad.

          However, in this case I'd say it could've been written better to avoid the suspicion that this was an advertorial when it wasn't.

          :-)

    3. Trevor_Pott Gold badge

      @TonyHoyle

      Most companies I deal with suck. From the vendor and manufacturer of my microwave to my DNS provider. Our blame-the-victim culture combined with hiring the absolute bottom of the barrel for every possible position means that I only rarely encounter a company I find even remotely useful.

      When I do, they stick in my mind. So late at night, when I'm scrabbling around thinking "I really should write an article," every so often I decide to balance all the negative press we read in the world with something more upbeat. Tell a tale of being happy with your vendor/service provider for once.

      No advertising involved; just a weird personality quirk that dislikes "all negative all the time" in my morning newspaper.

  2. nichomach
    Thumb Up

    It may be a Canadian thing...

    ...since I use Sherweb for my hosted Exchange service. One single, sad, solitary ("Forever Alone?") mailbox, yet on the only two occasions (over some years) when I have observed an outage they have kept me apprised of progress at each stage via alternate means of contact and behaved with impeccable professionalism throughout.

  3. Anonymous Coward

    For the client end

    If you have a BT line, remarkably @BTCare has been a cracking service, responding quickly and sorting out engineers without any of the normal painful forms and phone calls.

    Up with this sort of thing.

  4. Coofer Cat

    Thin Margins

    I used to work for one of the "tech giants" (not mentioned here, but you know them). They used to work on tiny margins - each of their (aggregated) millions of users generated something like $10 a year in revenue, so it only takes one phone call and that user starts costing the company money. Hence, they try to dissuade users from phoning them. My point is that just because they own and run thousands upon thousands of servers doesn't mean they're making so much money they can offer the personal touch to every one of their millions of users.

    The likes of EasyDNS probably don't make a load off each domain they sell/operate either, but as a customer who pays them directly, they have a clearer relationship and responsibilities to you. If they have to charge you $11 instead of $10 to run a domain, then you're probably happy to pay the increase, knowing they're going to answer the phone to you.

    That said, post mortems are a significant weakness in IT generally, and the "tech giants" are no exception. In their defence, because they do run thousands of machines for millions of users, it's not always obvious what the exact problem was, or indeed what internal policies to change to mitigate the problem until days, or even weeks later. Still, one is left wondering if "better late than never" might be of benefit.

    As a micro-anecdote, yesterday at my current employer we had a dev box fail, knocking out a load of users. This is just one straightforward, self-contained machine with a handful of users who are across the office from us admins. Even now, we can't give a clear post mortem of what went wrong, and not for want of trying - we have genuinely spent some time looking into it. Sometimes even simplicity isn't a basis for being able to provide decent post mortems. Sometimes sh1t just happens.

    1. Ben Tasker

      It's a cost assessment as well

      Something I really struggle with (given that I have to know why something broke) is that sometimes you need to do such an in-depth post mortem that it may not be worth the cost of paying someone to do it.

      Take your outage: say it cost the business £50 (a made-up number). If it's going to take you hours to investigate thoroughly, then it may be better to accept that there's probably a small chance of it happening again (given you can't find anything too obvious) than to delve to the bottom.

      It's not something I've ever been any good at; if something breaks I need to know why. But it's something I've come across in my (soon to end!) current employment.

      1. Anonymous Coward

        I had a boss...

        Who used to interrupt my recovery work with "I need to know why this happened."

        I used to ask if it was not more important to get the thing working again.

        In my particular 40-50-user environment, with a small IT dept, it would have taken the sort of support contract that included sending off memory dumps for analysis. No, we did not have that kind of support. Nor would he have understood the answers if we had. Not sure I would have, to be honest!

        Of course, being a Unix site, this kind of stuff didn't happen often anyway.

    2. Bruno Girin

      Totally agree with you. However, keeping customers informed through a number of channels (blog, Twitter, etc) should be a good way to reduce the number of customer calls, shouldn't it? So why can't large companies do this properly? I would wager that's because they didn't spend the money on a disaster handling plan, and when sh1t happens, it's complete panic. It could be as simple as having a named member of staff in PR tasked with sending regular updates during an outage to ensure that fewer customers pick up the phone.

  5. petur
    Boffin

    smaller companies seem to handle customers better

    or still have the intention to do so.

    When I became a customer of my mobile provider, they were a small company with not too many customers, and support was great: I got a live person to chat/mail with, and issues got resolved in the blink of an eye.

    Now that they've grown quite a bit, they're starting to behave more and more like the other big providers: mails take a long time to get an answer, if any, and news/info gets out slowly.

    I guess many growing companies don't see that when their customer base goes x10 or more, maybe their support staff should follow a similar curve....

  6. Anonymous Coward

    Any company can have a golden period where just the right people fill every critical post. Then one day some clueless jobsworth takes over and moves all the people round. Some leave due to the new culture of ignorance being developed. The rest get on with their new role as some sort of customer liaison while looking back at the flailing idiots now doing all the critical stuff. Rant over.

  7. Chris Miller

    The real test of a good supplier

    Is how they handle problems (and there will always be problems). The poor ones hide under the bed and hope it will all go away before anyone notices (occasionally they may get lucky). The good ones say something like:

    (a) this is the problem;

    (b) this is what we're doing to fix it;

    (c) this is how long we expect it to take; and

    (d) here are 3 options we can follow while the problem lasts.

    A little later they should come back with:

    (e) this is what happened; and

    (f) this is what we're doing to prevent a recurrence.

    It isn't difficult, but it seems to be beyond the capacity of quite a few suppliers (both large and small).

  8. Anonymous Coward

    Small companies

    I've worked for a few small companies and as it's small a) the support team is too small to have low-level grunts who follow scripts - basically you're getting through to a 3rd line engineer straight away b) the support team can take a 10 second stroll over to the devs/technical director and get answers quickly if needed, or even c) developers will be covering support at times (eg one support guy and if he's on the phone it will roll over to the junior dev).

  9. Anonymous Coward

    A lot of the time you get what you pay for - customers want Ford pricing but expect Rolls Royce service. The guy earlier was right - some services we sell may make $10-20 per year so as soon as you get more than 1 problem you are losing money on that customer.

    However, we take the view that it's better for the customer and overall cheaper for us to ensure the service is reliable - win - win.

  10. Anonymous Coward

    "The real test of a good supplier"

    Just as long as you follow the same standards you set for others all is well.

  11. Lunatik

    Andrews & Arnold are great for this kind of feedback and preventative information.

    I can echo AC 09:52 above about @BTCare, they are pretty on the ball and have helped me in the past, even going as far as to chase me to check it had been resolved to my satisfaction.

    Unfortunately, the point made by AC 10:28 probably also applies here; the use of Twitter for support has been, in the main, the preserve of the technically literate. This probably won't last and eventually some bean counter will look at the cost of supporting no end of BTards via Twitter and just pull the plug.

  12. K
    Mushroom

    How much did they pay for this?

    Now since I've read your article, how about splitting the kickback with us???

    I've used many "Top" DNS services in my time, and I can tell you now they all have pros and cons, but the fact is that if you've configured DNS correctly with multiple servers, you should NEVER have any of these problems.

    We run several websites that use PPC advertising such as AdWords etc, where clicks can cost from £1 to £50, so even a single DNS failure for us can be expensive. So we use Multicast DNS providers. We used to be with UltraDNS, but a couple of years ago shifted to the Dyn.com Dynect service... yes, they charge a fortune, but it's a great service.

    1. Evan Essence
      Thumb Down

      You obviously didn't read the blog post linked to in the article: the problem was a DDOS attack against three anycast constellations. Don't tell me your provider would be able just to shrug that off.

  13. Anonymous Coward

    I'm already happy with a recorded message:

    "We royally screwed up somewhere, have no idea how to fix it, and probably will solve it in 24 hours".

    At least there is a recorded message.

    Especially from a cable company that, when it screws up, leaves entire neighborhoods in the dark. And all of them will call at the same time.

    At least I know I didn't trip over any cables at home.

    PS. In my case, the entire city, and other ISPs were out too, so it was not their fault.

    1. Kirbini
      Trollface

      One better

      I don't trust any of the bastards so we went one better and built our own multi-site *anycast* DNS (there's no such thing as multicast DNS). Multi-server clusters in geographically dispersed locations.

      Of course, we own the datacenters so it was really a no-brainer.

  14. Timmay
    Facepalm

    Is this news?

    I can see the headline now:

    Small Company Gives Better and More Personal Support Than Large Faceless Corporation Shocker!

  15. ElNumbre
    Thumb Up

    Lies, Damn Lies and Statistics.

    A++ for any supplier which puts its nads on the plinth when there is a fault - no bullplop is refreshing. I'd rather understand why something broke than just be told it broke.

    However, the worst thing in the world is when they just "make something up". We get that with one of the major telcos (or rather one of its incestuous divisions). I've lost track of the number of times I've seen a response of 'end-user equipment' when every link into an exchange has gone down. Or 'right when tested', even though there is massive noise on the line when previously there wasn't. And unfortunately, since said major telco got rid of its expensive 'old hands', it seems to be standard training for the new apprentices that it's acceptable to lie to customers, especially if a service-visit charge can be applied later down the line.

  16. Edward Ashley

    I Concur

    I have used easyDNS for many years; however, I am now moving to Amazon Route53. The recent outage from BT seemed to affect the easyDNS service, but not the Amazon service. I don't think it was easyDNS' fault, but when the boss is sitting at his computer saying he can get on everything else but not his site, I have to explain. Still, I second the fact that their support is really good and I would recommend them.

    1. Trevor_Pott Gold badge

      EasyDNS and Route53

      EasyDNS has set up something whereby you can easily mirror/move/etc your DNS to Amazon. I haven't tried it yet, but it seems to be their Next Big Thing in helping to prevent DDoS issues from affecting customers: helping them more easily run on multiple providers.

      Claims were made about a single interface that would change DNS on both services, but I cannot really vouch for it, as I have not yet played with it.

      1. arnon (easyDNS guy)

        Hi Trevor, I just noticed that and made a comment regarding it. Hope that's not considered inappropriate.

    2. arnon (easyDNS guy)

      In regards to the DDoS

      bias alert : I work for easyDNS. If this is inappropriate, I apologize and ask that it be deleted

      Hi Edward,

      A large part of why we got hit harder than others is that the folks doing the attack actually cached the IPs of the authoritative nameservers for the target domain after they started jumping from DNS provider to DNS provider. I couldn't possibly explain their reasoning (since it makes no sense), but that seems to be the case. I don't believe Amazon could have weathered the magnitude of the hits we took any better, but you have the right idea.

      Instead of moving AWAY from us, however, allow me to make a recommendation : use both.

      easyDNS provides a route53 interface that lets you continue to delegate your domain to both our anycast clusters and the Amazon ones, and automatically update the AWS ones through their API. If they get hit, remove them from the delegations. If we start getting hit, remove us. Either way, you have hot-swappable DDoS mitigating control.
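
      The "remove whoever is being hit" rule can be sketched as a small decision function. This is only an illustration of the logic described above, not the easyDNS or Route53 interface: the provider names, nameserver hostnames and the `delegation_for` helper are all hypothetical, and the fallback (keep everything if every provider is under attack, rather than delegate to nothing) is an added assumption.

```python
def delegation_for(providers, under_attack):
    """Pick the nameservers a domain should be delegated to.

    providers:    dict mapping provider name -> list of nameserver hostnames
    under_attack: set of provider names currently being hit

    Providers under attack are dropped from the delegation. If that would
    leave no nameservers at all, the full set is kept instead (assumption:
    a degraded delegation beats an empty one).
    """
    healthy = {
        name: hosts
        for name, hosts in providers.items()
        if name not in under_attack
    }
    chosen = healthy or providers
    return sorted(host for hosts in chosen.values() for host in hosts)


# Hypothetical two-provider setup.
providers = {
    "provider_a": ["ns1.a.example", "ns2.a.example"],
    "provider_b": ["ns1.b.example"],
}
print(delegation_for(providers, {"provider_a"}))
```

      In practice the delegation change still has to be made at the registry, and cached NS records keep pointing at the old set until their TTL expires, so the swap is not instantaneous.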

  17. Anonymous Cowherder
    Mushroom

    ETA!!!!

    If there is one thing likely to make me blow my stack it is "How long will it take?"

    Usually the answer is "I have no effin idea but one thing I can tell you is that standing there asking me stupid questions is only increasing the time it will take for me to identify the problem and then resolve it."

    If you want me to guarantee that it doesn't come crashing down again 30 seconds later, it will take a bit longer, as I have to check that the issue that is affecting you isn't merely a symptom of a bigger problem. This bit usually takes longer.

  18. nsld
    Paris Hilton

    Too much info can be dangerous

    One lesson we learned as a company was that over-communicating to customers during a DDOS attack helped the attackers vary the attack vectors as they followed our updates to see what we were doing.

    The problem with public facing social media is that it is public facing as opposed to just client facing so every bugger gets to watch you go down in flames, screaming all the way.

    And when asked for a fix time I reach for the arse of the magic elephant and pluck one out of there. It's not the same as Paul the octopus, but its predictions aren't too bad.

    1. arnon (easyDNS guy)

      I can completely understand that perspective. There's a difference, we feel, however, between letting clients know that things are being worked on and airing the private laundry, as it were. Frankly, we fell down on that on the last DDoS, and we're hoping to do better now. A lot of the feedback we got from those who follow Twitter was how happy they were about the constant flow, and from folks who were checking our blog how frustrated they were that they didn't know what was going on.
