IT Pro confession: How I helped in the BIGGEST DDoS OF ALL TIME

I contributed to the massive DDoS attack against Spamhaus. What flowed through my network wasn't huge (it averaged 500Kbit/sec), but it contributed. This occurred because I made a simple configuration error when setting up a DNS server; it's fixed now, so let's do an autopsy. The problem I should start off by apologizing to …

COMMENTS

This topic is closed for new posts.


    1. Trevor_Pott Gold badge

      Re: Kessel Run?

      13 hours and change. In my defence, I was asleep for most of it...

  1. Anonymous Coward

    "The keen eye will notice two other flaws in my server design. The first is that BIND isn't chrooted. This is because the spywaredomains.zones file from malwaredomains isn't really designed with RedHat-based operating distros in mind. If you were to chroot bind you'd have to post-process the zone file to cope with the path differences."

    The paths are relative to the chroot, so say you chroot in /var/named, you could just copy the blockeddomain.hosts file to /var/named/etc/namedb/blockeddomain.hosts. No post-processing needed. Shared virtual hosting and fail2ban have nothing to do with this - chrooting BIND is there to make it harder to exploit bugs in BIND.
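
The path point above can be shown as a named.conf sketch. This is a hedged illustration: the /var/named chroot location and the blockeddomain.hosts filename come from the comment, and the zone name is a placeholder.

```
// Viewed from inside a chroot at /var/named, "file" paths resolve
// relative to the chroot. The stock malwaredomains path therefore
// keeps working once the zone file is copied to
// /var/named/etc/namedb/blockeddomain.hosts on the host system.
zone "blocked.example" {
    type master;
    file "/etc/namedb/blockeddomain.hosts";
};
```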

    "The second is that DNSSEC isn't enabled."

    One of the reasons why this is such an efficient traffic amplifier is that a DNSSEC signed zone can have a much larger response to a (small) query than an unsigned zone. Not saying that it isn't useful, but DNSSEC does require care, really wants modern software probably with rate limits (which are non-default build options / patches in most current implementations), and keeping track of development. There are exciting new opportunities to break your DNS with it too, of course.

    Since there were a few comments suggesting djbdns for inexperienced admins - oh $DEITY no, please....

    1. Trevor_Pott Gold badge

      The particular implementation of BIND + chroot utterly refused to look in the chroot directory for /etc/namedb, no matter how much tinkering I tried. I gave up eventually and left it. As for the shared virtual hosting and fail2ban comment, that is there because most of the "bugs in BIND" we might care about are exploits that work if you have managed to gain a remote console.

      SSH on an alternate port + fail2ban + not actually giving the information to anyone and having a very small user footprint means your chances of getting into the system to exploit BIND in that fashion are hella slim. There is always the remote possibility that you could use some sort of remote attack against BIND like that, but the chances are even smaller. In terms of the risk posed, I think I can get away with not chrooting the thing for the 2-3 months between initial roll-out of the service and the replacement of the unit with a CentOS6 box.

      At least on CentOS6 the bloody chroot works right and the malwaredomains zone works without post-processing the text file. I should also point out that the DNSSEC implementation set up in CentOS6 is actually pretty good.

  2. Anonymous Coward

    "edge scrubber"?

    I almost stopped reading there, but you lost me at honeypot. As far as I know, a honeypot is a place where you profile/catch attackers. Wikipedia says:

    "In computer terminology, a honeypot is a trap set to detect, deflect, or in some manner counteract attempts at unauthorized use of information systems. Generally it consists of a computer, data, or a network site that appears to be part of a network, but is actually isolated and monitored, and which seems to contain information or a resource of value to attackers."

    I don't think honeypot is the correct term to use when setting up a machine for the security of your own users. I like your article Trevor, regardless of your DNS "sin" - yes, it really is that bad. However, as a tech author, you should really use commonplace terminology. Since I'm not in the UK, I could be wrong; maybe that is how IT professionals talk over there.

    1. Trevor_Pott Gold badge

      Re: "edge scrubber"?

      Yes. A honeypot is indeed where you profile and catch attackers. Why would you be hitting the honeypot machine unless you're clicking on stupid things or you're an attacker? The honeypot allows me to catch not only attackers but stupid users. I would say that "redirecting a user to a honeypot machine that displays an error or educational message when they try visiting a site on the list, then logs the thing so I can find and LART someone" counts as a honeypot.

      As for edge scrubber, the system also does IDS and DPS. It scrubs my datastream. It lives on the edge of my network. What the hell would you call it?

      If it's a ship and it goes through the gate, you call it a gateship. You only call it a puddle jumper if you need something that sounds good on TV. It's an edge device, it scrubs my datastream. Should I call it a boysenberry?

      1. Adam JC
        Thumb Up

        Re: "edge scrubber"?

        Upvote for the 'puddle jumper' reference. :-)

        1. Fatman

          Re: "edge scrubber"?

          Add one more for me, too!

      2. Fatman

        Re: ...if you aren't clicking on stupid things

        That is what WROK PALCE does to prevent damagement from viewing pr0n sites.

        Boy, did some male execs scream when the female CEO got an email listing their attempt in real time to access a prohibited site. She often paid a visit to their office - unannounced!!!!

        IT got called some really nasty names; but, what the hell, we were only following ORDERS!!!!!

  3. Steve Knox

    Caching?

    If someone asks your server "where is www.google.com" a whole bunch of times then your server starts flooding google.com's DNS servers.

    Not if your server is set up to cache recursive results for a period of time (which I believe is the default). More likely the attackers are asking for:

    www1.google.com

    www2.google.com

    www3.google.com

    .....

    which would result in multiple lookups even if caching is enabled.

    Feel free to correct me if I'm wrong.

    1. Trevor_Pott Gold badge

      Re: Caching?

      Nope, you are 100% correct. If you are attacking properly that is exactly how you do it. (Actually, if it is the DNS for www.google.com you want to take down, you attack with 1.www.google.com and 2.www.google.com etc.) That said, I was a little out in the weeds on describing the attack as is, and the sysadmin blogs are supposed to be 600 words. Had to leave out some details somewhere. :)
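
The cache-bypass mechanic both posters describe can be demonstrated with a toy model. This is plain Python, not real DNS; the cached answer and the query names are placeholders.

```python
# Toy resolver: a cache keyed by query name, counting how many
# queries would be forwarded upstream to the authoritative server.
class ToyResolver:
    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def lookup(self, name):
        if name not in self.cache:
            self.upstream_queries += 1      # cache miss: would hit the authoritative server
            self.cache[name] = "192.0.2.1"  # placeholder answer
        return self.cache[name]

# Repeating one name is absorbed by the cache after the first query...
r = ToyResolver()
for _ in range(1000):
    r.lookup("www.google.com")
print(r.upstream_queries)   # 1

# ...but unique labels (1.www..., 2.www..., ...) all miss the cache.
r2 = ToyResolver()
for i in range(1000):
    r2.lookup(f"{i}.www.google.com")
print(r2.upstream_queries)  # 1000
```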

      1. koolholio
        Boffin

        Re: Caching?

        *interjects* You would need a method of applying an address answer limit... but then surely this could also be covered by:

        http://tools.ietf.org/html/rfc2827 or http://tools.ietf.org/html/bcp38

        It talks primarily about forged packets; I assume that means DNS spoofing, or is it related to cache poisoning? Is there a difference between the two?

        1. Trevor_Pott Gold badge

          Re: Caching?

          Network ingress filtering requires you be "part" of the wider internet, rather than merely the equivalent of a consumer with a fat pipe. We don't have access to BGP. We have no way of seeing, processing or acting upon the internet's wider routing table. Without this, the sort of ingress filtering discussed in those documents simply isn't possible.

          So what's left? Manually whitelisting, in iptables, the systems you want to allow to connect to your DNS? How's that work when some of those units are mobile? Users with dynamic residential IPs, connecting from hotels or even over mobile links? What we really need is a DNS server and client infrastructure that allows for authentication of clients before they can look things up. DNS + TLS if you will. It might be time to start building something internally similar to OpenDNS' infrastructure. I'll give it a thought.

          1. koolholio
            Boffin

            Re: Caching?

            Or, depending upon your network setup... you could implement this on a router/switch with iptables/netfilter (provided it supports the string match's --hex-string and --algo options) by matching inbound (usually UDP) packets with the recursion-desired flag set at a certain offset. I believe iptables/netfilter is included within most Linux and Unix distros. Zeroshell (a Linux-based router distro) may even allow you to enter raw commands to utilise this.

            Wireshark is useful for finding the offset and the DNS query flags, which give you the hex string you wish to filter for... you may also apply a rate limiter using the same patterns, with the rate set respectively.
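
Assuming plain IPv4/UDP framing (20-byte IP header plus 8-byte UDP header, putting the DNS flags at packet bytes 30-31), the idea might look like the rules below. This is a sketch, not a tested ruleset: the offsets, the `|0100|` flags pattern (standard query with recursion desired) and the rate numbers are assumptions to verify with Wireshark first.

```shell
# Accept recursive-looking queries only up to a modest rate...
iptables -A INPUT -p udp --dport 53 \
  -m string --algo bm --from 30 --to 32 --hex-string '|0100|' \
  -m limit --limit 10/second --limit-burst 20 -j ACCEPT
# ...and drop the excess.
iptables -A INPUT -p udp --dport 53 \
  -m string --algo bm --from 30 --to 32 --hex-string '|0100|' -j DROP
```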

    2. Alan Thompson
      Stop

      Re: Caching?

      Not quite right -

      ----snip----

      If someone asks your server "where is www.google.com" a whole bunch of times then your server starts flooding google.com's DNS servers.

      ----snip----

      The correct statement is:

      If someone asks your server "where is www.google.com" a whole bunch of times while spoofing a source address of [one of spamhaus's external IPs] then your server starts flooding spamhaus's external IP address with large DNS replies. Local caching means nothing.

      Then spamhaus blacklists your IP address

      Then all of your email firewall's requests to spamhaus start being blocked

      Then you can't evaluate incoming email traffic against spamhaus' database

      Then you start letting spam in.

      THEN since this is a DDOS attack from many improperly configured DNS servers, spamhaus' servers go offline.

      This is a DNS amplification attack because small amounts of DNS specific traffic from one group of attackers to a single DNS server results in large amounts of traffic to the victim.
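
The economics of "small query in, large reply out" can be put in rough numbers. The byte counts below are illustrative assumptions, not measurements; real figures depend on the zone and EDNS0 settings.

```python
# Back-of-envelope DNS amplification arithmetic.
query_bytes = 64        # small spoofed UDP query
response_bytes = 3000   # large (e.g. DNSSEC-signed) response
amplification = response_bytes / query_bytes
print(amplification)    # 46.875

# A modest attacker budget turns into much more traffic at the victim:
attacker_mbit = 1.0
victim_mbit = attacker_mbit * amplification
print(victim_mbit)      # 46.875 Mbit/s aimed at the spoofed source address
```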

      1. Trevor_Pott Gold badge

        Re: Caching?

        That is one potential variant of the attack, yes. It is not the only one. There are a few others too. Oh DDOSes, so many of you out there!

      2. This post has been deleted by its author

      3. koolholio
        WTF?

        Re: Caching?

        So are you saying that TTL, expiry and any cache (including an EDNS0 cache timeout) are redundant and have no effect? If that is the case... caches may as well not exist...

        If that were the case, I'd also think a cached response shouldn't have its own flag assigned to it.

  4. Sixtysix
    Pint

    Top banana

    A post that can be logged as CPD after reading.

    Win win - virtual pint on me.

    Cheers

  5. Anonymous Coward

    Open DNS fishing

    I noticed small bursts of outsiders attempting recursive queries from 12 March. All were rejected as 'outside'. The bursts grew in size until 23rd March then stabilised until 28th. Not much seen since.

  6. Justin Clements

    Losing Sleep?

    To be honest, I wouldn't lose sleep over accidentally running a DDoS on Spamhaus. Everyone in the industry has been frustrated by them at some point in the past, and frankly, they are pretty much getting what they deserved.

    There are plenty of other organisations who provide the same service but with less attitude.

    1. leexgx
      Happy

      Re: Losing Sleep?

      spam supporter detected

      1. leexgx
        Megaphone

        Re: Losing Sleep?

        email would be practically useless without Spamhaus and the email servers that use their lists

  7. Alan Thompson
    Alert

    Publish External DNS to Your ISP - Maintain Local Control

    Whenever I set up a new network/DNS zone, one of the first things I do is to configure the external version of the zone as MASTER on the edge DNS server (similar to your scrubber). However, my ACLs prevent external access from the Internet to DNS except by my ISP's DNS servers. I then configure (or request configuration - if the ISP is still in the dark ages) the zone on the ISP DNS servers as SLAVE zones with matching SLAVE entries on my MASTER. The domain's ICANN registered servers are then configured as the ISP's DNS servers. This serves several purposes:

    1) All external DNS requests go to the ISP's "properly configured", high-throughput DNS servers

    2) If my edge server needs to go down for maintenance it doesn't take external DNS offline.

    3) The network admin maintains operational control of the domain and can do all the updates locally on the edge server

    4) The edge DNS server's IP address is never published as a DNS server for the domain

    5) The edge DNS server only handles zone transfers/updates to the ISP's DNS servers while maintaining its MASTER status.

    6) Edge devices on the local network can do local-external and recursive lookups on the ISP's DNS servers while internal devices use internal DNS servers (especially when using private addressing).

    I ALWAYS use a completely separate set of internal DNS servers and MASTER/SLAVE zones for internal authoritative access and recursive lookups - which also gives me the ability to blacklist bad domains there.

  8. Alan Brown Silver badge
    WTF?

    oh for fuck's sake.

    The "fix" is easy.

    In the general options you set "allow-query { localnets; };" (plus any networks you think should be allowed to make general recursive queries).

    Then in each zone you add "allow-query { any; };".

    Problem solved. You won't send answers for domains you're not authoritative for, except to explicitly defined networks.

    It's not fucking rocket science, it's not hard and above all, it was what I was recommending 15 years ago to keep leeches off of DNS servers. You don't need DNSSEC or any of the other bullshit to reduce the nuisance factor of an open DNS server.

    Additional hacks to rate limit responses have been published. These and DNSSEC help a bit, but not as much as the simple (in most cases 1 line) config change above.
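
In BIND 9 the option is spelled allow-query, with a hyphen. A minimal sketch of the layout described above (example.com and the filename are placeholders):

```
options {
    allow-query { localnets; };   // general (recursive) service: local nets only
};

zone "example.com" {
    type master;
    file "example.com.zone";
    allow-query { any; };         // authoritative answers: anyone may ask
};
```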

  9. Anonymous Coward

    Pleb

    Pleb goes in all fields

  10. taxman
    Big Brother

    And from Preventia:

    Whilst I appreciate this is a known problem, it seems to be an area of increasing risk: only in the last few days there was the largest DDoS attack to date. This 300 Gbit/s attack is making ‘the internet’ shake. The internet will fail before Prolexic would (we are a virtual second internet for our customers). With 800 Gbit/s of attack bandwidth available (and in the process of tripling that), Prolexic is the ONLY service/solution/product in the world that could handle an attack of this scale and duration.

  11. John Smith 19 Gold badge
    Unhappy

    I wonder if most other servers were as "badly" configured as Trevor's?

    Actual human being alerted to suspicious behavior

    Auto throttling of bandwidth cutting in even before a human response.

    I think if they had been, the answer would be "quite a lot better than what actually happened."

    Hopefully this will have given various sysadmins a wake up call to review their configurations and tighten up their procedures ( Unless the proverbial PHB puts their foot down and insists it cannot be changed because it would inconvenience the CEO)

    This presumes some of them even realized they were involved of course.

    1. Alan Brown Silver badge

      Re: I wonder if most other servers were as "badly" configured as Trevor's?

      Yes they are. Wide open is the default setting for Bind. Even DJB and MS were wide open last time I looked.

      It's the same mentality which STILL defines any DNS entries in zonefiles with zero padding as octal, despite the RFC explicitly stating that IPv4 addresses are dotted decimals. I got royally flamed when I pointed that particular "issue" out 18 years ago and asked that the RFC or the software be altered for consistency (given they were written by the same person, it didn't seem to be an unreasonable request). Not long after that, spammers started using dotted and long hex/binary/octal/decimal URLs in spam (It took filter authors to nail that down. Bind is still open to that abuse)

      1. John Smith 19 Gold badge
        Meh

        Re: I wonder if most other servers were as "badly" configured as Trevor's?

        "Yes they are. Wide open is the default setting for Bind. Even DJB and MS were wide open last time I looked"

        I was thinking of the human alerting on exceptional behavior, and the auto-throttling until the cause was investigated.

        That was part of his configuration.

  12. Anonymous Coward

    one absolute solution

    Using the Open Resolver Project's statistics: 25 million copies of this, please:

    http://www.cloudshield.com/solutions/SP_DNS_Protection.asp

  13. koolholio
    Boffin

    Possible solutions for the opensource community

    http://www.ntop.org/products/ndpi/

  14. koolholio
    Boffin

    Possible to detect and monitor... but not so easy to filter out

    You can capture just DNS requests from a DNS server itself using a capture filter, such as this one:

    "<CONNECTIONTYPE> host <GATEWAYMAC> and (src net <LOCALNET/CIDR> or not src net <LOCALNET/CIDR>) and udp port 53" (optionally omitting "udp" and changing the port if configured differently)

    of course you can specify destinations respectively, if you're doing this further upstream by using:

    host <IP> or net <IPRANGE/CIDR> or mask <netmask> if it's over multiple subnets

    Which will capture all requests and responses to and from... Here's where it gets difficult:

    You would just need to apply filters to this, using pattern matching for distinguishing characteristics but there may be need for utilising comparisons within the filters.
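
A concrete filter in the same spirit, shown as a sketch: eth0 and the 192.168.0.0/24 local net are placeholders, and udp[10] is the high byte of the DNS flags (so the 0x01 mask picks out queries with recursion desired), assuming a plain 20-byte IPv4 header.

```shell
# Capture recursive queries arriving from outside the local network.
tcpdump -n -i eth0 'udp dst port 53 and udp[10] & 0x01 != 0 and not src net 192.168.0.0/24'
```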

  15. Daniel B.
    Boffin

    So I'm not alone!

    Last year I decided to switch my DNS pointers from the hosting service I have (GoDaddy) to my own. Alas, I forgot that while ns2 had the "recursion disabled by default" setting, ns1 *didn't*.

    2 weeks later, I check out my bandwidth usage and notice that it's waaay off chart. Monkeying with iptables, I was able to pinpoint the extra traffic to port 53. Firing up tcpdump gave me a zillion DNS requests for some weird domain, which itself was pointing to CloudFlare as well. Ouch! I outright blocked port 53, sending it to DROP. I even switched the DNS order ... but I did notice that ns2 didn't get the zillion requests. So I went on checking and finally found out about both the open recursion configuration, and the default config switch. Even after securing my ns1 BIND, I still had to leave port 53 blocked on my main DNS 'till the request flood died out. It cost me a lot in bandwidth that month, but lesson learned...

  16. Emo

    Scrubber

    I'd be interested in some sort of Reg Tutorial on putting something low cost together and configured, even if only for home use.

  17. Glen Turner 666

    Old, old attack.

    It's not rocket science, I described the correct configuration for AusCERT back in 1999 in response to DDoS we were seeing then. (Modify the "bogon" list for the newer "end of IPv4, so let's use every Class A possible" list of bogon networks.) See AL-1999.004 at http://www.auscert.org.au/render.html?it=80
