Open-source 64-ish-bit serial number gen snafu sparks TLS security cert revoke runaround

A bunfight over a controversial UAE mobile security company led to the discovery that millions of TLS security certificates have been improperly issued – thanks to a dodgy default configuration in popular certificate authority (CA) management software. During a discussion on the mozilla.dev.security.policy group about …

  1. Roland6 Silver badge

    Confusion due to lax use of terminology in RFC?

    Section 4.1.2.2 of RFC5280 talks about certificate "Serial Numbers" not 'keys'. The article also confuses the two.

    Interestingly, reading both the RFC and X.520, no mention is made as to how the Serial Number is to be implemented. If I were writing in C, I would implement it as an unsigned integer, which is never negative. That is consistent with the statement: "CAs MUST force the serialNumber to be a non-negative integer." Thus all 64 bits are available... So, I disagree with Adam Caudill's summarisation.

    Hence the problem is that the open-source key-generation package, EJBCA, is non-compliant with RFC5280.

    But fundamentally, the problem is the imprecise language used in RFC5280...

    1. Jim Mitchell

      Re: Confusion due to lax use of terminology in RFC?

      I agree, this is confusing, and the article doesn't help. 1 is a perfectly valid randomly generated positive AND non-negative 64-bit integer, for example. Is that a problem, and why?

      1. diodesign (Written by Reg staff) Silver badge

        Re: Re: Confusion due to lax use of terminology in RFC?

        '1' is a perfectly valid cert serial number, yes. There is no problem with it. The problem is that no serial number would be generated with the top bit set, halving the number of available serial numbers and increasing the chance of collision.

        C.
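
        A minimal Python sketch of the point above, for illustration only. It is not taken from EJBCA; the function names `serial_63_bits` and `serial_64_bits` are hypothetical, and masking the top bit is just one assumed way of ending up with it always clear.

```python
import secrets


def serial_63_bits() -> int:
    # Clearing the top bit keeps the value non-negative when read as a
    # signed 64-bit integer, but only 63 bits remain random.
    return secrets.randbits(64) & (2**63 - 1)


def serial_64_bits() -> int:
    # All 64 bits come from the CSPRNG; roughly half of the results have
    # the top bit set and need a ninth octet once encoded as a signed INTEGER.
    return secrets.randbits(64)


if __name__ == "__main__":
    print(f"{serial_63_bits():016x}")
    print(f"{serial_64_bits():016x}")
```

        The first function can only ever produce 2^63 distinct values, half as many as the second.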

        1. G2

          Re: Confusion due to lax use of terminology in RFC?

          obligatory XKCD reference here: https://xkcd.com/221/

          =)

        2. Loyal Commenter Silver badge

          Re: Confusion due to lax use of terminology in RFC?

          To put this into perspective:

          Imagine you have a chess board (which conveniently happens to have 64 squares on it). You also have a bag of 64 pawns. Your job is to put a random number of those pawns onto any arrangement of squares on the board and not get the same arrangement as someone else.

          This flaw means that you can never put a pawn on the bottom right square. It has no impact on those other 63 squares. Embarrassing, yes, but it doesn't exactly 'dramatically reduce' the number of available combinations you have (It halves an already pretty big number, you still have almost 10^19 combinations to play with, which to put it in perspective is roughly the number of human cells in the population of swansea).

          1. Anonymous Coward
            Terminator

            Re: Huh?

            "which to put it in perspective is roughly the number of human cells in the population of swansea".

            I want to know *how* you know that specific fact about Swansea... no, come to think of it, not how, but *why*.

            Also, AFAIK those are nice people, so what are you planning?!

            1. Loyal Commenter Silver badge

              Re: Huh?

              I randomly picked a town where the population is about 250k. The number of cells in the human body is 30-40 trillion, and the population of Swansea is apparently 241,300. 2^63 is approx 9.2 x 10^18 so this works out pretty spot on if you take the number of cells in the human body as 38.2 trillion...

              So, if you took every single cell of every single human being in Swansea and allocated each a random 63 bit ID, you'd expect a 50/50 chance of getting a previously assigned number when you got to around the 120,000th person.

    2. diodesign (Written by Reg staff) Silver badge

      Re: Confusion due to lax use of terminology in RFC?

      To be clear, the problem is all about certificate serial numbers, and nothing to do with keys. I've cleared out any mention of keys to avoid any confusion.

      The issue is that serial number length must be at least 64-bits and a positive integer. To ensure this, the generation software was keeping the top bit clear, effectively reducing the default 64-bit integer to 63 bits.

      C.

      1. richm

        Re: Confusion due to lax use of terminology in RFC?

        Unfortunately your comment is still not quite right - the serial number must contain 64 bits of entropy, and be a positive integer. That actually allows a given serial number to be shorter since in some cases the random digits at the start will be 0. This matters because DER (the encoding used in certificates) requires that the shortest encoding be used, so if there are lots of leading zeros they're omitted. The fact that the serial numbers were all 8 bytes long (64 bits) rather than some being 9 bytes is how it's easy to see the issue is present.
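
        The DER length rule described above can be sketched in a few lines of Python. This is a hand-rolled illustration that assumes non-negative values and short-form lengths (enough for serial numbers of up to 20 octets); `der_integer` is a hypothetical helper, not a function from any ASN.1 library.

```python
def der_integer(value: int) -> bytes:
    """Encode a non-negative int as a DER INTEGER: tag, length, content."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    # Shortest two's-complement form: the value's bits plus one sign bit,
    # rounded up to whole octets.
    n_octets = max(1, (value.bit_length() + 8) // 8)
    content = value.to_bytes(n_octets, "big", signed=True)
    # Short-form length octet is sufficient for content under 128 octets.
    return bytes([0x02, len(content)]) + content


if __name__ == "__main__":
    for serial in (17, 256, 2**63 - 1, 2**63):
        encoded = der_integer(serial)
        print(serial, len(encoded) - 2, "content octets:", encoded.hex())
```

        A value below 2^63 fits in 8 content octets, while a 64-bit value with the top bit set needs a leading zero octet and so takes 9. Small values come out shorter still, which is why length alone does not prove how much entropy went in.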

        As you note, it's in some ways not a big deal at first glance - but then again that's exactly what people said about various timing attacks etc. that subsequently caused all sorts of problems. Also getting the details of the rules right is exactly the CA's job...

        1. diodesign (Written by Reg staff) Silver badge

          "Unfortunately your comment is still not quite right"

          Well, I'm trying to keep it simple here in the comments. Thanks for the extra info.

          C.

      2. Roland6 Silver badge

        Re: Confusion due to lax use of terminology in RFC?

        >The issue is that serial number length must be at least 64-bits and a positive integer.

        RFC5280 does not mandate a minimum length for the serial number, just a maximum length.

        Appendix C gives examples using serial numbers: 17, 18 and 256. The length of the serial number field in the certificates is 8, 8 and 16 bits respectively.

        In fact, given what we know now, example C.3 seems to have been given to show that the Serial Number field is signed, as 16 bits are used to encode the 8 bit value.

    3. Si 1

      Re: Confusion due to lax use of terminology in RFC?

      Yeah, I don’t understand this, why did they use a signed 64-bit integer when negative numbers are never used?

      1. Tom 7

        Re: Confusion due to lax use of terminology in RFC?

        If this is a certificate serial number then it's going to take a long time to actually run out of numbers. I will have to have a sleepless night to consider reading the RFC, but if it's like other serial numbers would it not be partitioned, so that the only chance of a collision is if some organisation used up all its number range and somehow didn't notice?

        1. Ben Tasker

          Re: Confusion due to lax use of terminology in RFC?

          No, you're thinking of something more akin to an incrementing serial there.

          So you might have

          ourcert-1

          ourcert-2

          ourcert-3

          etc.

          That's the approach that *used* to be in place. But, it has a number of issues. You cannot guarantee that a situation will never arise where you mistakenly use the same serial twice - for example, if your process crashes mid-issuance, and then another cert is issued, you may have a part-issued cert, and a fully issued cert (for someone else) sharing serial number "3" (or whatever). There are various possibilities in that area.

          It also potentially poses an issue if you're distributing your issuance system globally, though that's more easily addressable by inserting a region into the serial.

          What the RFC requires is that CAs include a minimum of 64 bits of entropy in the serial (some CAs weren't affected by this issue because they were already using more). The serial can be more than just that entropy though, so you might choose to keep your increment and append the entropy to the end:

          ourcert-1-xx:xx:xx.....

          ourcert-2-xx:xx:xx.....

          Now, if you have the same issue with a crash (or whatever) you may get two certs allocated "3" but the likelihood of their serials being identical is incredibly small.
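
          As a rough Python sketch of the "increment plus entropy" layout described above: `make_serial` and its parameters are hypothetical names, and the bit layout (counter in the high bits, random part in the low bits) is an assumption chosen for illustration, not any CA's actual scheme.

```python
import secrets

MAX_SERIAL_BITS = 20 * 8 - 1  # 20 octets, minus the sign bit of the INTEGER


def make_serial(counter: int, entropy_bits: int = 64) -> int:
    """Combine an incrementing counter with fresh CSPRNG output."""
    if counter < 1:
        raise ValueError("counter must be at least 1 so the serial is positive")
    random_part = secrets.randbits(entropy_bits)
    # Counter goes in the high bits, entropy in the low bits.
    serial = (counter << entropy_bits) | random_part
    if serial.bit_length() > MAX_SERIAL_BITS:
        raise ValueError("serial would not fit in 20 octets")
    return serial


if __name__ == "__main__":
    # Two certificates that both ended up with counter value 3 after a crash.
    first, second = make_serial(3), make_serial(3)
    print(f"{first:x}")
    print(f"{second:x}")
```

          Even when the counter repeats, the two serials only collide if all 64 random bits also match.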

          > if its like other serial number would it not be partitioned

          There's no partitioning, no; there aren't ranges allocated out like with (say) MAC addresses. It's literally the output of an RNG. The issue here is that they reduced the namespace by forcing one bit to be a specific value (by discarding all results where that wasn't the case).

          1. Anonymous Coward
            Anonymous Coward

            Re: Confusion due to lax use of terminology in RFC?

            "That's the approach that *used* to be in place. But, it has a number of issues. You cannot guarantee that a situation will never arise where you mistakenly use the same serial twice - for example, if your process crashes mid-issuance, and then another cert is issued, you may have a part-issued cert".

            Yes you can, but I suppose most coders these days don't actually know how to write distributed systems, even though they all want to use containers and write micro services.

    4. mj.jam

      Re: Confusion due to lax use of terminology in RFC?

      This comes down to the fact that ASN.1 is used in the certificate. The RFC uses an INTEGER type, which is signed. This means anything reading the certificate must treat this as a signed number. The size of the integer can be varied, and the RFC says up to 20 octets. Obviously you could take a 64 bit unsigned, and if the top bit was set, encode this as 9 octets, and if it was unset use fewer (right down to 1 for some very small serial numbers), but clearly they decided to use a signed value.

      The IETF certificate on https://tools.ietf.org/html/rfc5280 uses 9 octets for exactly this reason, top octet is 0.
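
      To illustrate why the extra zero octet matters, here is a small Python sketch of how a reader interprets the content octets of an ASN.1 INTEGER as a signed, big-endian two's-complement number. `decode_integer_content` is a hypothetical helper name, and the sketch only looks at the content octets, not the tag or length.

```python
def decode_integer_content(content: bytes) -> int:
    """Interpret INTEGER content octets as signed big-endian two's complement."""
    if not content:
        raise ValueError("an INTEGER needs at least one content octet")
    return int.from_bytes(content, "big", signed=True)


if __name__ == "__main__":
    eight_ff = bytes.fromhex("ff" * 8)
    # Without a leading zero octet, the set top bit is read as a sign bit.
    print(decode_integer_content(eight_ff))
    # With the leading zero octet, the same 64 bits decode as positive.
    print(decode_integer_content(b"\x00" + eight_ff))
```

      So an unsigned 64-bit value with the top bit set only stays positive if it is written out as 9 octets.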

  2. Yet Another Anonymous coward Silver badge

    Why is this a big deal?

    It seems odd that 63bit is a terrible lapse in security allowing certificate serial # to be guessed by every passing script-kiddie for the lulz

    While 64bit was a super secure defence against even the most black clad of North Korean cyber-ninjas for the foreseeable future.

    Seems an odd coincidence that MAX_INT on most systems is exactly the correct key length for optimal security

    1. diodesign (Written by Reg staff) Silver badge

      Re: Why is this a big deal?

      As we said a few times in the article, it's not a big deal for normal folk. There are still 63 bits of certificate serial number space.

      It's just a bit - get it? - embarrassing for the usually by-the-book world of cryptography. And an interesting or amusing bug that we thought Reg readers would appreciate.

      C.

      1. JeffyPoooh
        Pint

        Re: Why is this a big deal?

        "...supposedly 64-bit serial numbers in its certificates were in fact one bit short, the top bit being always zero to indicate a positive integer."

        They've corrected it, and are now feeling smug. The top bit will now NEVER be zero.

        ;-)

        1. Loyal Commenter Silver badge

          Re: Why is this a big deal?

          >They've corrected it, and are now feeling smug. The top bit will now NEVER be zero.

          Don't be silly, the value for the bit has to be evenly distributed between ones and zeroes. They've actually decided to look at the phase of the moon and if it's waxing, it's a 1, and if it's waning, it's a 0. When it's a full moon, all certificates turn into wolves.

      2. Roland6 Silver badge

        Re: Why is this a big deal?

        >And an interesting or amusing bug that we thought Reg readers would appreciate.

        In this context I agree. It also serves as a reminder to those writing specifications: be precise, if necessary be verbose, and use widely used conventions: which programming language uses "Positive Integer" as a data type declaration?

        Confusion could have been avoided if the authors of the RFC section 4.1.2.2 had simply specified:

        1. The serialNumber field as being an unsigned integer of variable length up to 20 octets (ie. 20x8 bits).

        2. Removed all reference to negative numbers and non-compliant CAs, because a CA that treats all serialNumbers as Unsigned Int will automatically gracefully handle certificates with serial numbers that are negative or zero, i.e. certificates with signed int serialNumbers.

        1. Roland6 Silver badge

          Re: Why is this a big deal?

          Digging a little deeper, I think a big cause of the problem is that ASN.1 (used in RFC5280 to define the structure of a certificate) seems to only have the data type "Integer", with the exact meaning of that term being "depending on constraints specified " in a specific specification, i.e. in the text of the specification, in this case RFC5280.

          Funny I missed that and obviously have forgotten ASN.1 (not had to use it for nearly 30 years). Given the origins of ASN.1, somewhere in the mists of defining OSI PDUs, it is a little surprising that it is so vague and allows for ambiguous interpretation. Additionally, I have discovered that it is known that this ambiguity has caused problems with ASN.1 decoders over the years, yet no one has seen fit to revise the ASN.1 specification...

          1. The First Dave

            Re: Why is this a big deal?

            So, I'm still confused: Google / Apple et al are apparently all about to revoke existing certificates, but other people say it has no direct effect on security, so why would they bother to revoke?

            1. Roland6 Silver badge

              Re: Why is this a big deal?

              >So, I'm still confused: Google / Apple et al are apparently all about to revoke existing certificates

              It's in the rules of being a fully fledged cert-issuing CA...

              [see Section 4.9.1.1]

              Obviously, quite a few companies seem to have either used the same software, complete with the default settings, or simply assumed that a 64-bit certificate serial number was sufficient to hold the 64-bit output of a CSPRNG...

              Mind you, it would not surprise me if GoDaddy finds that using the max possible - certificate serial number of 20 octets (20x8 bits including sign bit), uncovers a large number of non-compliant clients...


  3. JeffyPoooh
    Pint

    Jeffy's Theorem of Binary Digit Distribution

    Jeffy's Theorem of Binary Digit Distribution: Approximately 50% of all 64-bit numbers have a leading '1'.

    A closely related conjecture, as yet unproven. This may have some bearing on the present issue.

    Jeffy's Conjecture of Binary Digit Distribution: Approximately 50% of all 64-bit numbers have a leading '0'.

    1. -tim
      Boffin

      Re: Jeffy's Theorem of Binary Digit Distribution

      About half of data streams should have a leading 0 but a vast majority of numbers in a computer have a leading 0. When looking at raw data in a computer when doing reverse engineering, pointers will often have their top bits set but not look like negative numbers. Most other numbers have at least their top 4 bytes all zeros. Modern CPUs move around so many 64 bit numbers that are mostly zero bits that the power use is optimized for it.

  4. Pirate Dave Silver badge
    Pirate

    well

    look on the bright side, they now know for sure that they aren't even half-way out of available serial numbers yet.

  5. Anonymous Coward Silver badge
    Holmes

    Really

    Were they really relying on a 64(ish)-bit RNG to never produce a collision, or were they also checking that the output was unique too?

    Personally I would never sign-off on the former because Murphy's law will kick in at some point. If something has a 1 in a billion chance of going wrong, it will do so 9 times out of 10.

    1. Claptrap314 Silver badge

      Re: Really

      https://en.wikipedia.org/wiki/Birthday_problem

      So, it's about one in 4 billion. So we're fine.

      1. Michael Wojcik Silver badge

        Re: Really

        >So, it's about one in 4 billion.

        Right. For an even distribution across a range of N possible values, the collision probability is 1/sqrt(N). The square root of 2^N is 2^(N/2), so for a 64-bit value, it's 2^32.

        If any CA ever got close to the point where a collision was a worry, they'd just start using longer serial numbers. Every two bits added to the serial number length halves the probability of a collision (or, equivalently, doubles the number of certificates you can issue at the same risk of a collision).

        Also, for the GP post: Where does it say that anyone used a CSPRNG with ~64 bits of entropy to generate their serial numbers?

    2. Locky

      Re: Really

      >If something has a 1 in a billion chance of going wrong, it will do so 9 times out of 10.

      So we need to even the odds up a bit. How about you generate it blindfolded, standing on one leg and with one arm tied behind your back.

      Then the odds are about 1 in a billion....

    3. Loyal Commenter Silver badge
      Boffin

      Re: Really

      Any statisticians here want to work out how many certificates they'd have to issue with (properly) random 64 bit ids before they have a 50% chance of getting a collision with previously issued cert? How about 63 bits?
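
      One way to answer this is the standard birthday-problem approximation, n ≈ sqrt(2 · N · ln(1 / (1 − p))) for N equally likely values and target collision probability p. The Python sketch below assumes perfectly uniform, independent serial numbers; `draws_for_collision` is a hypothetical helper name.

```python
import math


def draws_for_collision(bits: int, probability: float = 0.5) -> float:
    """Approximate draws from 2**bits values before a collision is this likely."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))


if __name__ == "__main__":
    for bits in (63, 64):
        print(f"{bits} bits: about {draws_for_collision(bits):.3g} certificates")
```

      Under those assumptions the answer is roughly 5.1 billion certificates for 64 bits and roughly 3.6 billion for 63 bits: losing one bit shrinks the figure by a factor of sqrt(2), not by half.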

  6. Claptrap314 Silver badge
    Paris Hilton

    Color me unimpressed

    Were the folks that coded this library first-timers? Had the folks that decided to use this code never heard of code reviews? Or did they fall asleep for the five minutes that their professors covered the esoteric subject of unsigned vs signed integers?

    I implemented an IEEE-754 emulator at AMD. (As a mathematician turned programmer.) I learned the hard way how worthless these standards tend to be as implementation guides.

    If you attempt to implement a standard by simply studying what it says & writing your code based on that, expect to embarrassingly fail.

    It sounds like the folks who wrote the standard should be drummed out of the industry, along with whoever signed off on using this library. The implementers either reeducated or retired, depending on seniority.

    1. Claptrap314 Silver badge

      Re: Color me unimpressed

      Of course, the down voters don't explain themselves.

  7. Bronek Kozicki
    Trollface

    Whoahaha

    Another proof that Java programmers do not understand unsigned numbers.

  8. Jamie Jones Silver badge
    Happy

    If I owned a CA hit by this issue...

    I'd simply say there was never an issue, and that all my randomly produced serial numbers coincidentally happened to have the top bit unset!

    1. Richard 12 Silver badge

      Re: If I owned a CA hit by this issue...

      As a company, you'd be dead within a week.

      Crypto relies on a good understanding and implementation of randomness.

      Saying that one of the "random" bits was the same in every cert you've issued would be saying that you don't understand random numbers and can't properly implement it, so nobody should trust you.

      Much better to admit you've only been using 63 bits of random and not 64, and fix it - after all, the fix is pretty easy if the rest of the implementation is correct.

      These serial numbers are public, so it's not like you can actually pretend it didn't happen.

      1. Roland6 Silver badge

        Re: If I owned a CA hit by this issue...

        >Much better to admit you've only been using 63 bits of random and not 64

        If you are issuing certificates, as DarkMatter were, that for some reason always contained a 64-bit serial number (because that is what you've configured your CA to do) then, because of the ASN.1 DER encoding rules used by RFC5280, you only have room in your certificate serial number field for 63 bits of randomness. Obviously, there is a problem, as the current CA/Browser Forum Baseline Requirements require CAs to use at least 64 bits of output from a CSPRNG (crypto random number generator) in the certificate serial number.

        Having done a bit of reading around this, it is clear there has been some misunderstanding and corners cut a little too tightly. Basically, for your CA to be able to issue certificate serial numbers containing fully 64 bits of randomness from a CSPRNG, it has to be able to issue certificates with serial numbers with lengths up to 9 octets (72 bits).

        >These serial numbers are public, so it's not like you can actually pretend it didn't happen.

        Trouble is the evidence seems to be statistical on a very small sample. Whilst from the math it does seem unlikely, it isn't improbable...

        From the DarkMatter evidence, I would be more concerned that all 235 certificates logged to Certificate Transparency logs have serial numbers in a comparatively narrow range. Namely, in every one the most significant set bit fell somewhere between bit 59 and bit 63, with the majority having bit 63 set.
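
        The "statistical evidence" point can be put into numbers with a short Python sketch. It assumes each serial is 64 truly random bits with an unbiased top bit, so the chance of the top bit being clear in every one of n certificates is (1/2)^n; `chance_all_top_bits_clear` is a hypothetical helper name.

```python
from fractions import Fraction


def chance_all_top_bits_clear(sample_size: int) -> Fraction:
    """Probability that an unbiased top bit is 0 in every one of the samples."""
    if sample_size < 0:
        raise ValueError("sample size cannot be negative")
    return Fraction(1, 2) ** sample_size


if __name__ == "__main__":
    p = chance_all_top_bits_clear(235)
    print(f"about {float(p):.2e}")
```

        For the 235 logged certificates that works out to 1 in 2^235, so under this assumption even a sample of that size is hard to explain as coincidence.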

      2. Jamie Jones Silver badge

        Re: If I owned a CA hit by this issue...

        I guess you missed the smiley face....
