Open-source 64-ish-bit serial number gen snafu sparks TLS security cert revoke runaround • The Register Forums

Wednesday 13th March 2019 18:53 GMT Roland6

Confusion due to lax use of terminology in RFC?

Section 4.1.2.2 of RFC5280 talks about certificate "Serial Numbers" not 'keys'. The article also confuses the two.

Interestingly, reading both RFC and X.520, no mention is made as to how the Serial Number is to be implemented. If I were writing in C, I would implement it as an Unsigned Integer - which is never negative. Which is consistent with the statement: "CAs MUST force the serialNumber to be a non-negative integer." Thus all 64 bits are available... So, I disagree with Adam Caudill's summarisation.

Hence the problem is that the open-source key-generation package, EJBCA, is non-compliant with RFC5280.

But fundamentally, the problem is the imprecise language used in RFC5280...
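
For anyone who wants to check the arithmetic, here is a rough sketch of why the C type is beside the point. It assumes the serial ends up as a DER-encoded ASN.1 INTEGER (two's complement, shortest form), which is how RFC5280 certificates carry it. The helper name `der_integer_content` is made up for illustration and is not taken from EJBCA or any library; it returns only the content octets, not the tag and length.

```python
def der_integer_content(value: int) -> bytes:
    """Content octets of a DER INTEGER: two's complement, shortest form."""
    if value < 0:
        raise ValueError("serial numbers must not be negative")
    # One bit is always reserved for the sign, so a value whose highest
    # set bit lands on an octet boundary needs a leading 0x00 octet.
    length = value.bit_length() // 8 + 1
    return value.to_bytes(length, "big", signed=True)


top_bit_set = 0x8000000000000000    # 2**63, a 64-bit value with the top bit set
top_bit_clear = 0x7FFFFFFFFFFFFFFF  # 2**63 - 1, the largest 63-bit value

print(len(der_integer_content(top_bit_set)))    # 9
print(len(der_integer_content(top_bit_clear)))  # 8
print(der_integer_content(256).hex())           # 0100
```

So a full unsigned 64-bit random value only fits if the CA is prepared to emit serials of up to 9 octets; capping the field at 8 octets leaves 63 usable bits.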

22 0 Reply
1. Wednesday 13th March 2019 18:57 GMT Jim Mitchell
  
  Re: Confusion due to lax use of terminology in RFC?
  
  I agree, this is confusing, and the article doesn't help. 1 is a perfectly valid randomly generated positive AND non-negative 64-bit integer, for example. Is that a problem, and why?
  
  5 5 Reply
  1. Wednesday 13th March 2019 19:16 GMT diodesign
    
    Re: Re: Confusion due to lax use of terminology in RFC?
    
    '1' is a perfectly valid cert serial number, yes. There is no problem with it. The problem is that no serial number would be generated with the top bit set, halving the number of available serial numbers and increasing the chance of collision.
    
    C.
    
    11 1 Reply
    1. Friday 15th March 2019 10:45 GMT G2
      
      Re: Confusion due to lax use of terminology in RFC?
      
      obligatory XKCD reference here: https://xkcd.com/221/
      
      =)
      
      3 0 Reply
    2. Friday 15th March 2019 13:20 GMT Loyal Commenter
      
      Re: Confusion due to lax use of terminology in RFC?
      
      To put this into perspective:
      
      Imagine you have a chess board (which conveniently happens to have 64 squares on it). You also have a bag of 64 pawns. Your job is to put a random number of those pawns onto any arrangement of squares on the board and not get the same arrangement as someone else.
      
      This flaw means that you can never put a pawn on the bottom right square. It has no impact on those other 63 squares. Embarrassing, yes, but it doesn't exactly 'dramatically reduce' the number of available combinations you have (it halves an already pretty big number; you still have almost 10^19 combinations to play with, which to put it in perspective is roughly the number of human cells in the population of Swansea).
      
      3 1 Reply
      1. Friday 15th March 2019 15:48 GMT Anonymous Coward
        
        Re: Huh?
        
        "which to put it in perspective is roughly the number of human cells in the population of swansea".
        
        I want to know *how* you know that specific fact about Swansea... no, come to think of it, not how, but *why*.
        
        Also, AFAIK those are nice people, so what are you planning?!
        
        1 0 Reply
        
        Friday 15th March 2019 16:50 GMT Loyal Commenter
        
        Re: Huh?
        
        I randomly picked a town where the population is about 250k. The number of cells in the human body is 30-40 trillion, and the population of Swansea is apparently 241,300. 2^63 is approx 9.2 x 10^18 so this works out pretty spot on if you take the number of cells in the human body as 38.2 trillion...
        
        So, if you took every single cell of every single human being in Swansea and allocated each a random 63 bit ID, you'd expect a 50/50 chance of getting a previously assigned number when you got to around the 120,000th person.
        
        2 0 Reply
2. Wednesday 13th March 2019 19:15 GMT diodesign
  
  Re: Confusion due to lax use of terminology in RFC?
  
  To be clear, the problem is all about certificate serial numbers, and nothing to do with keys. I've cleared out any mention of keys to avoid any confusion.
  
  The issue is that serial number length must be at least 64-bits and a positive integer. To ensure this, the generation software was keeping the top bit clear, effectively reducing the default 64-bit integer to 63 bits.
  
  C.
  
  11 1 Reply
  1. Wednesday 13th March 2019 23:07 GMT richm
    
    Re: Confusion due to lax use of terminology in RFC?
    
    Unfortunately your comment is still not quite right - the serial number must contain 64 bits of entropy, and be a positive integer. That actually allows a given serial number to be shorter since in some cases the random digits at the start will be 0. This matters because DER (the encoding used in certificates) requires that the shortest encoding be used, so if there are lots of leading zeros they're omitted. The fact that the serial numbers were all 8 bytes long (64 bits) rather than some being 9 bytes is how it's easy to see the issue is present.
    
    As you note, it's in some ways not a big deal at first glance - but then again that's exactly what people said about various timing attacks etc. that subsequently caused all sorts of problems. Also getting the details of the rules right is exactly the CA's job...
    
    8 0 Reply
    1. Wednesday 13th March 2019 23:07 GMT diodesign
      
      "Unfortunately your comment is still not quite right"
      
      Well, I'm trying to keep it simple here in the comments. Thanks for the extra info.
      
      C.
      
      3 1 Reply
  2. Thursday 14th March 2019 11:52 GMT Roland6
    
    Re: Confusion due to lax use of terminology in RFC?
    
    >The issue is that serial number length must be at least 64-bits and a positive integer.
    
    RFC5280 does not mandate a minimum length for the serial number, just a maximum length.
    
    Appendix C gives examples using serial numbers 17, 18 and 256, the length of the serial number field in the certificates being 8, 8 and 16 bits respectively.
    
    In fact, given what we know now, example C.3 seems to have been given to show that the Serial Number field is signed, as 16 bits are used to encode the 8 bit value.
    
    0 0 Reply
3. Wednesday 13th March 2019 19:23 GMT Si 1
  
  Re: Confusion due to lax use of terminology in RFC?
  
  Yeah, I don’t understand this, why did they use a signed 64-bit integer when negative numbers are never used?
  
  11 0 Reply
  1. Wednesday 13th March 2019 19:31 GMT Tom 7
    
    Re: Confusion due to lax use of terminology in RFC?
    
    If this is a certificate serial number then it's going to take a long time to actually run out of numbers. I will have to have a sleepless night to consider reading the RFC, but if it's like other serial numbers, would it not be partitioned, so that the only chance of a collision would be if some organisation used up all of its number range and somehow didn't notice?
    
    4 0 Reply
    1. Thursday 14th March 2019 08:29 GMT Ben Tasker
      
      Re: Confusion due to lax use of terminology in RFC?
      
      No, you're thinking of something more akin to an incrementing serial there.
      
      So you might have
      
      ourcert-1
      
      ourcert-2
      
      ourcert-3
      
      etc.
      
      That's the approach that *used* to be in place. But, it has a number of issues. You cannot guarantee that a situation will never arise where you mistakenly use the same serial twice - for example, if your process crashes mid-issuance, and then another cert is issued, you may have a part-issued cert, and a fully issued cert (for someone else) sharing serial number "3" (or whatever). There are various possibilities in that area.
      
      It also potentially poses an issue if you're distributing your issuance system globally, though that's more easily addressable by inserting a region into the serial.
      
      What the RFC requires is that CAs include a minimum of 64 bits entropy in the serial (some CAs weren't affected by this issue because they were already using more). The serial can be more than just that entropy though, so you might choose to keep your increment and append the entropy to the end
      
      ourcert-1-xx:xx:xx.....
      
      ourcert-2-xx:xx:xx.....
      
      Now, if you have the same issue with a crash (or whatever) you may get two certs allocated "3" but the likelihood of their serials being identical is incredibly small.
      
      > if its like other serial number would it not be partitioned
      
      There's no partitioning, no; there aren't ranges allocated out like with (say) MAC addresses. It's literally the output of an RNG. The issue here is that they reduced the namespace by forcing one bit to be a specific value (by discarding all results where that wasn't the case).
      
      5 0 Reply
      1. Friday 15th March 2019 10:01 GMT Anonymous Coward
        
        Re: Confusion due to lax use of terminology in RFC?
        
        "That's the approach that *used* to be in place. But, it has a number of issues. You cannot guarantee that a situation will never arise where you mistakenly use the same serial twice - for example, if your process crashes mid-issuance, and then another cert is issued, you may have a part-issued cert".
        
        Yes you can, but I suppose most coders these days don't actually know how to write distributed systems, even though they all want to use containers and write micro services.
        
        2 0 Reply
4. Thursday 14th March 2019 10:54 GMT mj.jam
  
  Re: Confusion due to lax use of terminology in RFC?
  
  This comes down to the fact that ASN.1 is used in the certificate. The RFC uses an INTEGER type, which is signed. This means anything reading the certificate must treat this as a signed number. The size of the integer can be varied, and the RFC says up to 20 octets. Obviously you could take a 64 bit unsigned, and if the top bit was set, encode this as 9 octets, and if it was unset use fewer (right down to 1 for some very small serial numbers), but clearly they decided to use a signed value.
  
  The IETF certificate on https://tools.ietf.org/html/rfc5280 uses 9 octets for exactly this reason, top octet is 0.
  
  2 0 Reply
Wednesday 13th March 2019 19:24 GMT Yet Another Anonymous coward

Why is this a big deal?

It seems odd that 63bit is a terrible lapse in security allowing certificate serial # to be guessed by every passing script-kiddie for the lulz

While 64bit was a super secure defence against even the most black clad of North Korean cyber-ninjas for the foreseeable future.

Seems an odd coincidence that MAX_INT on most systems is exactly the correct key length for optimal security

4 0 Reply
1. Wednesday 13th March 2019 20:01 GMT diodesign
  
  Re: Why is this a big deal?
  
  As we said a few times in the article, it's not a big deal for normal folk. There is still 63 bits of certificate serial number space.
  
  It's just a bit - get it? - embarrassing for the usually by-the-book world of cryptography. And an interesting or amusing bug that we thought Reg readers would appreciate.
  
  C.
  
  15 1 Reply
  1. Wednesday 13th March 2019 20:14 GMT JeffyPoooh
    
    Re: Why is this a big deal?
    
    "...supposedly 64-bit serial numbers in its certificates were in fact one bit short, the top bit being always zero to indicate a positive integer."
    
    They've corrected it, and are now feeling smug. The top bit will now NEVER be zero.
    
    ;-)
    
    20 0 Reply
    1. Friday 15th March 2019 13:24 GMT Loyal Commenter
      
      Re: Why is this a big deal?
      
      They've corrected it, and are now feeling smug. The top bit will now NEVER be zero.
      
      Don't be silly, the value for the bit has to be evenly distributed between ones and zeroes. They've actually decided to look at the phase of the moon and if it's waxing, it's a 1, and if it's waning, it's a 0. When it's a full moon, all certificates turn into wolves.
      
      3 0 Reply
  2. Wednesday 13th March 2019 20:21 GMT Roland6
    
    Re: Why is this a big deal?
    
    >And an interesting or amusing bug that we thought Reg readers would appreciate.
    
    In this context I agree. It also serves as a reminder to those writing specifications: be precise, if necessary be verbose, and use widely used conventions. Which programming language uses "Positive Integer" as a data type declaration?
    
    Confusion could have been avoided if the authors of the RFC section 4.1.2.2 had simply specified:
    
    1. The serialNumber field as being an unsigned integer of variable length up to 20 octets (ie. 20x8 bits).
    
    2. Removed all reference to negative numbers and non-compliant CAs, because a CA that treats all serialNumbers as Unsigned Int will automatically gracefully handle certificates with serial numbers that are negative or zero, i.e. certificates with signed int serialNumbers.
    
    8 0 Reply
    1. Thursday 14th March 2019 11:49 GMT Roland6
      
      Re: Why is this a big deal?
      
      Digging a little deeper, I think a big cause of the problem is that ASN.1 (used in RFC5280 to define the structure of a certificate) seems to only have the data type "Integer", with the exact meaning of that term being "depending on constraints specified" in a specific specification, i.e. in the text of the specification, in this case RFC5280.
      
      Funny I missed that and obviously have forgotten ASN.1 (not had to use it for nearly 30 years), given the origins of ASN.1 - somewhere in the mists of defining OSI PDUs, it is a little surprising that it is so vague and allows for ambiguous interpretation. Additionally, I have discovered that it is known that this ambiguity has caused problems with ASN.1 decodes over the years, yet no one has seen fit to revise the ASN.1 specification...
      
      1 0 Reply
      1. Friday 15th March 2019 17:08 GMT The First Dave
        
        Re: Why is this a big deal?
        
        So, I'm still confused: Google / Apple et al are apparently all about to revoke existing certificates, but other people say it has no direct effect on security, so why would they bother to revoke?
        
        0 0 Reply
        
        Saturday 16th March 2019 17:38 GMT Roland6
        
        Re: Why is this a big deal?
        
        >So, I'm still confused: Google / Apple et al are apparently all about to revoke existing certificates
        
        It's in the rules of being a fully fledged cert-issuing CA...
        
        [see Section 4.9.1.1
        
        Obviously, quite a few companies seem to have either used the same software, complete with the default settings, or simply assumed that a 64-bit certificate serial number was sufficient to hold the 64-bit output of a CSPRNG...
        
        Mind you, it would not surprise me if GoDaddy finds that using the max possible - certificate serial number of 20 octets (20x8 bits including sign bit), uncovers a large number of non-compliant clients...
        
        .
        
        0 0 Reply
Wednesday 13th March 2019 20:10 GMT JeffyPoooh

Jeffy's Theorem of Binary Digit Distribution

Jeffy's Theorem of Binary Digit Distribution: Approximately 50% of all 64-bit numbers have a leading '1'.

A closely related conjecture, as yet unproven. This may have some bearing on the present issue.

Jeffy's Conjecture of Binary Digit Distribution: Approximately 50% of all 64-bit numbers have a leading '0'.

9 0 Reply
1. Wednesday 13th March 2019 22:51 GMT -tim
  
  Re: Jeffy's Theorem of Binary Digit Distribution
  
  About half of data streams should have a leading 0 but a vast majority of numbers in a computer have a leading 0. When looking at raw data in a computer when doing reverse engineering, pointers will often have their top bits set but not look like negative numbers. Most other numbers have at least their top 4 bytes all zeros. Modern CPUs move around so many 64 bit numbers that are mostly zero bits that the power use is optimized for it.
  
  3 1 Reply
Wednesday 13th March 2019 20:34 GMT Pirate Dave

well

look on the bright side, they now know for sure that they aren't even half-way out of available serial numbers yet.

9 0 Reply
Thursday 14th March 2019 09:00 GMT Anonymous Coward

Really

Were they really relying on a 64(ish)-bit RNG to never produce a collision, or were they also checking that the output was unique?

Personally I would never sign-off on the former because Murphy's law will kick in at some point. If something has a 1 in a billion chance of going wrong, it will do so 9 times out of 10.
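
To put rough numbers on that worry, here is a back-of-envelope sketch using the standard birthday-problem approximation p ≈ 1 - exp(-n(n-1)/(2N)) for n serials drawn uniformly from N possible values. The function names are purely illustrative, and it assumes the serial is nothing but the random bits, with no counter or uniqueness check in front of it.

```python
import math


def collision_probability(issued: int, bits: int) -> float:
    """Approximate chance of at least one repeat among `issued` random serials."""
    space = 2.0 ** bits
    # -expm1(-x) is 1 - exp(-x), computed without losing precision for tiny x
    return -math.expm1(-issued * (issued - 1) / (2.0 * space))


def issued_for_even_odds(bits: int) -> float:
    """Approximate number of serials at which a repeat becomes a 50/50 bet."""
    return math.sqrt(2.0 * math.log(2.0) * 2.0 ** bits)


for bits in (63, 64):
    even_odds = issued_for_even_odds(bits)
    after_a_million = collision_probability(10**6, bits)
    print(f"{bits} bits: 50% at ~{even_odds:.2e} certs, "
          f"p(collision) after 1e6 certs ~{after_a_million:.1e}")
```

Under those assumptions the 50/50 point is somewhere around 3.6 billion certificates for 63 random bits and 5.1 billion for 64, so losing the bit moves it by a factor of sqrt(2) rather than halving it.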

5 0 Reply
1. Thursday 14th March 2019 15:15 GMT Claptrap314
  
  Re: Really
  
  https://en.wikipedia.org/wiki/Birthday_problem
  
  So, it's about one in 4 billion. So we're fine.
  
  1 0 Reply
  1. Thursday 14th March 2019 21:21 GMT Michael Wojcik
    
    Re: Really
    
    So, it's about one in 4 billion.
    
    Right. For an even distribution across a range of N possible values, a collision becomes likely once you have drawn on the order of sqrt(N) of them. The square root of 2^n is 2^(n/2), so for a 64-bit value, that's about 2^32 (roughly 4 billion).
    
    If any CA ever got close to the point where a collision was a worry, they'd just start using longer serial numbers. Every two bits added to the serial number length halves the probability of a collision (or, equivalently, doubles the number of certificates you can issue at the same risk of a collision).
    
    Also, for the GP post: Where does it say that anyone used a CSPRNG with ~64 bits of entropy to generate their serial numbers?
    
    2 1 Reply
2. Friday 15th March 2019 09:36 GMT Locky
  
  Re: Really
  
  If something has a 1 in a billion chance of going wrong, it will do so 9 times out of 10.
  
  So we need to even the odds up a bit. How about you generate it blindfolded, standing on one leg and with one arm tied behind your back.
  
  Then the odds are about 1 in a billion....
  
  1 1 Reply
3. Friday 15th March 2019 13:27 GMT Loyal Commenter
  
  Re: Really
  
  Any statisticians here want to work out how many certificates they'd have to issue with (properly) random 64 bit ids before they have a 50% chance of getting a collision with previously issued cert? How about 63 bits?
  
  2 0 Reply
Thursday 14th March 2019 15:25 GMT Claptrap314

Color me unimpressed

Were the folks that coded this library first-timers? Had the folks that decided to use this code never heard of code reviews? Or did they fall asleep for the five minutes that their professors covered the esoteric subject of unsigned vs signed integers?

I implemented an IEEE-754 emulator at AMD. (As a mathematician turned programmer.) I learned the hard way how worthless these standards tend to be as implementation guides.

If you attempt to implement a standard by simply studying what it says & writing your code based on that, expect to embarrassingly fail.

It sounds like the folks who wrote the standard should be drummed out of the industry, along with whoever signed off on using this library. The implementers should be either reeducated or retired, depending on seniority.

1 2 Reply
1. Friday 29th March 2019 16:45 GMT Claptrap314
  
  Re: Color me unimpressed
  
  Of course, the down voters don't explain themselves.
  
  0 0 Reply
Thursday 14th March 2019 21:56 GMT Bronek Kozicki

Whoahaha

Another proof that Java programmers do not understand unsigned numbers.

2 1 Reply
Thursday 14th March 2019 23:21 GMT Jamie Jones

If I owned a CA hit by this issue...

I'd simply say there was never an issue, and that all my randomly produced serial numbers coincidentally happened to have the top bit unset!

2 0 Reply
1. Friday 15th March 2019 07:14 GMT Richard 12
  
  Re: If I owned a CA hit by this issue...
  
  As a company, you'd be dead within a week.
  
  Crypto relies on a good understanding and implementation of randomness.
  
  Saying that one of the "random" bits was the same in every cert you've issued would be saying that you don't understand random numbers and can't properly implement it, so nobody should trust you.
  
  Much better to admit you've only been using 63 bits of random and not 64, and fix it - after all, the fix is pretty easy if the rest of the implementation is correct.
  
  These serial numbers are public, so it's not like you can actually pretend it didn't happen.
  
  3 0 Reply
  1. Friday 15th March 2019 12:49 GMT Roland6
    
    Re: If I owned a CA hit by this issue...
    
    >Much better to admit you've only been using 63 bits of random and not 64
    
    If you are issuing certificates, as DarkMatter were, that for some reason always contained a 64 bit serial number (because that is what you've configured your CA to do) then because of the ASN.1 DER encoding rules used by RFC5280 you only have room in your certificate serial number field for 63 bits of randomness. Obviously, there is a problem as the current CA/Browser Forum Baseline Requirements require CA's to use at least 64 bits of output from a CSPRNG (crypto random number generator) in the certificate serial number.
    
    Having done a bit of reading around this, it is clear there has been some misunderstanding and corners cut a little too tightly. Basically, for your CA to be able to issue certificate serial numbers containing fully 64 bits of randomness from a CSPRNG, it has to be able to issue certificates with serial numbers with lengths up to 9 octets (72 bits).
    
    >These serial numbers are public, so it's not like you can actually pretend it didn't happen.
    
    Trouble is the evidence seems to be statistical on a very small sample. Whilst from the math it does seem unlikely, it isn't improbable...
    
    From the DarkMatter evidence, I would be more concerned that all 235 certificates logged to Certificate Transparency logs have serial numbers that are in a comparatively narrow range of numbers. Namely, in every one of them the most significant set bit fell somewhere between bit 59 and bit 63, with the majority having bit 63 set.
    
    2 1 Reply
  2. Saturday 16th March 2019 11:41 GMT Jamie Jones
    
    Re: If I owned a CA hit by this issue...
    
    I guess you missed the smiley face....
    
    0 0 Reply