CAs Reissue Over One Million Weak Certificates

Turns out that the software a bunch of CAs used to generate public-key certificates was flawed: they created random serial numbers with only 63 bits instead of the required 64. That may not seem like a big deal to the layman, but that one bit change means that the serial numbers only have half the required entropy. This really isn't a security problem; the serial numbers are to protect against attacks that involve weak hash functions, and we don't allow those weak hash functions anymore. Still, it's a good thing that the CAs are reissuing the certificates. The point of a standard is that it's to be followed.

Posted on March 18, 2019 at 6:23 AM • 26 Comments

Comments

Andre de Amorim • March 18, 2019 7:31 AM

Software development cycle, in my view:

1. Specifications (what the software is meant to do)
2. Pattern recognition (Maths)
3. Algorithms (set of rules)
4. Code (descriptive language)

Then check whether the code meets the specifications.

As regards the CA and '1M', BTC started with '20M'

;-)

Tatütata • March 18, 2019 9:14 AM

Following the links, I have the impression that the bug really lies with the underlying specification:

The serial number MUST be a positive integer assigned by the CA to each certificate. It MUST be unique for each certificate issued by a given CA (i.e., the issuer name and serial number identify a unique certificate). CAs MUST force the serialNumber to be a non-negative integer.

Why didn't the specification merely state that the SN must be an unsigned 64 bit number? Or why should 2's-complement representation have a meaning in this context?

This problem wasn't exactly discovered yesterday. Looking up the above string, I come across a short thread from 2014 titled "Why does the RFC 5280 dictate the CAs not to issue certificates with negative serial numbers?!"

In the 90's Cleve Moler wrote a TMW technical note describing the uniform RNG used in Matlab 5. I remember having a problem in accepting the periodicity claimed for this RNG, as IIRC, it appeared to be related to the sum of mantissa width in bits of the floating point state vector. However, since the vector is in IEEE-754 double precision floating point representation, with an implied-1 leading bit, the situation is more complicated. But I assume that this RNG wasn't designed for crypto. For simulation, non-uniform distributions are of a greater interest, e.g., the normal distribution which can be generated with Marsaglia's Ziggurat algorithm.

Faustus • March 18, 2019 9:18 AM

"63 bits instead of the required 64. That may not seem like a big deal to the layman, but that one bit change means that the serial numbers ONLY HAVE HALF the required entropy."

No, the serial numbers will have 63 bits of entropy rather than 64. That is not half the entropy. Entropy is a logarithmic measure of the number of possible states, not the actual number of possible states.

It is true however, that reducing the bits by one reduces the number of potential values of the serial number by half, but the entropy is the log base 2 of the number of potential values, and so it is decreased by 1.
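The distinction can be checked with a few lines of arithmetic. This is only an illustrative sketch, assuming the serial numbers are drawn uniformly at random; the helper name `entropy_bits` is made up for the example.

```python
import math


def entropy_bits(num_values: int) -> float:
    """Entropy, in bits, of a uniform choice among num_values outcomes."""
    return math.log2(num_values)


values_64 = 2 ** 64  # possible serials with 64 random bits
values_63 = 2 ** 63  # possible serials with 63 random bits

# Entropy drops by one bit, not by half.
print(entropy_bits(values_64), entropy_bits(values_63))

# The number of possible values is what halves.
print(values_63 / values_64)
```

So the count of possible serial numbers is halved while the entropy falls from 64 bits to 63.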

scot • March 18, 2019 9:42 AM

This is a pedantic grammar issue, but isn't "random serial number" self-contradictory? A "serial number" should, by its definition, be sequential, and by conventional usage, also unique. The term "random" prohibits predictability, and thus both the sequential and unique aspects of serial numbers. Poorly written documentation is bad enough, but poorly written specifications are the subject of nightmares.

MK • March 18, 2019 11:25 AM

5280 does say: "Note: Non-conforming CAs may issue certificates with serial numbers that are negative or zero. Certificate users SHOULD be prepared to gracefully handle such certificates."

MK • March 18, 2019 11:28 AM

Additionally: "Given the uniqueness requirements above, serial numbers can be expected to contain long integers. Certificate users MUST be able to handle serialNumber values up to 20 octets. Conforming CAs MUST NOT use serialNumber values longer than 20 octets." So nothing magic about 64 bits.

Arclight • March 18, 2019 4:16 PM

I'm a bit less forgiving of CAs. This is one of those "You had ONE job" areas where literally they are taking in a bunch of money to issue sets of random numbers and signed text files. I feel like we should expect a lot more due diligence from them than other concerns whose main business is something else, like writing software.

MK • March 18, 2019 4:28 PM

The number of non-conforming CAs is mind boggling. I think the problem originates with the name Certificate *Authority*. They think they are the authority on certificate formats. One large Chinese CA insists on using BER formatting rather than DER. A German CA I am familiar with says it's OK to sign a CRL with a certificate that is not on the issuing path. I was told it's OK because they have a letter from the German government telling them that it's OK. Of course, we always treated "INTEGER" as unsigned in the case of serial numbers. Getting back to the original complaint of weak certificates -- serial numbers just need to be unique. They play no part in certificate strength. Look to the hash function and RSA modulus for strength. I've seen MD5/RSA512 in the wild.

MK • March 18, 2019 5:11 PM

I've looked at the CA/Browser Forum document and (IMHO) the requirement for (at least) 64 bits of CSPRNG output is just dumb. One reason for not issuing sequential serial numbers is to keep the competition from knowing just how many certificates you have issued. But now you need to keep a database of all the serial numbers you have issued to ensure that they are unique (that requirement from 5280 was not changed). Isn't there any adult supervision of these standards?
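For a sense of scale on the uniqueness point, here is a rough sketch using the standard birthday-bound approximation. It assumes serials are uniform random values and borrows the one million figure from the post title; the function name is hypothetical and nothing here comes from RFC 5280 or the CA/Browser Forum document.

```python
import math


def collision_probability(num_serials: int, random_bits: int) -> float:
    """Birthday-bound approximation of the chance that at least two of
    num_serials uniformly random values of random_bits bits coincide."""
    space = 2 ** random_bits
    exponent = num_serials * (num_serials - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


for bits in (63, 64):
    p = collision_probability(1_000_000, bits)
    print(f"{bits} random bits, 1,000,000 serials: p = {p:.3e}")
```

Under these assumptions the chance of a duplicate among a million serials is on the order of 1e-8: tiny, but not zero, so a CA that must guarantee uniqueness still has to check against what it has already issued.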

woody • March 18, 2019 5:13 PM

The largest CRL I know about (no particular search done) has about half a million entries. Is it one million per CRL, or one million over all CRLs?

1&1~=Umm • March 18, 2019 5:40 PM

@Scot:

"A "serial number" should, by its definition, be sequential, and by conventional usage, also unique"

Whilst your second premise is true, the first is not, not even by convention (think bank notes).

In the main it happened because it was simpler, and thus less expensive, to make them sequential.

If you look at Physically Unclonable Functions (PUFs) they are assumed to be both random and of such a size / complexity that the probability of them not being unique based on the first assumption is below a certain threshold to make a clash highly improbable.

Dave • March 18, 2019 8:31 PM

It's more complicated than that, but you'd need to read the incredibly long and mindlessly inane thread that covers this. The serial numbers aren't weak, strong, or anything else. The figure of 64 bits was pulled out of thin air. There is no cryptographic argument of any kind to support it, it was just an arbitrary value that someone invented. The issue is that the CAB Forum baseline requirements say you have to use 64 bits, and by some interpretations of the requirements some CAs may not have done this - the wording is extremely vague and open to many, many different interpretations.

So the issue is that some CAs didn't quite demonstrate sufficient compliance with an arbitrary and vaguely-worded requirement in the CAB baseline requirements, not that there's any kind of security issue.

1&1~=Umm • March 19, 2019 6:29 AM

@Bruce Schneier:

"This really isn't a security problem; the serial numbers are to protect against attacks that involve weak hash functions"

Are you saying that what was once seen as just a 'useful additional field' in a specification later became co-opted/promoted to use as effectively a 'crypto nonce' to ameliorate a security deficiency in the primary standards?

In essence a chance opportunity to avoid backwards compatibility and other legacy issues that would arise from re-issuing an existing standard. That in turn arguably arose because of a lack of foresight or knowledge.

If so it might be worth pointing out that was what it was, in the same way as the original WiFi specification, which used what became known by researchers as ARC4 in the quickly broken 'Wired Equivalent Privacy' (WEP). Again caused by a lack of knowledge.

Had the 'General Developer' population been told that the specification was at fault, and that the serial number was being promoted to a 'security function' in what should have been just a very short term temporary fix, then we might not have seen it go horribly wrong as it has done. That is, effectively important 'information was withheld' from the community, thus they did what they thought was pragmatic within the scope of their knowledge and understanding.

Without the information being correctly provided to them, how can we expect the general developer to not just make the correct changes, but also develop a 'sixth sense' knowledge about security and thus treat things with caution?

Which in turn will make programmers more likely to develop their security related knowledge and avoid making mistakes at the time, and also in the future when similar circumstances almost certainly re-occur.

Look on it as being constructive in their "Live to learn" life long learning and development journey.

@Dave:

"The serial numbers aren't weak, strong, or anything else. The figure of 64 bits was pulled out of thin air."

No it was not pulled out of thin air, it is most likely a compromise based on historic knowledge and issues in part to do with the implementations of integer mathematics on computers going back to the 1950's if not further.

That is back well before most modern programming languages were even thought of, and computers were very very expensive being thousands of dollars per bit of bus width and the associated ALU arithmetic --not logical-- functions.

Worse, historically ALUs and thus buses had little or no commonality of 'bus widths' or size multipliers. Many early systems used multiples of '3 bit octal digits' whilst other later systems used '4 bit nibbles', some used both at convenient points. The reasoning was to do with the conversion of integer numbers in binary back to an easy to handle for humans approximation of decimal numbers, with some early CPUs actually having addition and subtraction instructions to deal specifically with 'Binary Coded Decimal' (BCD) number representation. You can still see the legacy of the 3/4 bit divide to this day, not just in hardware such as the early MicroChip PIC processors (12bits) and various IBM and similar 'Big Iron' systems (24 and 36 bits), but also in the permission bits of *nix operating systems, the history of which goes back at least as far as having compatibility with the now archaic attributes of the 'General Comprehensive Operating System' (GECOS) operating system. That came from GE's Computer division, which made 'Big Iron' mainframe computers back in the early 1960's, used by the military and their supporting research institutions including a few Universities who were 'Gifted or Subsidized' GECOS systems as part of a marketing plan to 'capture the market'.

But also causing significant issues was 'signed integer' mathematics: some computers used 'One's Complement' maths, which had the issue of 'two zeros', and others 'Two's complement', which has the issue of 'unbalanced number ranges' for negative and positive integers that cause headaches aplenty for those writing 'extended maths functions'. Signed integers, both types of which effectively have implicit most significant sign bits, cause another issue with 'number ranges', in effect halving the number range compared to unsigned integers. Unsigned integers are all that base computing hardware actually works with even in maths co-processors. The concept of negative numbers is not implicit in most ALUs and address bus calculations, hence issues with IAx86 'poor mans MMU' segmenting.

Historically 'signed' is a 'bolted on after thought' to make life simpler for programmers not just with basic maths but more importantly 'branching' program flow.

Thus 'C' developed in the late 1960's from 'B' and Algol, and had to support both bus width multipliers efficiently to 'become useful'. Which caused an 'integer math' issue the legacy of which we still see today, because later languages were often written in 'C'. Put simply integers are handled very badly in 'C' in that a 'lowest common denominator' approach was taken. If you want to make your own 'larger integers', which used to be done on 8 / 12 / 16 bit CPU architectures to do 32 and 64 bit integer maths, you had to write convoluted code as there was no access to 'CPU flags' to easily work with 'overflows and underflows' for addition and subtraction / comparison that then caused other issues with multiplication and division (don't ask about floating point that is another add on programmer convenience that adds even more issues).

So the '64 bit unsigned' was very much not "pulled out of thin air", it was at the very least a compromise between 'in language available' programmer convenience and trying to solve a security failing in a standard. All without having to reissue the base standard which would have given rise to all sorts of backwards compatibility issues, or legacy issues with trying to get code patches released. But primarily all without having programmers stepping out of their normal working zone or worse, opening up a fresh batch of security vulnerabilities and so creating yet further patches etc.

Dave • March 19, 2019 8:51 AM

@1&1~=Umm: Read the discussion thread I referenced. Some guy went through all the research papers related to it and couldn't find any evidence for the value anywhere. It was literally made up for the CAB forum documents.

Faustus • March 19, 2019 10:23 AM

Bruce: "63 bits instead of the required 64. That may not seem like a big deal to the layman, but that one bit change means that the serial numbers ONLY HAVE HALF the required entropy."

Faustus: No, the serial numbers will have 63 bits of entropy rather than 64. That is not half the entropy. Entropy is a logarithmic measure of the number of possible states, not the actual number of possible states.

Time has passed. Nobody contradicts my logic. But still no correction from Bruce.

I look at this blog as a proving ground of a lot of what we say. This situation makes me think of the public policy technologist discussion. Because Bruce is a public technologist and I think he just made a significant error that would mislead anybody using his statements to evaluate entropy. Not to dump on Bruce. I think it shows a flaw in the entire concept of expert technologist.

There remains the chance that I am somehow wrong in my statement, but that seems unlikely. Am I so annoying that no one would bother correcting me? I hope not. I am one of the rare birds who will happily acknowledge a mistake.

Assuming I am correct these are the things I think this error and lack of reaction communicate:

1. Being a public expert is inimical to being a technologist. If you are not in the trenches every day with technology your knowledge quickly becomes stale. Bruce wrote important ciphers in the 90s. At some point he clearly understood what entropy is. But now that he is an expert and a public figure this information appears to not be immediately accessible to him because his mind is no longer focused on tech, but rather social issues.

2. As an expert, one may be disinclined to admit that one has been mistaken because it harms one's brand. Or perhaps one might downplay the error. "Logarithm smogarithm! You know what I mean!". But people who come to you for expertise probably are unable to determine what is literally true and what is "close enough to push under the carpet".

3. In general I think experts are much more heavily incentivised to appear expert-like than to actually be correct. If an expert won't engage with a sincere concern about the correctness of their statements, an expert just becomes another flavor of politician.

4. If public interest expertise is to be a real helpful thing, at a minimum experts must be scrupulous with the truth, fast with self correction, and be just as willing to communicate information that does not support their policy positions as information that does. Otherwise it's just more of the current post truth bs.

Again, I respect Bruce. I think all I say applies to all the experts of this era. Bruce just provides a powerful specific example of why I am not comfortable with the whole idea of "expert" wherever it diverges from "practitioner".

MK • March 19, 2019 10:48 AM

@1&1~=Umm: The serial number in a certificate plays no part in the cryptographic security of the certificate. It is just a number to be used in identifying a particular certificate, and disambiguating certificates with the same Issuer Name for the purpose of identifying a particular certificate in a CRL. I have to agree with @Dave that the serial number requirement in CAB was based on ignorance.

Faustus • March 19, 2019 12:13 PM

@ To all who think the 63 vs 64 bit issue is a nit

I think the most important issue here is non conformance to specs. If people don't follow specs because they don't see the need it sets a bad precedent. Not seeing a need is far from a security proof. Eventually someone will not see the need for something that is really important.

Weather • March 19, 2019 2:23 PM

A 1bit from 2bit isn't much but 63 from 64 is, maybe you should read two, they seem linear, but the problems arise from a=a but the next a=a is 1.67 times plus that, t
The comp you could transfer a HDD but 100 years go by.

1&1~=Umm • March 20, 2019 1:09 AM

@Dave:

"Read the discussion thread I referenced."

Sorry to sound picky but you 'mentioned' a thread, but did not actually 'reference' it, and you provided no way to uniquely search for it...

But to get back to the point the issue has been discussed before and people still don't grok the problem. So,

A) The CA Baseline Requirements in Section 7.1 say the output 'must be at least 64 bits of randomness'.

B) RFC 5280 only half helpfully says the top bit must be zero, which is actually ambiguous when the integer is in a larger field.

Despite the above, EJBCA totally failed to understand A and B correctly and defaulted to using 63 bits + zero, which guarantees non-compliant certs. Importantly though, despite it being known, nobody cared to act on it, even though it had been pointed out by others, including Peter Gutmann, on a number of occasions. That is, until 'DarkMatter' the big bad wolf came along, whom everyone is told they should hate, despise and cast out because they are 'the bad guys'... So when a certain well known organisation opens the door and says 'Hi, welcome, have a seat at the table', people start to make rumblings and look for reasons to 'cast out'. So the self appointed 'good guys' needed an excuse, and predictably did their 'brain in neutral' thing, and took the lid off the can, and lo and behold all sorts of worms wriggled out into the light of day... Oops.

Thus illustrating the problem of a rules based ecosystem, where by design, the only rules are definable and testable as they should be. And because nobody can codify 'Don't be evil' they can not make a discriminating and testable rule for it...

But going back a bit, as I noted the issue between A&B was spotted and discussed before the spec went out, and nobody thought it was really going to be an issue and thus looked the other way.

So what should the EJBCA have done?

Well it's not entirely clear at first sight because most are not experienced engineers designing communications protocols for a living. Rule A says that 64 bits of any valid integer number must be from the binary set where each bit can be a zero or a one. Thus none of those 64 bits can be a sign bit. And rule B says the top bit of any integer must be a zero, Which whilst correct is only half the problem. Because the size of the integer is not defined, but the size of the field it is in has to be a minimum of X bytes which is way more bits than 64.
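The interaction between the two rules can be sketched with the ordinary DER encoding of an INTEGER (two's complement, minimal length). This is an illustration only, not EJBCA's actual code, and the function names are hypothetical.

```python
import secrets


def der_integer_content(value: int) -> bytes:
    """Content octets of a DER INTEGER for a non-negative value:
    big-endian two's complement, using the fewest octets that keep
    the leading (sign) bit zero."""
    if value < 0:
        raise ValueError("serial numbers must not be negative")
    length = value.bit_length() // 8 + 1
    return value.to_bytes(length, "big")


def serial_63_random_bits() -> int:
    # 8 random octets with the top bit forced to zero so the value stays
    # positive in exactly 8 octets: only 63 of the bits are random.
    return secrets.randbits(64) & ((1 << 63) - 1)


def serial_64_random_bits() -> int:
    # A full 64 random bits. When the top bit is 1 the encoding grows a
    # leading 0x00 octet (9 octets). A real CA would also reject zero
    # and check uniqueness before issuing.
    return secrets.randbits(64)


print(der_integer_content((1 << 63) - 1).hex())  # largest 63-bit value
print(der_integer_content(1 << 63).hex())        # smallest value needing 9 octets
```

With this encoding a serial carrying 64 random bits occupies nine octets roughly half the time, which is why squeezing the value into a fixed eight octets while keeping it positive leaves only 63 random bits.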

So the issue of finding the size of an integer in an unknown but sufficient field size comes into play. From communications theory the simple way is to have a known lead-in marker. Usually because of noise and synching issues it will be a multibit pattern like 01111110 that is a unique inband marker that can only be used for that purpose (see HDLC comms protocol* for a full explanation going back into the 1970's at least). However, in the case of a reliable data field you can do things the way Alan Turing did on his tape. You simply start at the left and go down bit by bit until you meet a marker, which I'll call the 'start bit'; this marks the start of the integer. In this case that would be a 'one', as all the preceding bits in the field are by convention zeros. Also, as by convention signed integers start with the sign bit, which is zero for positive numbers and one for negative numbers, simplicity and comms security say that the start bit can not also be used as a data bit, so the next bit must by rule B be zero. So your lead-in would be '...010' in front of the 64 bits of entropy, and the lead zero of the lead-in can be zero or more occurrences of zero by convention.

So the actual minimum size for the integer field would be 66 bits, but in a larger field size it would be 66 bits preceded by all lead bits being zero.

But modern computers usually have as a minimum, data widths based on multiples of eight bits. So if EJBCA had been a bit more helpful and said either,

1, Eight bytes of random bits prefaced by a fixed start byte of 0x02 right justified in the field with all bytes preceding that being 0x00.

2, Nine bytes right justified in the field, the first byte of which the two Most significant bits are '10' followed by six random bits followed by eight bytes of eight random bits.

There would be no issues. But hidden away is the fact that despite what is implied ASN.1 is ambiguous and ASN.1 is also the chosen standard to define the likes of communications data object structures. Further is the caution of professional engineers and cryptographers, as Peter Gutmann said in Jan 2013,

"'There's a second but: Historically many encoders have gotten the signedness of integers wrong, which means that (a) if you get a negative number (at least in the area of crypto, which I'm most familiar with) it's always an encoding error and never a deliberate use of a negative value, and (b) because of the widespread use of incorrect encoders, many decoders treat all integer values as unsigned. So while you can use negative values in theory, it's not a good idea in practice.'"

http://lists.asn1.org/pipermail/asn1/2013-January/000959.html

* For an understanding of HDLC and its origins see, https://en.m.wikipedia.org/wiki/High-Level_Data_Link_Control

Jenny D • March 20, 2019 2:36 AM

There's one thing here that is consistently misstated: This is not a software issue, it's a configuration issue.

The default config, since 2001, was to have 64 bit serial numbers. But the EJBCA software has supported longer serial numbers since April of 2014 - see the release notes for the EJBCA open source version 6.1.1. But as a rule, upgrading EJBCA won't overwrite your configuration - nor should it.

CABforum's requirement of 64 bits of entropy came in 2017.

EJBCA changed its default configuration in February 2019, i.e. well before the story broke on Ars Technica. But, again, changing the default config for fresh installations will not overwrite any currently used config.

At any point when there's any change to the requirements a CA purports to follow, it's incumbent on the administrators to look through their configuration and make any required changes. If they don't do that, the error does not lie with the software.

Should the default config have been changed earlier? You could certainly make a case for that. But the way I see it, the main issue here is that the CA admins did not verify that the running configuration was correct for their CA when the requirements changed.

1&1~=Umm • March 20, 2019 7:09 AM

@Faustus:

"To all who think the 63 vs 64 bit issue is a nit"

Do you want to take a second run at that?

Because I really think it does not say what you thought it did at the time you wrote it.

As for whether 64 bits rather than 63 is important or not, it depends on how you view it.

Each additional bit doubles the brute force search time, I think we can all agree on that, even if we know there are other methods than brute force.

Which brings us to the question of 'safety margins': how do you decide at any point in time what is sufficient at that point and will remain so for say a quarter of a century on from that point?

The simple answer is you can not, nor could you in the past either. History teaches us two relevant things: firstly the development of man himself is incredibly slow, but mankind's development of tools is happening at an unpredictable but increasing rate. Many acting as the equivalent of force multipliers.

The laws of physics however tell us there is actually a brick wall we are heading towards at that ever increasing rate. In theory there is no way around that wall, but man is generally inventive, though unpredictably so. As for computer logic we are already at the point where the raw performance of our electronics is within a few percent of that brick wall for sequential processing. So any break through there is going to have a minimal impact. However we also know that sequential processing is not the only game in town. That is, we can do things in parallel in a lot of cases, and it just so happens that breaking weak hashes is one of them. Again the fundamental rules of physics apply as there is only so much that we can do in terms of raw materials.

But as we have found, there are other laws of physics that are strange indeed beyond what we can usefully understand from our comparative macro world experience of them. Thus at any moment quantum computing could appear, or maybe not, it's an unknown which appears to be in general just out of reach currently. On the assumption it happens people will get to understand it practically and in the process be able to see new methods and potential challenges ahead. The point being not only do we not know what they will be or if they will be worth exploiting or not, we also have absolutely no idea of when they might happen.

Thus the only thing we can say is that safety margins are just a guesstimate based on gut feelings, none of which is really quantifiable in any way.

So it's probably best to 'over egg the pudding' compared to the specification minima, to get the security margins rather more on the safe side. Or is it? Because if you look back at 'clipper' of Crypto-Wars One, its safety margin over that of its specified bit size was actually very small indeed, a matter of bits.

Thus in the case of Clipper the loss of just one or two bits would have wiped out its security margins. Even without a known security flaw I doubt very many people would now consider it secure. As for other NSA crypto they have tried to get public acceptance of, it appears nobody wants it and people actively fight against it.

Oh and of course we had found out that the NSA had found ways to forge PubKey Certs with weak hashes; at that time we did not know how, and we may possibly never know how they came to do it. But the open community, knowing it was possible, then found its own ways fairly quickly, which they might possibly not have done for quite a while had they not known it was real world possible.

But hopefully there has now been enough 'noise' over this for the information to get out there and become knowledge in developers' heads, so hopefully history won't repeat in the same rut, because security needs to keep moving forward and not spin its wheels or slide back as it has in some cases where known knowledge appears to have been forgotten over as little as a couple of decades.

MK • March 20, 2019 10:30 AM

@1&1~=Umm:

"To all who think the 63 vs 64 bit issue is a nit"

In this case, it is a nit, because the serial number plays no part in the cryptographic security of the certificate. It is just one of the items that is hashed in the certificate signature.

I would think that the least disruptive thing to do would be to issue all new certificates with longer serial numbers, replacing the others at expiration.

1&1~=Umm • March 21, 2019 10:14 AM

@MK:

"the serial number plays no part in the cryptographic security of the certificate."

Interesting, yet others believe that it makes the faking of a certificate that used weak hashes less easy... Would that not be something that affected the Cryptographic Security...

Who to believe that is the question...

Sz • March 22, 2019 12:41 PM

That may not seem like a big deal to the layman, but that one bit change means that the serial numbers only have half the required entropy

That should be, technically, one half the "event space."

Still supposedly 63/64 the "required" entropy, because entropy is a logarithmic measure, but I don't quite believe that, either. The gentlemen and political potentates play awfully dumb with the math sometimes, but in a certain way that cannot be corrected by "getting smart" with them.

Chris Becke • April 18, 2019 1:51 AM

Moore's law means the effective strength of any hash or key halves every 18 months anyway. So every 18 months sees a virtual 1 bit bitrot of all existing keys.


Schneier on Security is a personal website. Opinions expressed are not necessarily those of IBM Security.