On factoring of the 4096 bit RSA key

Alternate text

Few days ago, Phuctor announced that they had succeeded in factoring a 4096 bit RSA key. The factor they found for the key was 231.

That's right... 231 = 3 * 7 * 11.

4096 bit keys are supposed to be almost impossible to crack / factor. How did they do it? And what does it mean when keys have such small prime numbers as factors?

If you missed the news, read through these links first...

Reading the first few posts on the above links should tell you that it wasn't a genuine factoring of a valid 4096 bit RSA key. However, even if it was a bogus key, the question now becomes - How did it get on the key servers in the first place?

What went wrong?

There are a few theories around that...

A key generator could have produced a faulty key.
Someone uploaded a fake key (apparently its ridiculously easy to do that).
There was data corruption on the key servers.

Of these possibilities, the first caught my attention... and it also irked me that the first thing that some commentators started questioning is the mathematics and the key generators.

We think properly created RSA keys couldn't possibly have such tiny factors because they were created by sophisticated algorithms, presumably would be two very large primes, and yet... this happens.

Dumb-and-stupid trial division by the first 1000 or so primes wouldn't take much time and could've easily caught this. I see this as a nice precautionary tale that we may sometimes think too highly of sophisticated algorithms that we trust them blindly, and miss looking for the bloody obvious.It's like an "is it plugged in?" sort of thing.

If I deliberately generated a public key that was divisible by 3, I wonder how long it would take for someone to notice... I also entertain the (admittedly very slim) possibility that he did this deliberately to see what would happen.

Its was sad to see how quickly some people, who don't understand the mathematics behind the RSA, or have not seen the algorithm/implementation, are ready to throw in the towel, instead of actually making any attempt to verify the claim.

Anyway... I knew from my cryptography class that any decent prime generation algorithm has a verify stage, which divides the generated prime candidate by a fairly large number of small prime numbers, which is then followed by a set of Rabin-Miller tests.

A key with these factors could not have been generated by any prime generation algorithms that I had studied in my university days. Of course, the current implementation of the prime generation algorithms could be quite different. So, I decided to check one of them out - the most famous and often used one.

Looking through GPG and Libgcrypt source

I downloaded the source code for libgcrypt (used by GPG) from GNUPG website and had a look at the gen_prime method in cipher/primegen.c. The implementation of the prime generation code is not all that different from what I had studied... but there were some pretty interesting lines of code in there. Here are some highlights...

The prime generation code only supports generating primes larger than 16 bits. Any primes below 65536 are beneath this algorithm.
The algorithm begins by randomly generating an odd number of the requested bit length (with 1 as the highest and the lowest bits). This guarantees the minimum size of the prime as well as tests for divisibility by the smallest prime - 2.
It then divides the number by all primes upto 4999 and saves the remainder from all of them for future use. This answers one of the points raised by the comment quoted above... the algorithm does check the prime against the first 669 prime numbers.
If the number turns out to be composite, the number is incremented by 2 and algorithm starts again. The division process is speeded up by a smart way of updating the remainders generated in the last step.
If the number is not divisible by any of the small prime numbers, then the number is checked against a Fermat test (Wikipedia: Fermat primality test), followed by 5 rounds of Rabin-Miller tests.
The method is_prime which conducts the Rabin-Miller tests uses the goto statement in a couple of places. Actually, the whole file / codebase seems to use goto statements quite liberally. This is nothing bad in itself, but this gives a hint about when this code was written. No respectable programmer, in the last 20 years, should be using goto like constructs, unless coding in assembly.

Also, the 5 Rabin-Miller tests in point 5, do seem like a low number of tests. The 5 tests gives a probability for the number being composite as \(\frac{1}{1024} = 2^{-10}\). This implies, in the most naive interpretation, that 1 in every thousand numbers generated by the algorithm could be a composite. However, even then, that composite number cannot be divisible by any prime smaller than 5000. So, even though the algorithm does not actually guarantee that the generated number is a prime, but it definitely cannot have factors of 3, 7 or 11.

Many other specialized methods in the same file, which I believe are for use by large RSA key generators (comments in the file strongly suggest this but I have not verified it), use 64 rounds of Rabin-Miller, which gives the probability of the number being composite as \(2^{-128}\). This is a much better probability of the number being prime.

How much better?

The expected number of primes that you would have to generate before you get a composite number is \(2^{128} > 10^{38}\). That's a really huge number of primes... and even then, the generated number won't have small prime divisors.

RSA with composite keys?

Reading the last few paragraphs, its natural to wonder - what happens if the primes used in RSA are not really primes. After all, the algo under discussion does not 100% accurately guarantee that the 2048 bit prime that it generated is indeed prime.

Would RSA still work or will there be problems with encryption or decryption?

I don't remember my cryptography class well enough right now to answer this question mathematically...maybe we can come back to this topic some time later.

In conclusion... libgcrypt (and GPG) does check for divisibility by small primes, followed by 5 (or 64) rounds of Rabin-Miller. The faulty primes/keys could not have been generated by libgcrypt (GPG). The problem lies with another program, and/or hardware and/or person.

Update From various newer posts on reddit and hacker news, it seems that the compromised keys are completely fake. These keys were probably intentionally inserted into the key servers as sub-keys to master keys. However, these sub-keys had not been signed with the private keys, so they could not have fooled pgp software .

Who did it and why is anyone's guess.

Update 2 It seems that the RSA encryption - decryption test with possibly composite p or q is essentially a fermat test for the primality of p and q. Hence, it is possible that the RSA encryption - decryption might work even with composite p and q. However, its not a sure sign that the number is prime.

Comments

On factoring of the 4096 bit RSA key

What went wrong?

Looking through GPG and Libgcrypt source

RSA with composite keys?

Comments

Published

Category

Tags

Contact