RSA-240 Factored

This just in:

We are pleased to announce the factorization of RSA-240, from RSA’s challenge list, and the computation of a discrete logarithm of the same size (795 bits):

RSA-240 = 124620366781718784065835044608106590434820374651678805754818788883289666801188210855036039570272508747509864768438458621054865537970253930571891217684318286362846948405301614416430468066875699415246993185704183030512549594371372159029236099
= 509435952285839914555051023580843714132648382024111473186660296521821206469746700620316443478873837606252372049619334517
* 244624208838318150567813139024002896653802092578931401452041221336558477095178155258218897735030590669041302045908071447


The previous records were RSA-768 (768 bits) in December 2009 [2], and a 768-bit prime discrete logarithm in June 2016 [3].

This is the first time that records for integer factorization and discrete logarithm have been broken together, and moreover with the same hardware and software.

Both computations were performed with the Number Field Sieve algorithm, using the open-source CADO-NFS software [4].

The sum of the computation time for both records is roughly 4000 core-years, using Intel Xeon Gold 6130 CPUs as a reference (2.1GHz). A rough breakdown of the time spent in the main computation steps is as follows.

RSA-240 sieving: 800 physical core-years
RSA-240 matrix: 100 physical core-years
DLP-240 sieving: 2400 physical core-years
DLP-240 matrix: 700 physical core-years

The computation times above are well below the time that was spent with the previous 768-bit records. To measure how much of this can be attributed to Moore’s law, we ran our software on machines that are identical to those cited in the 768-bit DLP computation [3], and reach the conclusion that sieving for our new record size on these old machines would have taken 25% less time than the reported sieving time of the 768-bit DLP computation.

EDITED TO ADD (12/4): News article. Dan Goodin points out that the speed improvements were more due to improvements in the algorithms than from Moore’s Law.

Posted on December 3, 2019 at 2:12 PM • 46 Comments


Clive Robinson December 3, 2019 3:04 PM

@ Bruce,

So RSA-240 has been broken and another “squeamish ossifrage” (or however you spell it 😉) message revealed.

Also not done by “Crown Sterling” and their magic crystal machine.

That as they say “Are the facts”…

So how about some interpretation and broader scope analysis?

willmore December 3, 2019 4:12 PM

I wonder if it’s time to redo my “how hard would it be to crack RSA-129 with current hardware and software” test. Hardware and software keep getting better every year so it gets easier and easier.

Oh, shoot, I last did it in 2009 using an AMD Athlon64 X2 5600+. Took about a week or so. Damn, I bet it takes less than a day with current hardware/software! Also, I missed the 25th anniversary this year! Well, I guess I’ll have to do it next year in April for the 26th anniversary.

Ross Snider December 3, 2019 5:52 PM

Not included in this excerpt is the analysis that algorithm improvements accounted for an overall 3x improvement over the expected runtime of the factoring and DLP effort.

It’s a lot harder to account for algorithm and software improvements in hard numbers, but I suspect that over time they either compete with, or possibly even outstrip Moore’s law in terms of the total effectiveness of computation.

Weather December 3, 2019 7:29 PM

Hi, I’m trying to work out how you can even crack RSA-240. Could I please have a code segment of the important bits? Sorry, English isn’t my first language.

NFS in the RSA December 3, 2019 7:44 PM

With apologies to Johnny Burnette

To the tune of Dreamin’

Sieving, I’m always sieving
Sieving, looking for primes
Coding, I’m always coding,
Hopin’ someday I’ll find
Primes, primes that are factors
Primes that decrypt things but until then
Well, I’ll keep on sieving
Keep right on sieving
Sieving, till my sieving comes true


Wael December 3, 2019 11:42 PM

@NFS in the RSA,

To the tune of Dreamin’

And to the tune of Sailing, too

Well, it’s not far away from wildrice, at least it’s not for me
And if the sind is right you can Sieve away and find hostility
Oh, the planless can be heracles, just you wait and see.
Naive me.

It’s not far to never-never land, no reason to pretend
And if the wind is right you can find the joy of innocence again
Oh, the canvas can do miracles, just you wait and see.
Believe me.

Sailing takes me away to where I’ve always heard it could be
Just a dream and the wind to carry me
And soon I will be free
Factoring, it gets the best of me
When I’m Sieving

All caught up in the reverie, every turd is a Sunway
Won’t you believe me?
NSA takes me away to where I’ve always heard it could be
Just a theme and the bind to harry me
And soon it’ll be scree

Well RSA not far from vanity, at least two hundred and forty
And if the sind is right use Milkyway and find entropy
Oh, the campus can do miracles, just you sate and Sieve.
Peeve me.

Sieving takes me away to where I’ve always heard it could be
Just a dream and the wind to wary me
And soon I will flee

Curious December 4, 2019 4:50 AM

Is there perhaps a best-practice rule in cryptography, or for anyone relying on cryptographic solutions, that sets of prime numbers must always be chosen randomly, as opposed to using numbers from a list or provided from some other source?

It would be bad if anyone used a set of prime numbers off somebody else’s list, right? (Because then an adversary could just collect known prime numbers and try to crack stuff using prime numbers that other people used or generated at some prior point.)

I am also wondering if some prime numbers are worse than others. Something in the back of my head vaguely reminding me of something I might or might not have seen in a youtube video at some point.

And then I guess, there would be the security aspect to working with crypto, in not allowing any potential adversary to somehow record or spy on you selecting or using your set of prime numbers for something like public key cryptography.

Curious December 4, 2019 4:57 AM

To add to what I wrote:

I guess I could imagine there being a simple but huge list of prime numbers, or even some common software for generating them, but it seems to me that using such software would be risky, as risky as storing all your passwords in a password manager in a browser (how anyone could trust something like that is beyond me, seems very bad). Because simply trusting the computer or its software to replace a security regime seems overly risky if you’re doing anything remotely important.

willmore December 4, 2019 5:20 AM

Okay, so I grabbed the CADO-NFS code (2.3.0) and gave it a try. Hardware is a bit better than last time. This time it’s a dual Xeon X5660 w/24GiB of memory. Time:
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 546444/26380.8

That’s 7 hours, 20 minutes wall clock time or 6d 7h 48m total CPU time.

I wonder if I can dig out that old Athlon system to get a software only improvement value…

wiredog December 4, 2019 6:10 AM

There are lots of lists of primes, heck, they used to be published in books. Here’s one list.

As long as you don’t select your primes, or pairs of primes, sequentially you’re probably OK.

Enviralmist December 4, 2019 6:28 AM

Am I the only one marveling at the energy bill this project incurred?
There’s surely worse: crypto-mining, using Java or Unity (snip); but beating a dead horse like this, what was the point?

That is to say, I’ll second Clive Robinson, am quite interested in a broader scope analysis and interpretation.

Security Sam December 4, 2019 8:12 AM

Those who believe in digital privacy
Go through daily life just dreaming
And when they’re victims of piracy
They react predictably by screaming.

Clive Robinson December 4, 2019 8:20 AM

@ Curious,

Is there perhaps a best practise rule in cryptography or for anyone relying on the use of cryptographic solutions, that sets of prime numbers must always be chosen randomly as opposed to using numbers from a list or provided to them from some other source?

Yes there is, and unfortunately it is broken way, way more often than it should be, as I’m sure the NSA, GCHQ, et al know and advantageously exploit.

The first thing you have to realise is that what you call a “list” has a broader meaning than something printed in a book or put in a computer file; it includes problematic Random Number/bit Generators (RNGs) as well.

As you’ve no doubt heard, anyone using a fully deterministic system for generating random number sequences has for some time been considered to be “living in a state of sin”[1]. But at the end of the day computers are (or should be) fully deterministic systems. So important is the point about what some call “True Random Number Generators” (TRNGs) that Alan Turing put the design for such circuits into the fundamental design of computers[2], though few ever did until this century.

The problem is that “true entropy” is a very, very rare beast; most TRNG sources are heavily contaminated not just with “faux entropy” but with various types of bias as well. Cleaning up and extracting the true entropy is a complex and involved process, so most people don’t bother. Instead they use some kind of Cryptographically Secure (CS-) algorithm as “magic pixie dust” to mix things up and make them look sufficiently unbiased to pass the original Diehard Tests[3] and subsequent test suites.

Unfortunately most CS-algorithms, although fully deterministic, also pass these same test suites, which gives you an indication of where problems arise.

Our host @Bruce has designed a number of RNGs such as Yarrow; importantly, part of the design is an “entropy pool” which in effect captures and holds the very rare bits of true entropy and, by a mixing process, spreads them across all bits in the pool.

However there is a problem in that it takes a very long time to build up sufficient true entropy in the pool. And unless the pool is saved at “power down” and reloaded at “power up”, you will have to wait that long each time a computer is powered up.

Whilst this is done in OSs like BSD and some Linux variants, it’s usually not done in the “embedded systems” used for communications and security.

Worse, even when it is done, the initial power up after a factory reset starts with an empty pool, and the system almost immediately goes into the process of generating random numbers for security purposes such as crypto key generation (KeyGen). This of course includes those used for “finding primes”, not just for asymmetric PubKeys but also the primes used in other CS-DRNGs.

As the entropy pool is in effect empty at power up, and embedded systems don’t get much in the way of true entropy, the result is that only a very small subset of the potential random numbers is actually produced. Small enough, in effect, to give the SigInt agencies a “list” of likely random numbers. Once found from the PubKey, the SigInt agencies can then go on and work out the likely values used for any CS-DRNGs…
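The effect Clive describes can be shown with a toy sketch (Python; the 8-bit “entropy” figure and function name are my own, purely for illustration): however strong the deterministic algorithm, its output space can never be larger than its seed space.

```python
import hashlib

def toy_keygen(seed: bytes) -> str:
    """A stand-in for any deterministic key generator: the algorithm may be
    cryptographically strong, but the output is fully determined by the seed."""
    return hashlib.sha256(seed).hexdigest()

# If the entropy pool only ever supplies 8 bits at first boot, a whole fleet
# of devices can only ever draw from 256 possible keys: a "list" an attacker
# can precompute and search instantly.
possible_keys = {toy_keygen(bytes([s])) for s in range(256)}
print(len(possible_keys))  # 256 distinct keys, regardless of algorithm strength
```

The same counting argument applies whether the generator is SHA-256, Yarrow, or anything else: the attacker enumerates seeds, not keys.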

So you can see why using TRNGs correctly is so important; and when you find out how few embedded-system designers actually do things correctly, it should make you think rather carefully about which products you purchase and use.

[1] Back in 1951 according to various Internet sources, John von Neumann first gave voice to,

    Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin

[2] Also in 1951, randomness was finally formalized into a real commercial computer, designed by Alan Turing. The Ferranti Mark 1 was shipped with a built-in random number device using thermal/electrical noise that generated 20 random bits when called via a specific instruction.

[3] The Diehard tests were developed by George Marsaglia over several years; they measure the quality of an RNG via a battery of statistical tests. First published in 1995 on, surprise surprise, a CD-ROM of random numbers, which held 650 megabytes of random numbers that passed the tests. Importantly they were generated by combining a noise-diode circuit output with a deterministic but chaotic-looking source: “rap music”. George Marsaglia called the result “white and black noise”. In essence the result was like the ciphertext output from a One Time Pad.

Curious December 4, 2019 9:00 AM

@Clive Robinson
Oh. Thanks for the feedback.

I thought it had to be easy to manually figure out some huge prime number on paper, but on second thought I guess it isn’t so easy after all. I mean, I thought it would be as simple as randomly picking some size in decimal digits for a desired prime, then selecting random digits for the sequence, and then doing a little work to find some nearby prime number. I would think, being naive here, that a prime number would not be that far off from basically any random number. But maybe such work is hopelessly intricate on pen and paper. Perhaps a calculator could be used for this? Presumably one could trust the calculator if just wanting to check whether a huge number is prime or not. Or some kind of basic computer could be used, not for generating a prime number, just for checking whether it is prime.

Nix December 4, 2019 10:52 AM

@Curious, there are methods used to figure out numbers that are likely ‘prime enough’: look up ‘pseudoprimes’ and the AKS primality test.

MarkH December 4, 2019 12:00 PM


There’s probably a “calculator app” or three you can use, but it’s really convenient to do with Python, either interactively or with a little script you can write.

Python is impressively fluent with big integers; the magnitudes used in public-key crypto are no challenge at all.

A famous result of Fermat is that for any prime n, and integer base b greater than one and less than n-1,

b^(n-1) mod n = 1

Negating this result is not valid: a remainder of 1 doesn’t prove that n is prime; however, in the vast majority of cases composite n will give a remainder other than 1.

Because it isn’t known at the outset whether n is composite, it’s better that base b be itself prime (otherwise, there’s a much bigger risk of false positives).

Ron Rivest is said to have estimated that if n gives a remainder of one for the first four prime bases: 2, 3, 5 and 7, then the odds against n being composite are something like 10,000,000,000 to one (I don’t remember precisely, his estimate may have been even bigger than that). You can make the risk of false positives even smaller by using a few more prime bases, but passing them never proves that n is prime.

In Python, you can compute the modular exponentiation using the pow() function. The arguments are base, exponent, and modulus (in that order).

Here’s a Python interactive example for prime 5,550,113:

n = 5550113
pow(2, n-1, n)  # each of these evaluates to 1
pow(3, n-1, n)
pow(5, n-1, n)
pow(7, n-1, n)

I happen to know that it’s really prime (got it from a table). But this series of “Fermat tests” shows that it’s very probably prime.

There are better tests, including tests that are also probabilistic (not guaranteed) for which the probability of false positive can be proved to be below a given threshold, and much slower algorithmic tests which always give a correct answer, like AKS as cited by Nix above.

But if you want to do a quick test on a general-purpose calculator, Fermat is the simplest.
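The interactive checks above can be wrapped into a small reusable function (a sketch; the function name and the default bases are my own choices, following the four prime bases mentioned above):

```python
def fermat_probable_prime(n: int, bases=(2, 3, 5, 7)) -> bool:
    """Fermat test: returns False if n is definitely composite,
    True if n passes for every base (i.e. n is *probably* prime)."""
    if n < 2:
        return False
    for b in bases:
        if b >= n - 1:
            continue  # base outside the valid range 1 < b < n-1 for tiny n
        if pow(b, n - 1, n) != 1:
            return False  # b witnesses that n is composite
    return True

print(fermat_probable_prime(5550113))  # True: the prime from the example above
print(fermat_probable_prime(29341))    # also True, yet 29341 = 13*37*61
```

The second example is a Carmichael number, which fools the Fermat test for every base coprime to it; that weakness is why stronger tests like Miller–Rabin are preferred in practice.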

Clive Robinson December 4, 2019 12:07 PM

@ Curious,

The problem is checking that a prime really is prime, and it’s a slow process whether done by computer or by pencil and paper.

Thus various tricks are used to cut the candidates down. Back last century the first was a simple sieve (look at the controversy with Crown Sterling to see a fairly simple one). Next came a series of probabilistic tests such as “Miller–Rabin” or similar primality tests, which rule out large swathes of non-primes fairly efficiently. Then the hard grind of “trial division”: dividing by every prime below the square root of the number, or only slightly more efficient methods. As these take a very long time, for obvious reasons they were very rarely done in practice for primes over 128 bits.
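The pipeline described here (cheap small-prime sieve first, then a probabilistic test) can be sketched as follows; this is a textbook Miller–Rabin, not anyone’s production code, and the round count is an arbitrary choice:

```python
import random

def miller_rabin(n: int, rounds: int = 20) -> bool:
    """Probabilistic primality test: 'composite' answers are certain,
    'prime' answers are wrong with probability at most 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):  # cheap sieve by small primes first
        if n % p == 0:
            return n == p
    # write n-1 = d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is composite for certain
    return True
```

Unlike the plain Fermat test, Miller–Rabin is not fooled by Carmichael numbers, which is why it is the standard choice for screening candidate primes.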

Then in August 2002 three computer scientists in India came up with a new, more efficient way of proving primality. Known as the “Agrawal–Kayal–Saxena primality test”, it had a number of estimates on running time, which was a cue for mathematicians to “dog-pile” on it. Within a year there were a large number of papers and the term “AKS-class primality tests” was coined.

However they all fall into the “galactic algorithm” class: algorithms that are more efficient than others, but only beyond some point so large (greater than the number of atoms in the galaxy) that the algorithm will never be used in practice.

But the thing about “primes” is they are not all the same; in effect they have personalities, falling into classes. “2” is obvious because it’s the only even prime; most know about Mersenne primes, which are a subset of Mersenne numbers. The thing is that these classes have properties that might or might not be good news depending on your application.

Thus picking primes that don’t fall in some classes but do in others is a quite important consideration in certain applications or domains (cryptography being just one).

One obvious class to avoid is “twin primes” that fall on either side of primorials (think factorials, but multiplying only successive primes rather than all integers). These primes are very easy to find because they not only have known points, they also reflect around these points, making other nearby primes almost as easy to find.

To see this, draw up a number line from 1 to 210 and look around the primorial 30 and its images at 60, 90, 120, 150, and 180. You can use this to build a more efficient sieve than others.
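The sieve idea here can be sketched directly (Python; the names are mine): numbers coprime to 30 = 2×3×5 fall at eight fixed offsets in every block of 30, so a “wheel” sieve only ever has to test those candidates.

```python
WHEEL30 = (1, 7, 11, 13, 17, 19, 23, 29)  # the residues mod 30 coprime to 30

def wheel_candidates(limit):
    """Yield 2, 3, 5, then every number in (1, limit] coprime to 30."""
    yield from (2, 3, 5)
    for base in range(0, limit, 30):
        for off in WHEEL30:
            n = base + off
            if 1 < n <= limit:
                yield n

def is_prime(n):
    """Plain trial division: good enough for a small demonstration."""
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True

primes = [n for n in wheel_candidates(100) if is_prime(n)]
print(len(primes))  # 25 primes below 100, having tested only 8 of every 30 numbers
```

The symmetric offsets (1 and 29, 7 and 23, 11 and 19, 13 and 17) are exactly the “reflection” around each multiple of 30 mentioned above.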

MikeA December 4, 2019 12:30 PM

@Clive: Well, one could proceed by

“hooking the logic circuits of a Bambleweeny 57 Sub-Meson Brain to an atomic vector plotter suspended in a strong Brownian Motion producer (say a nice hot cup of tea)”

although there may be side-effects that include awkward staff parties and interstellar travel.

More seriously, I recall reading a discussion of Von Neumann’s comment. It is possible (even likely) that he was not averse to Pseudo Random Number Generators per se (sometimes using them himself) but was concerned that people would use them for situations where they were not fit for purpose, much as they did (and do) pretend that floating point numbers are in fact “real numbers”, then forget that they were pretending.

Clive Robinson December 4, 2019 12:54 PM

@ MikeA,

say a nice hot cup of tea

I could kill for one right now…

I’ve ended up in Accident and Emergency this morning, and it’s now 18:50 UK time… Thus I have had nothing but water, and not eaten either.

Hence I am now staring at a blank white wall and starting to have hallucinations of roast chicken and potatoes, carrots, peas, and a nice side order of cabbage on low flying plates chased by silverware.

Before anyone asks, it’s not serious, only extremely painful; now the painkiller is wearing off, that chicken is looking more illusory by the minute 🙁

MarkH December 4, 2019 1:17 PM


Very sorry, that you have such troubles today. I hope that it passes (or at least subsides) in the shortest time.

I wish you the joy of a nice cooked dinner (and some hot tea) as soon as you’re able to manage it.

SpaceLifeForm December 4, 2019 3:30 PM


‘But the thing about “primes” is they are not all the same they in effect have personalities where they fall into classes’

This. Powers of 30 are interesting. Not just multiples of 30.

Left-handed vs Right-handed primes are directly related to 30.

I’m sure you know there are more Left-handed primes than Right-handed primes in an interval.

Also, harder to factor semi-primes are Left-handed. IIRC, most of the RSA challenge semi-primes are Left-handed.

For some reason, IMHO, this handed-ness relates to physics. See chiral.

Get well.

MarkH December 4, 2019 5:46 PM

@Ross Snider:

That factor of 3 improvement (roughly) in efficiency of the software is, for me, the dramatic headline for this story.

GNFS was already about 15 years old when RSA-768 was factored. The work done in the ensuing decade has been impressively productive.

Jesse Thompson December 4, 2019 7:19 PM


Have you got a link talking about these primes with handedness? At least partially thanks to ambiguous keywords, I’m not pulling up anything that sounds relevant in searches.

I can get “left handed prime ministers” or “chirality of prime, alternating knots”… but nothing I can see pertaining to number theory.

Q December 4, 2019 9:49 PM

News article. Dan Goodin points out that the speed improvements were more due to improvements in the algorithms than from Moore’s Law.

I think speed-up is due to improvements in the implementation of NFS. The algorithm, NFS, hasn’t been improved AFAICT, but the coding and parameter selection for it have been gradually getting better over the years.

Dave December 4, 2019 11:47 PM

One interesting thing this shows is how out of touch theoretical estimates of security are with actual results. We can draw a fairly simple line through factoring results from the last 30-40 years and see what sort of effort will be required in the future, and yet NIST’s numerologists are telling us we need to use up to 15360-bit (that’s not a typo) keys to be secure.

Presumably that’s when your attacker is space aliens with time machines or something.

Curious December 5, 2019 6:45 AM

I too would like to know more about left and right handed prime numbers. 🙂

I like reading/watching stuff I don’t understand. Most of the math and tech stuff on youtube is beyond me. Luckily, such videos are usually just an hour long or so at most. The notion of ‘chirality’ sounds somewhat familiar to me. Something about how there appears to be more matter than anti-matter in the universe iirc.

I wonder if there is anything interesting to say about attempting to make a comparison between human DNA and crypto. I can sort of imagine that, maybe if your way of “patterning” data is clever enough, maybe you could pack more data than you would otherwise think. This is ofc, based on my near zero knowledge about DNA (human genome data), but apparently DNA also seems to just work as a mechanism as I understand it, even if perhaps having flaws or errors.

So, eh, one reason I came to think about DNA and crypto, is this other idea of being able to split ciphertext into parts for long term storage and maybe rendering everything non-decipherable unless you have all the parts to start deciphering, that is ofc, assuming that the plaintext wasn’t somehow baked into the ciphertext in such ways, to read it all from any one of numerous fragments of any split ciphertext. Presumably, a split ciphertext would be just text documents with 0s and 1s juxtaposed in correct sequence. So I can at least imagine that having too much overhead re. size of a ciphertext, might be a bad thing again. Perhaps better to include lots of garbled text in the original plaintext, as opposed to some kind of padding (yes, I don’t quite know what I am talking about, but please indulge me), unless that is something that can be detected somehow?

Silly thought, and off topic I guess: How about an “organic” computer (think rebuilding itself internally, as opposed to relying on fixed circuitry) that has everything on it encrypted on the fly; then the computer core (whatever that is) keeps a tiny part of all encrypted files, not so much for opening individual files up for reading, but for even allowing such decryption to happen in the first place.

Curious December 5, 2019 7:01 AM

To add to what I wrote:

Re. my earlier notion of it being a bad thing if one ends up “baking” one’s plaintext into multiple instances inside a ciphertext, so that you could decipher it all from a fraction of a given piece of ciphertext: this again reminded me of how I sort of don’t trust homomorphic encryption, or rather the general idea of it, and more importantly, the way I recklessly imagine fully homomorphic crypto to be. Because if homomorphic encryption ends up being an amorphous mass of ciphertexts, how can one be sure there aren’t compromising details from surveillance, or unknown person-related data, baked into parts of the ciphertext from the big heap of data one is decrypting from? Even if one believed any specific decryption was limited to, say, just some minute data about something, the entire thing could perhaps(?) contain more types, or entire contents, of encrypted data than you could imagine, in some encryption blob regime of the future as I imagine it (think cloud computing).

MarkH December 5, 2019 8:59 AM

@Jesse, Curious:

I’m guessing that SpaceLifeForm was referring to truncatable primes, an obscure little corner of number theory.

It deals not with inherent properties of numbers, but rather their representations as numerals. They are primes with the property that if you iteratively delete a digit from one end (or for one category the pair of digits from both ends), the truncated numerals are also primes.

From wikipedia, 9137 is a left-truncatable prime “since 9137, 137, 37 and 7 are all prime.”

It can easily be seen that this property depends on the radix (base) used to write the numeral. Each of the several categories is finite in number; for example, in decimal (radix 10) notation, “there are exactly 4260 left-truncatable primes, 83 right-truncatable primes, and 920,720,315 left-and-right-truncatable primes.”

Perhaps “Space” was referring to left-truncatable primes (in decimal) being about 50 times as numerous as their right-side counterparts.

For me, this looks like rather a mathematical dead end, though it might be amusing to explore the asymmetries. At first I thought that the determination of the sizes of these finite sets might involve some interesting proof, until I realized that they are quite simply computed by exhaustion.

The computation is surely time consuming for the left-and-right case, but it wouldn’t need much time to write the code.
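The “computed by exhaustion” remark is easy to make concrete; here is a sketch (Python; my own code, with a deterministic Miller–Rabin whose base set is known to be exact well beyond the 24-digit largest left-truncatable prime):

```python
def is_prime(n):
    """Deterministic Miller-Rabin: these bases are exact for n < 3.3e24."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def truncatable(left: bool) -> set:
    """Exhaustively grow truncatable primes digit by digit (decimal)."""
    found, frontier = [], [2, 3, 5, 7]
    while frontier:
        found += frontier
        nxt = []
        for p in frontier:
            k = 10 ** len(str(p))
            for d in range(1, 10):  # zero digits are never allowed
                q = d * k + p if left else p * 10 + d
                if is_prime(q):
                    nxt.append(q)
        frontier = nxt
    return set(found)

L, R = truncatable(True), truncatable(False)
print(len(L), len(R), sorted(L & R)[-1])  # 4260 left, 83 right; largest two-sided: 739397
```

The search is finite precisely because every longer truncatable prime must grow from a shorter one, so once a frontier dies out the enumeration is complete.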

I think my favorite variant is the “two-sided” primes, which are both left- and right-truncatable. This set is trivially obtained as the intersection of those two sets; in decimal there are only 15, the largest being 739397.


MarkH December 5, 2019 9:26 AM


To agree with your observation, but in a more nuanced form … [disclaimer: my understanding of the NFS is microscopic]

It seems to me that although the Number Field Sieve is called an algorithm, it’s more like a general strategy.

More traditional algorithms like Euclid’s (for greatest common divisor) or AKS (deterministic primality test) imply a definite sequence of operations.

On the other side, ECPP (Elliptic Curve Primality Proving) and NFS specify an approach to the problem, but leave a variety of open decisions.

My clumsy metaphor is to imagine the problem of traveling between two addresses on opposite sides of some great city. If someone devised a precise sequence of directions (turn here, embark there) that would be more like a traditional algorithm. In contrast, if someone determined that “the fastest way during rush hour is to travel some segments by taxi and others by subway/underground/metro,” that would leave quite a lot of open decisions concerning how specifically to make the trip.

In the ’90s when I was reading about Primo (ECPP software), I recall learning that the designers hoped the set of curves they chose would work for all numbers, but they had no way to prove it. In other words, if it proved a number prime, you could be sure that was correct; but a failure to prove might possibly be the case of a prime that wasn’t compatible with the selected family of curves…

NFS is enormously complex to carry out: RSA-240 was factored via 300,000 lines of source code. I imagine that development entailed a whole forest of decisions to make and tactics to choose, with numerous complex steps each requiring their own algorithms to implement.

Probably the terrific speed-up achieved by these workers is the fruit of many, many hours of analysis and experimentation to find improved methods for many corners of this vast computation.

SpaceLifeForm December 5, 2019 3:56 PM

Left-handed and Right-handed numbers

Left-handed: L = 6x – 1
Right-handed: R = 6x + 1

We don’t care about numbers mod 2 or mod 3.

This is about mod 6.

While one can visualize this on a number line, treating the L numbers as negative, I’ll show the reason why I call them L and R. I explain the number-line angle below.

Two columns of numbers:

L 1 R
5 7
11 13
17 19
23 25
29 31
35 37
41 43
47 49
53 55
59 61
65 67
71 73
77 79
83 85
89 91
95 97
101 103
107 109

Note the ‘1’ between the L and R column headings.

Now, study the patterns.

An L * R = L
An L * L = R
An R * R = R

Note that even powers are always an R,
but an odd power of an L is an L, and an odd power of an R is an R.

It is not symmetric.

There are more composite R than composite L in a range.
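The multiplication rules above follow from arithmetic mod 6: L numbers are ≡ 5 ≡ −1 (mod 6) and R numbers are ≡ +1 (mod 6), so handedness multiplies like signs. A quick sketch to verify (function name is mine):

```python
def handedness(n):
    """'L' for 6x-1 numbers, 'R' for 6x+1 numbers (n must be coprime to 6)."""
    r = n % 6
    if r == 5:
        return 'L'
    if r == 1:
        return 'R'
    raise ValueError("n shares a factor with 6")

# exhaustively check the three rules over a small range
Ls = [n for n in range(5, 200) if n % 6 == 5]
Rs = [n for n in range(7, 200) if n % 6 == 1]
assert all(handedness(l * r) == 'L' for l in Ls for r in Rs)  # L * R = L
assert all(handedness(a * b) == 'L' or False for a, b in [(5, 7)])  # e.g. 35
assert all(handedness(a * b) == 'R' for a in Ls for b in Ls)  # L * L = R
assert all(handedness(a * b) == 'R' for a in Rs for b in Rs)  # R * R = R
print("rules verified: handedness multiplies like -1 and +1")
```

This also explains the asymmetry noted above: even powers always land on R (like squaring −1), so composite squares pile up on the R side.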

Leaving aside 2 and 3, because this is about L and R forms, note that this triple of twin primes is the only triple in existence.

5,7; 11,13; and 17,19

It is impossible for there to exist any more triples of twin primes.

You can see why by inspection mod 5.

But, I believe there are infinite doubles of twin primes. The first example here being 101,103 and 107,109.

Note that 41,43 and 47,49 could be a candidate double (see the mod 5 boundaries), but it gets knocked out because of the 49, 7 squared.

The next candidate block for double twins is 71,73 and 77,79 but that is knocked out via 77.

Again, note the 1 between the L and R column headings.

Let’s look at the first composite L, 35.

Its factors are 5 and 7.

Starting with the L of 5, if you count down the left side 5 times, you reach 35.

Starting with the 7, if you count up to the 1, and then back down the L side, 7 times, you reach 35.

Another: 55 = 5 * 11
This is L * L = R

In this case, you count from the 11 up the L side, thru the 1, and back down the R side. 11 steps. You reach 55.

But, for the 5 to reach, you count up thru the 1, then back down. Rats. You only reach 25.

Must count 5*2 steps.

Where did the 2 come from?

6*2 – 1 = 11

Now, imagine a number line, shortened here for brevity, and consider the L numbers as negative on the number line:

… 35 29 23 17 11 5 1 7 13 19 25 31 37 …

Can you spot the sine waves?

How the waves miss the primes?

How true semiprimes are hit by two different waves?

Weather December 6, 2019 11:40 PM

Would it make sense to take every millionth prime and put it in a file? Next time you have a number, do some formula-like counting up; if it lands on one of the numbers in the file you know it’s a prime, saving you from doing a full rainbow table.

Clive Robinson December 7, 2019 6:45 AM

@ Weather,

Would it make sense to take every millionth prime and put it in a file

What is infinity over a million?

It would be an infinite size file.

But you don’t need to do that. As I’ve mentioned, you want to avoid “twin primes”, as they usually sit astride a primorial[1], which makes them “primorial primes” that are fairly easy to find without needing a lookup table in a file.

But the problem stretches out from primorial primes: as I’ve also mentioned, their pattern of location reflects around not just the main primorials but also the sub-primorials.

So 30 (2x3x5) and 210 (2x3x5x7) are primorials, and “29,31” are the primorial primes astride 30 that are also “twin primes” (astride 210, only 211 is prime, since 209 = 11x19).

However 210/30 is 7, and thus there are 7 − 2 = 5 sub-primorials, or reflection points: 60, 90, 120, 150, 180. The numbers on either side of these also tend to be primes or twin primes.

If you make a number line from zero to 420 and mark off the primes, you will see that they reflect around the primorials and their multiples. Thus if you work up from 210 you will find the primes match, location-wise, the ones you find as you move down from 210. If you do the same with 30 you will see a reflection around it, and then another reflection around 60.

Obviously you lose a few primes along the way, but when you get into really big primorials of the first fifty or a hundred primes (which you can store in a file) you can then build a “location” file of the positions around a reflection, and use this to build a fairly fast sieve around primorials and their sub-reflections, sub-sub-reflections, etc.

Whilst this makes finding “probable primes” much faster than simpler sieves[2], it makes an adversary’s job simpler as well, so it’s best to avoid the primes that sit close in to the primorial points.

[1] The term “primorial” was coined by Harvey Dubner[3], akin to the more common factorial, which is formed by multiplying the first n positive integers in succession and is something all K12 students should be aware of. The primorial works the same way, except that instead of the first n natural numbers it uses the first n primes. It is sometimes written pn# for convenience: more formally, for the nth prime number pn, the primorial pn# is defined as the product of the first n primes.

[2] Like all sieves they are conceptually simple, and at low values around either factorials or primorials they are faster, but the advantage quickly falls behind plain random-number guessing with probabilistic primality tests.

[3] Who sadly died on 23rd Oct this year. He had quite a number of claims to fame, including writing up the first “card counting” method, and at one time holding the record for finding the most primes over 2000 digits. He also came up with a number of “number sequences”, of which the primorials are one (see sequence A002110 in the OEIS).
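Clive’s primorials and their neighbouring primes are easy to play with. A minimal sketch (the helper names are mine) that builds the nth primorial and tests the numbers either side of it:

```python
import math

def is_prime(n):
    """Trial-division primality test (adequate at these sizes)."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def primorial(n):
    """Product of the first n primes: 2, 6, 30, 210, 2310, ..."""
    product, count, candidate = 1, 0, 2
    while count < n:
        if is_prime(candidate):
            product *= candidate
            count += 1
        candidate += 1
    return product

# 29 and 31 astride 30 are twin primes; beside 210 only 211 is prime
assert primorial(3) == 30 and primorial(4) == 210
assert is_prime(29) and is_prime(31)
assert is_prime(211) and not is_prime(209)   # 209 = 11 * 19
```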

MarkH December 7, 2019 10:40 AM


How to think about finding and testing primes depends in part on how big they are, and whether the purpose is mathematical recreation or for cryptography.

A table of every millionth prime (what Weather was proposing, if I understood) is impossible for crypto, as Clive implied.

Suppose I want a pair of secret 1024-bit primes to make a 2048-bit RSA modulus. If my math isn’t too rusty, there are about 10^305 primes that require exactly 1024 bits to write, so a list of one millionth of these would need about 10^299 entries.
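That 10^305 figure can be sanity-checked with the prime number theorem, π(x) ≈ x/ln x. Since 2^1024 overflows a float, a sketch like this (the function name is mine) works in log10 instead:

```python
import math

def log10_count_of_exact_bit_primes(bits):
    """log10 of the PNT estimate of primes needing exactly `bits` bits:
    pi(2^bits) - pi(2^(bits-1)) ~ 2^(bits-1)/ln2 * (2/bits - 1/(bits-1))."""
    ln2 = math.log(2)
    factor = 2 / bits - 1 / (bits - 1)
    return (bits - 1) * math.log10(2) + math.log10(factor / ln2)

# about 10^305 primes of exactly 1024 bits, so even a table holding only
# one in every million of them would still need ~10^299 entries
print(round(log10_count_of_exact_bit_primes(1024), 1))   # prints 305.1
```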

I’m indebted to Clive for his mention of primorials. Steering clear of adjacency to them shouldn’t be a problem for crypto, because their density is extremely low.

If I did my arithmetic correctly, there are no primorials which need exactly 512, 768, 1024, 1536 or 2048 bits to represent, so one needn’t lose sleep about bumping into them.
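That bit-count claim is cheap to check directly, since there are only a few hundred primorials below 2^2048. A quick sketch (function names are mine); if the final list prints empty, the claim holds:

```python
def primes_unbounded():
    """Simple unbounded prime generator via trial division."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1

def primorial_bit_lengths(limit_bits):
    """Bit lengths of successive primorials, up to limit_bits bits."""
    lengths, product = [], 1
    for p in primes_unbounded():
        product *= p
        if product.bit_length() > limit_bits:
            return lengths
        lengths.append(product.bit_length())

lengths = primorial_bit_lengths(2048)
print([b for b in (512, 768, 1024, 1536, 2048) if b in lengths])
```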

And even if you’re using some custom non-standard key size, the probability of randomly landing close to a primorial (for numbers of PKC magnitudes) is smaller, than the probability that you will buy single tickets for 30 multi-million dollar lotteries and win every one of them.

Clive, I hope you’ll forgive me for positioning you “under the weather” above … it’s my silly way of acknowledging your ordeal, and I think I can presume to speak for your loyal readership in saying:

a) we await news of your improving condition,


b) we salute you for continuing to post here during the thick of it!

SpaceLifeForm December 7, 2019 1:29 PM

@Clive, MarkH

Glad you are back Clive. And for expounding.

“Steering clear of adjacency to them shouldn’t be a problem for crypto, because their density is extremely low.”

Bad assumption.

Re-parse what Clive wrote.

“Obviously you lose a few primes along the way, but when you get into really big primorials of the first fifty or a hundred primes (which you can store in a file) you can then build a “location” file of the positions around a reflection, and use this to build a fairly fast sieve around primorials and their sub-reflections, sub-sub-reflections, etc.”

It’s not simple. This is why good semiprimes, ones that are large and also difficult to factor, require that the two factors P and Q are not only not close to each other, but also fall into different primorial classes.

It’s not that the twin primes thin out, it’s how you can use a primorial to look at semiprimes not (relatively speaking) too far away on the number line.

Relatively speaking. Obviously, the work involved is large with large numbers, which requires extremely large primorials.

You want P and Q to not be too close to the square root of N (the semiprime), but you also do not want P to be small relative to Q.

P and Q should be within 2 to 4 magnitudes of each other. Not closer, not further.

SpaceLifeForm December 7, 2019 3:37 PM

@ Clive

Besides my thought of infinite double twin primes, here’s a really old one:

(old, as in 5 decades ago)

If P is prime, then (P-1)! + P is prime.

I can not even recall how I came up with that.

But, I never found it to be false.

MarkH December 7, 2019 7:58 PM

12! + 13 = 29 • 2503 • 6599

The relation holds true for the first 5 primes; among the next 100 primes, the expression is composite for all but one: 52! + 53 seems to be prime.

It would be interesting to explore whether it can be proven that the number of primes of this form is finite, or infinite.
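The conjecture is cheap to test at small sizes with plain trial division (the helper names are mine); a case like 52! + 53 would need a probabilistic test such as Miller–Rabin, which this sketch does not attempt:

```python
import math

def is_prime(n):
    """Trial division; fine for the small cases below."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def conjecture_holds(p):
    """Does (p-1)! + p appear prime?"""
    return is_prime(math.factorial(p - 1) + p)

# holds for the first few primes, but 12! + 13 = 29 * 2503 * 6599
assert conjecture_holds(5)        # 4! + 5 = 29 is prime
assert conjecture_holds(7)        # 6! + 7 = 727 is prime
assert not conjecture_holds(13)   # the first failure MarkH found
```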

SpaceLifeForm December 8, 2019 12:43 PM

@ MarkH

Well, there you go. My Quick and Dirty tests with old software were flawed.

BASIC on a PDP-11. Floating-point fail.

Modern software obviously works better.

But, I do not trust GMP. Have seen issues.

MarkH December 8, 2019 7:30 PM


Well, there’s a long history of seeking out formulas for primes. Both Fermat and Mersenne numbers were suspected (or hoped) to be always prime. In both cases the first few in the sequence really are prime, and after that the bigger numbers were very difficult to test in the days before computing machinery.

The proof attributed to (or at least written up by) Euclid of an infinitude of primes is constructive: multiply together all the primes you already know and add 1, and the prime factors of the result are new primes not on your list. (The product-plus-one is not itself always prime; for example 2·3·5·7·11·13 + 1 = 30031 = 59·509. So it reliably produces new primes, rather than being a formula whose output is always prime.)

Making large-integer experiments using floating point was always going to be very difficult … but 50 years ago, doing integer computations for very large numbers took a lot of programming effort, and would quickly run into resource limits.

It’s super convenient to use Python for numerical tests. My program to check the formula needed less than 45 Python statements.

I don’t have any direct experience with the GNU Multiple Precision (GMP) library. Roughly 15 or 20 years ago, I rolled my own big-integer library. Some time later, I saw that Python’s modular exponentiation was dramatically faster than my own. In those days at least, Python had its own code for big-integer math, and I learned a lot from studying it.
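The modular exponentiation MarkH mentions is Python’s built-in three-argument pow, which works natively on big integers. A toy comparison against a hand-rolled square-and-multiply loop (the helper name and the choice of modulus are mine):

```python
def slow_modexp(base, exp, mod):
    """Square-and-multiply: one Python-level iteration per exponent bit."""
    result = 1
    base %= mod
    while exp:
        if exp & 1:
            result = result * base % mod
        base = base * base % mod
        exp >>= 1
    return result

m = 2**127 - 1                      # a Mersenne prime as the modulus
# both agree, and the Fermat test 3^(m-1) = 1 (mod m) passes for a prime m
assert slow_modexp(3, m - 1, m) == pow(3, m - 1, m) == 1
```

The built-in wins on speed because the loop and the big-integer multiplication both happen in C rather than in interpreted Python.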

Chuck Pergiel December 16, 2019 5:09 AM

4000 core years? Since it didn’t take them 4,000 years, I assume they used multiple cores. How many? 4,000? 40,000? 400,000? I suspect that probably cost a couple of dollars. Who funded this exercise?

Electron 007 December 23, 2019 5:55 PM

So in this case

N =~ 1.24 x 10^239

p =~ 5.09 x 10^119

q =~ 2.45 x 10^119


Now I am interested in factoring

ϕ(N) = (p-1)(q-1)

Since (p-1)/4 and (q-1)/2 are relatively prime odd integers (better double-check!) we should have

ϕ(ϕ(N)) = ϕ((p-1)/4) · ϕ((q-1)/2) · ϕ(8).

This is then of interest because it is the size of the group of possible cryptographic keys under the RSA scheme. What is a minimal generating set for the multiplicative group modulo ϕ(N)?

If ϕ(N) had been easy to factor, then it seems the problem would have been easy to attack by discrete logarithm methods. Is there a good explanation or comparison of the current best well-known strategies out there?
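The multiplicativity being used here, ϕ(ab) = ϕ(a)ϕ(b) when gcd(a, b) = 1, can be checked on a toy example with small primes of the same shape (p ≡ 1 mod 4, q ≡ 3 mod 4); the values below are mine and unrelated to RSA-240:

```python
from math import gcd

def phi(n):
    """Euler's totient via naive trial factorization (toy sizes only)."""
    result, m, d = n, n, 2
    while d * d <= m:
        if m % d == 0:
            while m % d == 0:
                m //= d
            result -= result // d
        d += 1
    if m > 1:
        result -= result // m
    return result

p, q = 13, 23                      # p - 1 = 4 * 3, q - 1 = 2 * 11
a, b = (p - 1) // 4, (q - 1) // 2  # 3 and 11: odd and coprime
assert a % 2 == 1 and b % 2 == 1 and gcd(a, b) == 1

phi_N = (p - 1) * (q - 1)          # = 8 * a * b = 264
assert phi(phi_N) == phi(8) * phi(a) * phi(b)   # 80 = 4 * 2 * 10
```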
