Schneier on Security
A blog covering security and security technology.
October 14, 2009
The Current Status of P Versus NP
Posted on October 14, 2009 at 7:37 AM
"We could search all possible pairings but even for 40 students we would have more than 300 billion trillion possible pairings."
Could anyone explain this? Not 780?
@Rin: at that point I realized it's over my head, and read through the rest just because it's so interesting.
You start with 40 students and choose 2 of them. 40 choose 2 = 780. Then you choose 2 from 38, which is 703. So for the first two pairs you have 780*703 possible outcomes. Do that 18 more times.
something like 40! /(2^20 20!) I guess
Wouldn't it be 39! possible pairings? 39 possible partners for the first, * 38 possible pairings for the second, * 37 possible pairings for the third...
I'm not sure that's exactly right, but a number much, much larger than 780 definitely makes sense to me.
Disregard that then, noah is right.
you need to divide your 40 students in 2 groups of 20: 40!/(20!)^2 possibilities
you order your students 1, 2, ..., 20 in group A, and you pair each one with a student in group B: 20! possibilities
you need to divide by 2^20 because you don't care whether a student belongs to group A or B
Finally, you get 40! / (2^20 * 20!), which is about 3.2 * 10^23
780 (40 * 39 / 2) is the number of pairs, not the number of pairings, which is 39*37*35*...*7*5*3*1
For example, if we have 6 students (A, B, C, D, E, F), the pairings are
AB, CD, EF
AB, CE, DF
AB, CF, DE
AC, BD, EF
AC, BE, DF
AC, BF, DE
AD, BC, EF
AD, BE, CF
AD, BF, CE
AE, BC, DF
AE, BD, CF
AE, BF, CD
AF, BC, DE
AF, BD, CE
AF, BE, CD
and the pairs are
AB AC AD AE AF BC BD BE BF CD CE CF DE DF EF
Coincidentally, for n=6, the numbers are the same (15). A better example would have been a higher n, with e.g. 105 pairings and 28 pairs for n=8, but it starts to be too big for me to work out by hand...
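The enumeration above can be checked mechanically. Here is a hypothetical sketch (the function names `pairings` and `count_formula` are mine, not from the thread) that lists all pairings recursively and compares against the closed form discussed above:

```python
from math import factorial

def pairings(people):
    # Recursively enumerate all ways to split an even-sized list into
    # unordered pairs: fix the first person, choose their partner,
    # then pair up whoever remains.
    if not people:
        yield []
        return
    first, rest = people[0], people[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for sub in pairings(remaining):
            yield [(first, partner)] + sub

def count_formula(n):
    # The closed form from the thread: n! / (2^(n/2) * (n/2)!)
    return factorial(n) // (2 ** (n // 2) * factorial(n // 2))
```

For six students this yields exactly the 15 pairings listed above, the formula gives 105 for n=8, and for n=40 it gives roughly 3.2 * 10^23, matching the article's "300 billion trillion".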
I'm of an age where I was at school when NP first came up...
And I remember someone (not me) reworked the Shakespeare "piss take" of,
To pee or not to pee...
2 P or Not 2 P...
Such are one's school memories 8(
I can understand the confusion with the number of pairings...
Because he made the comment about "pairs of compatible" students...
If we assume that we in fact have two groups, one of 20 men and one of 20 women, and a pair has to be one from each group...
But you could then say that there could be say four groups of ten etc...
It is quite simple to calculate the possible pairings (if your calculator or head can stand it). Take any one student. They have a choice of 39 possible partners. Next, there will be 38 students left. Take one of those, who has a choice of 37 partners, and so on. So the number of possible pairings is:
39*37*35*...*5*3*1 = 40! / (2^20 * 20!), about 3.2 * 10^23
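As a quick check (a snippet of my own, not from the thread), this product of descending odd numbers, the double factorial 39!!, agrees with the 40! / (2^20 * 20!) formula given earlier:

```python
from math import factorial

def odd_product(n):
    # n * (n-2) * ... * 3 * 1 for odd n (the double factorial n!!)
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result
```

For instance, odd_product(5) gives the 15 pairings of six students, and odd_product(39) equals 40! / (2^20 * 20!) exactly.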
Does P=NP imply a general means to reduce algorithms of high polynomial complexity to a lower polynomial complexity?
If not, then the article makes a number of false claims to "sex up" the subject. Polynomial time does not mean fast, as anyone knows who has ever compared bubble sort with introsort on a decent-sized set of data.
All that stuff about how, if you prove P=NP, we'll live in utopia, is rubbish. Suppose you find an algorithm for an NP-complete problem which is Theta(N^googolplex). Then the fact that P=NP is useless.
Existing O(2^N) algorithms would be faster than the new polynomial-time algorithms for any N we'll encounter in our lifetimes. And unless you believe computing power increases exponentially, forever, in defiance of current theories of entropy and information, then it will always be useless.
Yes it does to your first question. More or less.
Your arguments are correct, in the same way that a runtime of Exp[N] can be better than googolplex * N^2. However, it turns out that for the algorithms we know, it IS useful to distinguish between polynomial and exponential, because the former usually is much faster than the latter.
Furthermore, even if we were to prove P=NP, that doesn't mean we know HOW to reduce exponential to polynomial - only that it exists.
@SteveJ: "Polynomial time does not mean fast, as anyone knows who has ever compared bubble sort with introsort on a decent-sized set of data."
However, for a decent-sized set of data, polynomial will mean "faster" than non-polynomial. The statement is purely relative.
For example, if you use "BruteSort" as a comparison, "BubbleSort" will seem blazingly fast.
(BruteSort is something which I just made up. Roughly speaking, it would enumerate all possible permutations of the dataset and check whether the permuted set is sorted.)
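A minimal sketch of the hypothetical BruteSort (the name and idea are the commenter's invention, not a real algorithm):

```python
from itertools import permutations

def brute_sort(data):
    # Try every permutation of the input until one happens to be in
    # order: O(n! * n) comparisons. Hopeless beyond a dozen elements,
    # which is what makes O(n^2) BubbleSort look blazingly fast.
    for perm in permutations(data):
        if all(perm[i] <= perm[i + 1] for i in range(len(perm) - 1)):
            return list(perm)
```

Even for n = 15 this would need over a trillion permutations in the worst case.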
That's the Jargon File btw...
@Frank: "Furthermore, even if we were to prove P=NP, that doesn't mean we know HOW to reduce exponential to polynomial - only that it exists."
In this case actually existence leads directly to construction - if we know that a polynomial-time algorithm exists, then we can provide one.
The proof is fairly simple. Take every single program that exists (for some Turing machine), and number them 1, 2, ...
Execute the first step of program 1 on the specified input, followed by the first step of program 2 (on a separate copy of the input for each program), followed by step 2 of program 1, followed by step 1 of program 3, followed by step 2 of program 2, and so on in a "triangle".
If any program halts, then check (in polynomial time) whether its output is a solution to the NP-complete problem in question.
Then if any polynomial-time algorithm exists to solve the problem, the procedure described above is a polynomial-time algorithm.
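The dovetailing idea can be sketched in a few lines. In this toy version (my own illustration, not a real universal search), Python generators stand in for the enumerated programs, and `check` is the assumed polynomial-time verifier:

```python
def dovetail(make_programs, check, max_rounds=1000):
    # Triangular interleaving: in round k, admit program k and then
    # run one step of every program admitted so far. When a program
    # halts, verify its output with the polynomial-time checker.
    active = {}
    for k in range(1, max_rounds + 1):
        if k <= len(make_programs):
            active[k] = make_programs[k - 1]()
        for i in list(active):
            try:
                next(active[i])
            except StopIteration as halt:
                if check(halt.value):
                    return i, halt.value
                del active[i]
    return None

# Toy use: each "program" guesses one candidate factor of 15.
def guesser(g):
    def program():
        yield            # one simulated step of computation
        return g         # halt with output g
    return program
```

Calling `dovetail([guesser(g) for g in range(2, 15)], lambda v: 15 % v == 0)` returns the first program whose guess divides 15. The real argument replaces the guessers with an enumeration of all Turing machines.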
@Paeniteo: "for a decent-sized set of data, polynomial will mean "faster" than non-polynomial."
For a *sufficiently large* input, polynomial is faster than non-polynomial. The reason I mentioned Theta(N^googolplex) complexity specifically, is because if we were to compare a polynomial algorithm taking exactly N^googolplex steps, with an exponential algorithm taking exactly 2^N steps, then N has to be considerably greater than a googolplex before 2^N > N^googolplex. So the exponential algorithm is certainly faster for input data up to a googolplex bits in size.
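To make "sufficiently large" concrete, here is a small calculation of my own finding the crossover point where 2^N overtakes N^k for modest exponents k; for k = googolplex that crossover sits astronomically far out:

```python
def crossover(k):
    # Smallest N with 2**N > N**k. Python's arbitrary-precision
    # integers make the exact comparison easy for small k.
    n = 2
    while 2 ** n <= n ** k:
        n += 1
    return n
```

For example, crossover(3) is 10 and crossover(10) is already 59; the crossover N satisfies roughly N = k * log2(N), so it grows with k.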
Your triangularization is cute, but not very practical, because the constant factor ignored by the O() complexity notation will depend exponentially on the _size_ of the polynomial-time Turing machine you search for using the triangularization. (Assuming that the enumeration procedure for the Turing machines is enumerating them based on size, smaller machines first).
Or am I missing something?
Could someone explain to me why the author of the article states that public-key cryptography would be impossible if P=NP? Especially given that he explains in a different place in the article that the hard problems used in current systems (factoring, discrete log) aren't even expected to be NP-complete?
From the article's examples of NP problems: "Finding a DNA sequence that best fits a collection of fragments of the sequence (see Gusfield20)." Doesn't this imply the brute-force conception of evolution is not mathematically sound?
@Gordon: GA's (Genetic Algorithms, aka Evolutionary Programming) get stuck in local minima all the time. They are an efficient way of searching a large space. They are not guaranteed to converge to the optimal solution. (In nature, good enough is, well, good enough.)
Both the Moose and the Hippo occupy the same ecological niche. They eat similar foods. They don't exactly look alike though. Even accounting for the climate differences.
@Frank: If we had a polynomial-time solution to one NP-complete problem, we could use it to design a polynomial-time solution to any problem in NP. You prove a problem X is NP-complete by proving that if you have a polynomial-time algorithm to solve X you therefore have one to solve Y, where Y is NP-complete.
@RonK: If, as most people believe, P is a proper subset of NP, then there are classes of problems between P and NP-complete with intermediate complexity. Therefore, if P != NP, a given problem doesn't have to be either in P or NP-complete, and there's at least some reason to believe that factorization doesn't have a polynomial solution.
On the other hand, if P=NP, then all problems in NP have polynomial-complexity solutions, and that does include factoring and discrete logs.
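Factoring sits in NP precisely because a claimed answer is cheap to verify. A small illustration of my own (the function name is hypothetical): checking a candidate factor takes time polynomial in the number of digits, even though finding one may not:

```python
def verifies_factor(n, p):
    # A single trial division: one modular reduction, polynomial in
    # the bit-length of n. Finding such a p is the (believed) hard
    # direction that factoring-based cryptography relies on.
    return 1 < p < n and n % p == 0
```

So if P=NP, the easy verification step would imply an efficient search for the factors themselves.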
@D: Given the complexity of DNA, "good enough" is still a very small subset of the possibilities, so evolution would still seem to be a complex NP problem, as the article suggests. A single molecule out of place can cause severe or fatal damage (e.g. sickle cell anemia). It is not clear why an NP problem becomes solvable by brute force when there are more than one, but still relatively few, good solutions. Even on a billion-year time frame with a billion trials per year, you would still need more than a thousand such iterations to solve the relatively simple 40-student problem mentioned at the beginning of the article. I guess I am trying to figure out some way to fit the math of evolution into this framework that makes sense. Right now I don't see it (I realize that may be of interest to no one but me).
OK, one more post then I'll leave it alone. Conjecture: DNA uses some sort of an algorithm to keep mutations in the subset that yields a more or less viable life form, making the genetic underpinnings of natural selection mathematically possible. One could then argue that since evolution happens, P=NP.
Wow, the utopia that they describe sounds just like campaign promises during the election year.
If you elect me... I will prove that P=NP. The world will be better and brighter. Now is the time for change. The other candidates say we can't do it... to that I say "Yes we can."
Ah, thanks, I had forgotten that checking whether a factorization was correct was poly-time. It still strikes me as a strange assumption that public key encryption _requires_ a hard problem which is not in P. I vaguely remember a paper which proposed a (not very practical) poly-time public-key cryptosystem where there was a provable gap in the order of the decryption complexity between an attacker without the private key (having O(n^2) complexity?) and with the private key (having O(n) complexity?). Unfortunately, I don't manage to find this paper anywhere.
It seems to me possible to have a relatively workable public-key cryptosystem based on the difficulty of a problem with complexity O(n^20) vs. O(n^2), for example, which could provide *a lot* of the functionality which we have today. There would be a lot more headache to constantly roll-over keys, update the security parameters, and have to deal with very large keys and messages, but it would still be possible.
Gordon: Nope -- evolution has no problem with throwing away non-viable eggs/spores/embryos. Even for humans, a significant fraction of conceptions are non-viable -- typically, the woman never knows she was pregnant. For lower orders... well, when you're producing 1000 eggs in a season, even, say, 25% failure up front still leaves 750 chances to evade predators and so on.
Public key crypto doesn't require P != NP or anything like that - I'm not sure where that myth comes from.
All it requires is that decryption without the private key take so much longer as to make it impractical, while encryption is still fast.
I largely agree with your comment (see my comment above) but there _are_ certain properties of public key crypto today which we would probably lose, like good forward secrecy for one-sided messages (i.e., messages generated without a key exchange dialog). This is because computing power is still increasing exponentially with time, so limiting the difficulty of decryption to poly-time is a problem.
@David-Harmon: A single human chromosome is 22 million base pairs long. A vanishingly small set of potential orders of that DNA is viable. If every mammal on earth had a thousand trials per year, it would still take something on the order of 2^1,000,000 years to sort through finding a single viable life form by brute force (depending on your assumptions). That nature has no problem discarding non-viable life forms does not mean it can brute-force its way through these enormous numbers. It must limit itself in some way to a potentially viable subset.
Despite doing a computer science degree (it was 20 years ago, so a lot of it is rather hazy now), a lot of that article was gibberish to me. Can Bruce or anyone recommend some good background reading to help me make better sense of it?
I'm sure it would help me to be a better programmer as well. Since most of the modern programmer's time is spent understanding and manipulating provided APIs rather than developing new algorithms, the mathematical side of my programming is rather weak.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.