Saturday, April 02, 2016

Don Knuth's argument supporting his belief that \(P=NP\)

Almost a year ago, Scott Aaronson wrote an essay about mysteries in mathematics for Now, a very similar essay by the same author was published there:
The great mystery of mathematics is its lack of mystery (by Aaronson)
And Luke and Mel have pointed out that without compensating me in any way at all, Aaronson has sextupled the attractiveness of his essay for the readers by having used the name "Motl" five times.
Two years ago, the Czech string theorist and notorious conservative blogger Luboš Motl accused theoretical computer scientists such as me of believing the ‘\(P\neq NP\)’ conjecture – a central unproved hypothesis about the limits of efficient computation – as a matter of groupthink and ideology, of having no rational grounds for our prejudice.

By itself, that accusation isn’t so remarkable; Motl certainly isn’t alone in his opinion. But he went further. Although he conceded that, in continuous math close to physics, there can be reasons why statements are true, Motl claimed that as you get further away from physics, math becomes just a disorganised mess of propositions.

...But if a statement hasn’t yet been proved or disproved then, in Motl’s view, there’s no way even to guess, better than chance, which way it will turn out...

...A priori, math could have been like Motl said it was, ...
He hasn't conveyed my views and arguments about \(P=NP\) accurately but given the brevity of the sketch, it's probably good enough.

Note that Aaronson also defined me as a "notorious conservative blogger". Well, I am used to almost identical labels from the communist totalitarianism. Nasty left-wing ideologues are basically the same across the ages. They are always eager to place politics above everything else and demonize everyone who realizes that their political views are just stinky piles of feces. I fought against immoral leftists of Aaronson's type for many years and you may call me a national hero for those reasons.

At any rate, the arguments that \(P=NP\) is in no way "almost proven" don't depend on someone's being conservative in any way. In fact, Mel sent me an answer about \(P=NP\) by a famous computer scientist that I immediately decided to be the most insightful and detailed argument about the validity of \(P=NP\), in one way or another, that I have ever seen.

The person who, like Mel, believes that \(P=NP\) is probably true is Donald Knuth, the father of \(\rm\TeX\) and METAFONT. He's the #1 man responsible for the fact that mathematical symbols and equations look great in papers about mathematics, physics, and computer science – and (thanks to the MathJax realization) on this blog, too.

In May 2014, people from his field asked
Twenty Questions for Donald Knuth
An e-book called TAOCP (The Art of Computer Programming) was just published and many people asked various technical questions related to e-books. But the seventeenth question was about \(P=NP\):
17. Andrew Binstock, Dr. Dobb's: At the ACM Turing Centennial in 2012, you stated that you were becoming convinced that \(P = NP\). Would you be kind enough to explain your current thinking on this question, how you came to it, and whether this growing conviction came as a surprise to you?
I hope it's OK when the full reply by Knuth is copied here:
Don Knuth: As you say, I've come to believe that \(P = NP\), namely that there does exist an integer \(M\) and an algorithm \({\mathcal A}\) that will solve every \(n\)-bit problem belonging to the class \(N P\) in \(n^M\) elementary steps.

Some of my reasoning is admittedly naïve: It's hard to believe that \(P \neq NP\) and that so many brilliant people have failed to discover why. On the other hand if you imagine a number \(M\) that's finite but incredibly large—like say the number \(10\uparrow\uparrow\uparrow\uparrow 3\) discussed in my paper on "coping with finiteness"—then there's a humongous number of possible algorithms that do \(n^M\) bitwise or addition or shift operations on \(n\) given bits, and it's really hard to believe that all of those algorithms fail.

My main point, however, is that I don't believe that the equality \(P = NP\) will turn out to be helpful even if it is proved, because such a proof will almost surely be nonconstructive. Although I think \(M\) probably exists, I also think human beings will never know such a value. I even suspect that nobody will even know an upper bound on \(M\).

Mathematics is full of examples where something is proved to exist, yet the proof tells us nothing about how to find it. Knowledge of the mere existence of an algorithm is completely different from the knowledge of an actual algorithm.

For example, RSA cryptography relies on the fact that one party knows the factors of a number, but the other party knows only that factors exist. Another example is that the game of \(N \times N\) Hex has a winning strategy for the first player, for all \(N\). John Nash found a beautiful and extremely simple proof of this theorem in 1952. But Wikipedia tells me that such a strategy is still unknown when \(N = 9\), despite many attempts. I can't believe anyone will ever know it when \(N\) is \(100\).

More to the point, Robertson and Seymour have proved a famous theorem in graph theory: Any class \(c\) of graphs that is closed under taking minors has a finite number of minor-minimal graphs. (A minor of a graph is any graph obtainable by deleting vertices, deleting edges, or shrinking edges to a point. A minor-minimal graph \(H\) for \(c\) is a graph whose smaller minors all belong to \(c\) although \(H\) itself doesn't.) Therefore there exists a polynomial-time algorithm to decide whether or not a given graph belongs to \(c\): The algorithm checks that \(G\) doesn't contain any of \(c\)'s minor-minimal graphs as a minor.

But we don't know what that algorithm is, except for a few special classes \(c\), because the set of minor-minimal graphs is often unknown. The algorithm exists, but it's not known to be discoverable in finite time.

This consequence of Robertson and Seymour's theorem definitely surprised me, when I learned about it while reading a paper by Lovász. And it tipped the balance, in my mind, toward the hypothesis that \(P = N P\).

The moral is that people should distinguish between known (or knowable) polynomial-time algorithms and arbitrary polynomial-time algorithms. People might never be able to implement a polynomial-time-worst-case algorithm for satisfiability, even though \(P\) happens to equal \(N P\).
I've written pretty much all these arguments before but it sounds more exciting when a famous computer scientist does the same thing, right?

Just to be sure, Knuth's story isn't a complete proof and I am still 50-50 uncertain about the truth value of \(P=NP\). But I think that Knuth's monologue is some detailed story and the believers that \(P\neq NP\) actually don't have any comparably strong detailed story about their belief. As far as I can say, all of their "arguments" are circular reasoning, realizations of the idea that "many times repeated proposition becomes true". To solve an \(NP\) problem must be hugely, and therefore non-polynomially, more difficult than to verify a solution, and therefore \(P\neq NP\), and therefore it's hugely more difficult to solve it, and so on. There is clearly no argument except for their desire to mindlessly believe in the answer they have predecided to be "true".

But as I said, and Knuth clearly says exactly the same thing, \(P=NP\) is really a messy technical statement whose proof wouldn't give the humans any new supernatural powers. In other words, it's silly to pretend that it is a holy grail that will change the world (more than \(\rm \TeX\) did). It probably won't. First, \(P=NP\) says that the difference in the number of steps between "a solution" and "a verification" is "just" power-law. But power-law functions may still reach huge values. The separation to polynomial and "higher than polynomial" functions is more or less an arbitrary technicality without obvious deep theoretical implications. You may see that Knuth agrees with this point. It is probably what drives the \(P\neq NP\) cultists such as Aaronson up the wall – because they spent long years with this polynomial vs. nonpolynomial separation and that's why they want to believe that there's something absolutely fundamental about it – although they don't have real evidence for this claimed importance.

So, as Knuth said and as I and others wrote it before him, the exponent \(M\) that separates the "solution" and "verification" algorithms for the \(P=NP\) problems may be a very large, or insanely large, number. And \(n\) to the googolplexth power isn't too much smaller than \(\exp(n)\) – for most numbers \(n\) you care about, the power is actually larger than the exponential. So just the fact that a function is "polynomial" doesn't mean that "it may be considered small in practice".

But even if the exponent \(M\) weren't insanely large, there is still one huge problem that would basically neutralize the practical implications of the proof that \(P=NP\): the first proof of \(P=NP\) would almost certainly be non-constructive, i.e. existential, in character. An algorithm may exist that extends a "verification algorithm" to a full "solution algorithm", just by adding this (possibly huge) polynomial "overhead" to the time that it takes to run the algorithm. But if we prove that such an algorithm exists, it doesn't mean that we actually know what the algorithm is – let alone that we are ready to use it.

The \(P\neq NP\) cultists dogmatically present "a polynomial algorithm to solve such problems" to be "too ingenious a thing", and this amount of ingenuity just isn't allowed (to humans or to anyone else), they think. The non-existence of such "fast" (which don't have to be "fast" in any practical sense, as I mentioned) algorithms is what "protects the order of the world against the totally systemic hackers", so to say. But this isn't needed. Even if some protections exist at all, and it's not actually proven, there may be protections at many different levels. The "fast" algorithms to solve the \(P=NP\) problems may exist if \(P=NP\) but they may be insanely difficult to be found. They may also be long. And their logic or "philosophy" doesn't have to be comprehensible to humans because you know, the set of "mathematically possible syntactically correct algorithms" – even its subset of those that "also happen to be very useful for something" – is almost certainly much larger than the "set of algorithms that men constructed with a goal in mind, while they knew what they were doing".

Now, let's look why Knuth actually believes that \(P=NP\), arguments that go beyond the qualitative excuses above – comments that show that both pictures are plausible. His reason is the observation that the total number of algorithms to manipulate with \(n\) bits is immensely large. You might agree that it's "hard" to quickly enough connect \(n\) cities by the shortest trajectory. So there's a "seemingly powerful entity that prevents everyone from solving the problem efficiently". The "apparent power" boils down to the people's inability to find a specific polynomial algorithm (or an existential proof that exists) so far.

But against this power, or divine existence, if you wish, there is another titan: all algorithms that in principle exist. When we ask whether \(P=NP\) is true, the "Yes" side contains all algorithms, known and unknown ones, and these are allowed to team up by picking their best warrior. This is a huge military force as well (yes, the war takes place in the heaven) – and, as Knuth believes, it's a superior force ready to win the war. You know, the power of this "Yes" side of the war about \(P=NP\) boils down to the fact that people haven't classified or eliminated (as impotent to do the \(NP\) task) all mathematically possible algorithms. It sounds reasonable that the task to "take control over all mathematically possible algorithms and be able to show that all of them are impotent" is arguably an even grander task than to connect \(n\) U.S. cities by a line, isn't it?

Folks like Aaronson have brainwashed themselves so much by the prayer that "it's so unbelievably difficult for a traveling salesman to visit all \(n\) cities in a list and save fuel" that they don't even see that while defending the dogmatic belief in \(P\neq NP\), they are basically saying that "it's not difficult to have an idea what all mathematically possible algorithms in the Platonic universe do". If that's the case, surely a traveling salesman could quit his job and be hired a "master of all possible ideas and algorithms" tomorrow. Well, I, for one, find it much more sensible for Aaronson to quit his job and clean the toilets tomorrow.

If you could actually run the hypothetical algorithm, it would apply \(n^M\) steps on \(n\) bits. If you only studied what happens with the computer's RAM – its history – the number of similar histories could be something like\[

\Large n^{n^M}

\] or much more than that because the programs are allowed to use much more RAM. That's a lot of possible histories that have a chance to find the best solution – and a proof that it's the right one. Even for relatively modest values of \(n\) and \(M\), the "double exponential" above is a pretty huge number.

The hypothetical algorithm to solve the \(NP\) problem(s) has to be a fixed algorithm. But it may be a very long algorithm whose length may exceed the number of histories – the double exponential above – for reasonable values of \(n,M\), like \(248\). Such a long algorithm may arguably deal with "all kinds of exceptions" that may occur for limited values of \(n\) – and the rest of the work may be solved by straightforward brute force that takes a polynomial time.

Let me make a specific point about what a \(P=NP\) performing algorithm is allowed to do. A calculation is always composed of many pieces that "do something with 1 GB of RAM". A \(P=NP\) algorithm is allowed to consider a table of all possible "initial states of 1 GB of RAM" with the assigned "final state of this 1 GB of RAM", skip the calculation, and jump over googolplexes of steps by using this "index". If the \(P=NP\) algorithm uses this trick, the \(P=NP\) claim basically says that a finite index of this sort is sufficient to speedup the algorithms to "in principle polynomial ones". Is it possible or not? I think it's obvious that in the absence of an actual proof (or a sketch that is likely to be completed soon), no one has any "real" evidence in one way or another. It's all about prejudices.

I think one could say that in this interpretation, \(P=NP\) is analogous to the claim of "renormalizability" (in the sense of quantum field theory) of the "Feynman diagrams" for \(NP\) problems: there are just "finitely many types of complications" that are responsible for making the problem seemingly "polynomially unsolvable", and these finitely many exceptions may be replaced by some index, accelerated, and the conversion to the polynomial algorithm is completed. Is the set of complications that make it hard to solve \(NP\) problems "renormalizable"? Nobody knows, especially because the right exact question hasn't even been well-defined so far. ;-)

But the algorithms may do far more creative and "recursive" things than what any particular simple story suggests. Even rather short algorithms may display some "recursive cleverness". And even algorithms on the computers today mimic human brain and all kinds of clever methods that humans use while thinking about and recognizing things.

Now, combine this freedom to build algorithms with the unlimited allowed size of the code. (It still has to be finite, i.e. independent of \(n\), but the size may be really huge.) Even a computer program that fits to 4 GB of the RAM of your laptop can do many things you can't imagine. The number of possible programs in 4 GB is something like\[

\Large 2^{3\times 2^9}.

\] These programs may do many things you haven't seen in your life and that you haven't even imagined. They may cleverly rewrite and improve themselves in many different ways. The algorithm may be a clever meta-program that writes a different program designed according to the data, and that different program writes yet another program, and perhaps several times, and the final program only starts to really do something with the data. In the absence of any proof, does someone really claim to have any sensible evidence that no clever meta-meta-meta-program of this kind is capable of doing the task?

And in reality, programs may be much longer than 4 GB (unlimited) and they're also allowed to operate with an unlimited RAM for all the intermediate calculations (although the amount of RAM will basically be forbidden from growing with \(n\) too quickly because you need many steps to deal with too huge RAM).

And the task for a program establishing \(P=NP\) is just (Knuth would say "just") to find a sequence of steps that connects \(n\) cities by the shortest path plus a proof that it's the best way in less than\[

\Large 10^{100}\times n^{10^{100}}

\] seconds. For practical purposes, it's like saying that the time of the calculation has to be finite. It's almost no condition at all. The polynomial condition only says that "in principle", for some very high values of \(n\) that may be insanely high indeed, the algorithm becomes faster than the straightforward brute force algorithms that take the non-polynomial (i.e. exponential or factorial) time. And the number of candidate algorithms is basically infinite. Does it make sense that all "clever ideas" in algorithms may be classified, similarly to the finite groups, so that "all clever ideas in algorithms may be classified and taken under control and programs are always some combinations of theirs", or is the "amount of infinite sequences and kinds of cleverness" infinite? The word "cleverness" isn't even well-defined and even if it could be, we're clearly nowhere near classifying "all possible algorithms", a problem that is quite certainly harder than the problem to classify all finite groups (because the algorithms do include the algorithms to deal with the lists of finite groups).

If there's no proof that such an algorithm doesn't exist, it's sensible to believe that such an algorithm does exist. It exists in principle which is extremely different from its existing in an actual book written using \(\rm\TeX\). There are just so many candidates for such an algorithm – why should all of them fail, Knuth asks?

Again, for me, the question about \(P=NP\) is similar to an indeterminate form \(\infty-\infty\). "A bunch of all clever ideas that are possible in mathematics" is standing against a "problem that looks almost infinitely difficult to many people". The result may be positive infinite, positive finite, zero, negative finite, negative infinite and we just don't know. But I am confident that it's right to say that the \(P\neq NP\) cultists basically deny that the possible "warriors" supporting \(P=NP\) are basically infinitely powerful as well – and the mechanisms making them systematically overlook all the considerations such as those by Knuth are almost entirely about group think and not legitimate rational evidence.

To summarize, I still believe that \(P=NP\) or \(P\neq NP\) may reasonably go in both ways, there exist consistent stories in both directions, and it's plausible that the proof of \(P=NP\) will have the form making the validity of \(P=NP\) "totally obvious". At the same moment, even such a proof is very likely to have no immediate implications because the difference between "polynomial" and "non-polynomial" is largely artificial and, in general, it's different from "fast in practice" and "slow in practice". And the existence of the algorithm won't tell us what the algorithm looks like. And even if we could find an algorithm that writes the code for "fast" solution of \(NP\) problems, the resulting code could be too huge or otherwise impractical.

Knuth's story why he thinks that \(P=NP\) seems overwhelmingly likely is sort of detailed because he studies quantities that the \(P\neq NP\) believers haven't even dared to consider – like the number of mathematically possible algorithms and their combined power. Knuth looks at the "space of possible problems as well as algorithms" from above, a viewpoint that the \(P\neq NP\) cult basically forbids you to do because you're supposed to obediently sit in the mud and worship the "overwhelming" difficulty of the harder \(NP\) problems – you are obliged to pray to the divine traveling salesman who fly above your head in the aircraft, gadgets you have no chance to ever understand. ;-)

On the other hand, Knuth's argument isn't a rigorous proof, either, and it's plausible that \(P\neq NP\) and even that a proof of \(P\neq NP\) is found – while it may avoid the discussion of the concepts considered by Knuth, like the number of complicated algorithms. You know, even though the set of possible algorithms is infinite, one may show that none of them may find nontrivial factors of a number such as \(2^{74,207,281}-1\) (the largest known prime at this moment). Even the combined power of infinitely many algorithms is insufficient simply because we know that this number is prime so no answer can exist. If such an answer can't exist, there can exist no algorithm that finds this answer, either. ;-) In mathematics, one small David (well, this one required some weeks of CPU time) can often beat an infinite army of Goliaths. This is simply how mathematics works.

The group think displayed by folks like Aaronson is the most worrisome observation in this story. In some corners, people are literally rated by their desire to parrot some beliefs – although the beliefs aren't supported by quantitative evidence or careful analyses of the situation and although the upvoted parrots often make no intellectual contributions of their own at all. At the beginning, I mentioned that Aaronson is a classic collaborator of KGB who wants to harm the name and lives of those who are inconvenient for their pet left-wing ideology. But sadly, I think that people like Aaronson have the same instinct when it comes to scholarly issues, too.

They want to attach cheap labels to everyone whose conclusions they don't like. And they also want to talk in terms of "consensus". Well, I don't believe that consensus matters in science. And I don't even believe that there is a majority (at least not a clear majority) of the computer science folks who are convinced that \(P\neq NP\) is nearly certain. This talk about "consensus" is pure propaganda meant to influence gullible readers, just like Aaronson's remarks about notorious conservatives.

Intelligent readers should dismiss or ignore such things and if they're MIT or UT Austin students, they should spit into Aaronson's face.

No comments:

Post a Comment