Tuesday, May 07, 2013

Short questions often require long answers and proofs

Several debaters as well as complexity theorist Boaz Barak religiously worship their belief that it must be that \(P\neq NP\) and that the question whether the proposition holds is extremely deep because \(P=NP\) would revolutionize the whole world.

Most of their would-be arguments are examples irrational hype, fabricated justifications of the limited progress in a field, and group think. I will primarily focus on a single major wrong thesis they promote, namely the idea that a mechanical or polynomially fast or efficient algorithm to solve a problem specified by a short prescription must be short, too.

So let me begin.

\(P\) and \(NP\)

\(NP\) is the class of problems whose solution, if you're told one, may be verified to be correct by \(O(C\cdot N^\ell)\) operations or fewer (the number of operations is effectively the time!) for some \(C,\ell\) if \(N\), an integer specifying the number of objects in the problem or its size, is asymptotically large.

\(P\) is the class of problems whose solution may be found, and not just verified, after \(O(C\cdot N^\ell)\) operations. Verifying a solution can't ever be harder than finding one so we see that \(P\subseteq NP\) but we don't know whether \(P=NP\) or \(P\) is a proper subset.

There are lots of examples of \(NP\) problems that don't seem to be in \(P\) – but, with some surprise, they could be. The provably "most difficult problems in \(NP\)", the so-called \(NP\)-complete problems, are those whose fast solution could be translated to a fast solution to any \(NP\) problem. So \(NP\)-complete problems are on the opposite (hard) side of the \(NP\) set than the \(P\) problems if this "polarization" can be done at all.

Examples of \(NP\)-complete problems include the the clique problem i.e. the decision whether a combinatorial graph has a complete graph with one-half of the nodes that are completely connected with each other; the travelling salesman problem (find the shortest closed route that visits all nodes/cities in a graph), and so on.

I think that a physicist would find the computation of the permanent (the determinant-like sum of products over permutations in a matrix but without the minus sign) to be a more natural representative of the not-in-\(P\), difficult, class because it's more "explicit" – it's given by a compact formula. However, it's not in \(NP\) – there's no way to quickly verify the resulting value of the permanent, either: an example of the fact that \(NP\) is a constraining, narrow-minded class. On the other hand, it's known how a fast calculation of the permanent may be used to solve \(NP\)-problems; it's known that the permanent is \(NP\)-hard (which is a larger class than \(NP\)).

Computer scientists' most favorite difficult \(NP\)-complete problem is probably 3SAT or SAT – to decide whether a proposition constructed from logical operations AND, OR, and binary variables \(x_i\) (obeying a certain constrained format, if you add the digit "3" or another one at the beginning) is a tautology (a proposition equal to TRUE regardless of the values of variables \(x_i\)) but faster than going through all the \(2^N\) possible choices of the truth values of the variables. This very preference favoring 3SAT already shows some "unnatural bias" – they're focusing on things that are easier to write on the existing computers (including the preference for binary computers instead of e.g. ternary ones that we could be using equally well) which is not really a good, invariant criterion for the mathematical depth.

A reason why all these – effectively equivalent (equally hard, convertible to each other with minimal losses) – problems probably don't have an exponentially fast resolution (why they're not in \(P\)) is that at least for some of them, you would expect that if a solution exists, it would already have been found. So you use the same "mathematical induction" – if New York skyscrapers haven't been demolished by an airplane in 100,000 previous days, chances are that it won't happen tomorrow, either; instead, there may be a law that this will never happen and this conjecture has passed some somewhat nontrivial test. Such an argument is risky but it has at least some logic. Or at least, people would say it had before 9/11.

However, many complexity theorists prefer completely different arguments – or modifications of the valid yet inconclusive argument above – which aren't rational.

The key problem why \(P=NP\) wouldn't be far-reaching even if it were true is that the algorithm to quickly solve 3SAT could be "fast enough" so that it obeys the polynomial criterion but it could still be "slow enough" to make it completely impractical. The number of operations could go like\[

10^{100}\times N\quad {\rm or}\quad N^{100}

\] or some other function. The first Ansatz above is linear – very slowly growing with \(N\) – but it has a large coefficient. The second Ansatz has a modest coefficient, namely one, but the exponent is large. Already for \(N=2\) or \(N=10\), respectively, the two expressions above give you at least a googol and no computer will ever do this many operations.

One could conjecture that mathematics isn't this malicious and both the exponent and the prefactor are likely to be of order one for all problems; a physicist would call this argument "an argument from naturalness". However, the experience with many highly natural problems in mathematics suggests that this application of "naturalness" fails much more often than many people might expect.

Let me emphasize that even if \(P=NP\) and a fast algorithm to solve the 3SAT problem or even to compute the permanent only takes \(N^4\) operations, a natural polynomial scaling, there is no contradiction. The discovery of this fast algorithm would make a significant number of calculations of the "clique" or "traveling salesman" type vastly more efficient but the rest of the world would stay the same. You couldn't automatize the search for proofs or deep discoveries in maths or physics, music, and so on because these activities haven't been reduced to 3SAT and not even to the harder permanent without "big losses".

But in the rest of the text, I want to provide you with some evidence and context showing that \(P=NP\) is still very likely to produce impractically slow algorithms in practice.

Forums: short questions, long answers

If you were ever answering questions on forums such as Physics Stack Exchange, you must know that a very short question often requires an extremely long answer. Well, questions such as "explain string theory to me" may require hundreds or thousands of pages and people usually don't try. But even if the question is realistic, a long answer is simply needed.

But we may talk about a more rigorous incarnation of the same question: the length of proofs needed to prove a theorem. Half a year ago, John Baez wrote about insanely long proofs and about the way how some of them may dramatically shrink if we're allowed to assume the consistency of an axiomatic system etc.

However, I don't want to play these Gödelian games because in most cases, you end up talking about propositions you wouldn't be interested in to start with – propositions that are not always human-readable and that are only interesting because of the length of the required theorems. On the other hand, even if we talk about very practical, important math problems, we often end up with the need to write down very long proofs and very long books on the classification of things.

Before I will mention a few examples, it is important to mention that they're examples of proofs and classifications that we have already found. There may exist many interesting theorems with even longer proofs, perhaps much longer proofs, and we haven't found them yet – partly because it becomes harder and harder to search for longer and longer proofs. So it's totally plausible that extremely long proofs of interesting theorems are almost omnipresent in maths. In fact, it seems sensible to assume that the average length of the proof that the mathematicians and others are finding or mastering is an increasing function of time.

I am talking about proofs but the same comment may apply to speedy algorithms to solve certain problems. These algorithms may still be extremely complicated and the fact that we haven't found them yet doesn't really prove that they don't exist. This caution is promoted in the Wikipedia article on \(P\) vs \(NP\). While it offers us the would-be impressive slogan by Scott Aaronson,
If \(P = NP\), then the world would be a profoundly different place than we usually assume it to be. There would be no special value in "creative leaps," no fundamental gap between solving a problem and recognizing the solution once it's found. Everyone who could appreciate a symphony would be Mozart; everyone who could follow a step-by-step argument would be Gauss...
(I've already explained that this fairy-tale suffers from many serious illnesses: one of them is that it completely overlooks the fact that the main ingenious thing about these famous men is that they can find – and not just follow – some de facto step-by-step procedures to achieve something) it also quotes two researchers as examples of those who believe that it's wrong to be biased and the experts in that field should comparably eagerly study the possibility that \(P=NP\) and try to search for proofs of that:
The main argument in favor of \(P \neq NP\) is the total lack of fundamental progress in the area of exhaustive search. This is, in my opinion, a very weak argument. The space of algorithms is very large and we are only at the beginning of its exploration. [...] The resolution of Fermat's Last Theorem also shows that very simple questions may be settled only by very deep theories.
—Moshe Y. Vardi, Rice University

Being attached to a speculation is not a good guide to research planning. One should always try both directions of every problem. Prejudice has caused famous mathematicians to fail to solve famous problems whose solution was opposite to their expectations, even though they had developed all the methods required.
—Anil Nerode, Cornell University
Very wise. Aaronson really represents the unlimited prejudices, retroactive rationalization and hyping of these prejudices, and even bullying those who are showing that the opposite possibility is also compatible with everything we know and who are working to add some progress in that direction. Aaronson's approach to \(P=NP\) is a similar skewed approach to science as the climate alarmism or Lysenkoism – after all, Aaronson has openly come out of the closet as a climate alarmist himself, too.

I need to emphasize that in physics, we also claim that various marvelous hypothetical engines – like the perpetual-motion machines – don't exist. But in physics, we actually have a much stronger case for such no-go theorems. We can really prove the energy conservation from the time-translational symmetry of Nature, via Noether's theorem. We may prove the second law of thermodynamics in the form of the H-theorem and its variations. These results only depend on assumptions that have been heavily tested in millions of experiments, assumptions that we really consider the most rock-solid foundations of all the natural science. Aaronson's (and not only his, of course) prejudices are rooted in no comparably solid foundations.

Fine, now the examples of the long proofs.

Fermat's Last Theorem

I have discussed some history of the proof last month. The statement of the theorem is extremely easy to formulate. Almost all the well-known mathematicians of the last centuries spent hundreds of hours per capita in attempts to prove the conjecture. At most, they succeeded for some individual exponents \(n\).

Suddenly, in the 1990s, Andrew Wiles succeeded. Many people had already switched to the lazy philosophy saying that "if it hasn't been found for several centuries, it can't be found". They were proved wrong. Wiles' proof was rather long and, even more importantly, it depended on highly advanced, hierarchically abstract concepts that "seemingly" have nothing to do with a simple statement about integers, their powers, and their sums.

In this random episode, Pat and Mat try to bring a painting through the door and hang it in the living room. They spent quite some time with it – more than 8 minutes on the video – and tried many methods which still doesn't prove that they have proven that no better solution exists.

In some formal language, Wiles' proof would still require at least thousands of characters. A brute force search for proofs would require at least something like \(O(256^{1,000})\) operations, if you understand me, but even if \(P=NP\) were true and allowed us to search for proofs polynomially quickly, the power could still be what it is in \(1,000^8\) which would make it impossible to find the proof in practice. The hypothetical proof of \(P=NP\) would still be very far from providing us with proofs of every theorem or a fast solution to any problem.

And of course, all the comments about Wolfgang Amadeus Mozart, Bobby Fischer, or Carl Gauss are completely silly and unrelated to the technical question above, except by verbal tricks. After all, these and other ingenious men had superior skills due to some kind of a brute force that others didn't receive from God. It may be great to believe that something divine, qualitatively different, intrinsically creative is behind these men's superiority but it ain't so. Almost everyone has gotten some memory, CPU, and creativity when he or she was born but some people just get much more of it and it – along with some training, hard work, and lucky circumstances – allows them to do things that others apparently can't do or at least don't do.

Four Color Theorem

The four-color theorem says a very simple thing: areas on a map may always be painted by four colors so that no two adjacent areas share the same color.

This is how you may paint the U.S. map with four colors.

Sometimes three colors are simply not enough. Divide a disk to three 120° wedges. They need to have 3 distinct colors. An extra area surrounding most of the disk touches all the three wedges so you need a fourth color.

And five colors is safely enough; it was proved in the 19th century and the proof is elementary.

However, four colors are enough, too. But there is almost no "wiggle room" left. Consequently, the existence of a four-colored painting of the map often looks like a good luck. Equivalently, the known proof is very complicated.

The first proof was constructed with the help of computers. They needed to verify thousands of possible "submaps". In 1996, the number of submaps that require a computer-enhanced verification was reduced to 633 but it's still large. It's been argued that no human can really check the proof by himself.

Boaz Barak tries to argue that 633 may perhaps be a bit larger than one but it will decrease to a number of order one. Well, it doesn't have to. It may be that 633 or 450 will be proved to be the minimum number of submaps that need to be verified in a similar proof. It's still an extremely large number.

Moreover, we're talking just about the four-color theorem, a kindergarten kid's playing with pastels. It seems pretty much obvious to me that more structured problems will require a much larger number of individual cases that have to be checked by the brute force. For example, there may exist an algorithm that searches for length-\(N\) proofs of a theorem in a polynomial time, \(C\times N^\ell\). Such an algorithm may still require an astronomical large memory for the algorithm itself and/or an astronomically large (either smaller or larger or the same) memory for the intermediate data. The fact that we don't know of such an algorithm today doesn't mean that it doesn't exist.

On the other hand, it's still plausible that the four-color theorem also has a different, elementary proof. Two years ago, I was sent something that looked rather promising by a chap and spent many days with that. At the end, there was nothing that looked like a proof. Today, I would probably be able to see that "it can't possibly be a seed of the proof" much more quickly – simply by seeing that too elementary a proof can't possibly know about the "wisdom" or "good luck" that are needed to four-color some difficult enough maps.

Classification of finite groups, largest sporadic groups

A group is a set with the associative multiplication\[

(fg)h = f(gh), \quad f,g,h\in G

\] and with a unit element and inverse element for everyone. A group may be a finite set. How many finite groups are there? Of course, infinitely many, e.g. the cyclic groups \(\ZZ_n\) already form an infinite subset. Moreover, you may always construct Cartesian product groups and other constructions. Can you list all the mathematically possible groups in terms of some well-understood building blocks such as \(\ZZ_n\), \(S_n\), \(D_n\), \(A_n\), and their direct or semidirect products and/or a few extra operations?

Just for two decades, we have known the answer to be Yes.

The achievement – the classification – is written as the union of hundreds of mathematical papers by a hundred of authors. Thousands and thousands of pages are needed to solve the simply formulated problem above. Aside from the easy-to-understand groups above – some of which are coming in understandable infinite families – the classification forces you to discover 26 truly unusual examples of groups, the sporadic simple groups. (Sometimes the Tits group, albeit a bit simpler, is counted as the 27th sporadic group.)

The largest sporadic group is the monster group. Its number of elements is about \(8\times 10^{53}\), almost a million trillion trillion trillion trillions. It's a very particular, fundamental, and important integer in maths that arises "out of nothing". The dimension 248 of the largest exceptional Lie group, \(E_8\), is analogous except that the number of elements of the monster group is vastly larger.

The laymen could also intuitively guess that objects described by very large integers such as this one are contrived, unimportant in practice, and so on. But if you read some articles about the monstrous moonshine, you will see that just the opposite is true. The largest sporadic group is, in some sense, the most elementary sporadic group. An algebra of vertex operators that respect this huge finite symmetry carries an explanation for the coefficients in the expansion of the \(j\)-function, a very natural function (basically unique if holomorphic) mapping the fundamental domain to a sphere. The AdS/CFT dual of the conformal field theory with the monster group symmetry is the "simplest" gravitational theory you may think of, the pure gravity in the 3-dimensional AdS space, at least at the minimal curvature radius.

The latter relationship suggests some kind of a duality. If the CFT has some group with many elements, its dual is simple and has a few fields. There is some complementarity between these two things and the product could be something like \(10^{54}\), if I describe the relationship in a "moral" way that isn't quite accurate.

When we approach some really natural structures in maths – especially the deep maths that has links to the foundations of quantum gravity etc. – we easily and naturally get prefactors that may be as large as \(8\times 10^{54}\) or other large numbers. Of course that if an algorithm gets slowed down by this factor, it doesn't matter that it's still "polynomial": it will be impractically slow and useless in practice.

And those things surely do happen in maths often. Even though our current knowledge is almost certainly biased towards knowing the shorter proofs and methods and be ignorant about the longer ones (because the latter are harder to be found), we already know that there are many important proofs and algorithms and classifications that are extremely long and hierarchically complex.

The string-theoretical landscape

String theory is the richest theory that the mankind knows and one may enter the realm through many entrances. The evidence strongly suggests that it's a completely unique structure with no deformations or siblings. However, it may have many solutions.

In the class of semi-realistic stabilized flux vacua, people have tried to estimate the number of solutions to string theory's equations. The "stabilized" adjective implies that there are no moduli left; no continuous parameters can be freely adjusted. With these extra conditions, the number of vacuum-like solutions has to be finite or countable and if one imposes some additional semi-realistic criteria, he ends up with a finite number. But the number is often quoted as \(10^{500}\) although this estimate is far from a rigorous enough calculation.

At any rate, it seems very likely that powers of a googol are a good estimate of the number of string-theoretical vacua obeying certain very natural (and sort of realistic) constraints that make them very promising as descriptions of the real Universe around us. Whether there exists a principle that picks one of them or some of them according to some non-anthropic criteria remains unknown.

But even if we ignore these vacuum-selection and anthropic questions, it seems true that string theory produces a high number of solutions. If you accept that string theory describes the laws of physics in the most general sense, you may ask whether the laws of physics allow some particular phenomena or creatures. Describe the creature in some way. To find out whether they appear anywhere, you may be ultimately forced to search through those \(10^{500}\) different universes.

The \(P\neq NP\) complexity theorists usually talk about a few dozens of their pet \(NP\)-complete algorithms, not about the problems for "mature scientists" such as the searches for some stringy vacua with some properties. So they think that the traveling salesman and equivalent problems are "almost everything" that is worth thinking about. Well, it surely ain't so. But in the "traveling salesman" type of problem, they know what \(N\) is.

What is \(N\) in the case of the problem "Find a stringy vacuum with some properties"? Needless to say, really fundamental physics problems usually don't come with instructions such as "use the label \(N\) for this and that". In some sense, we are dealing with a single string theory and the finite number of solutions is just an order-of-one extra complication. In some counting, we have \(N=1\). Nevertheless, it makes the search \(10^{500}\) or so times more time-consuming, at least if you have to rely on the brute force method (which isn't clear). It's an example of the huge prefactor that Boaz Barak believes not to occur in practice. It surely does occur – it occurs in the case of many fundamental math problems such as those above as well as in the case of the most natural and fundamental problems in physics, those asking about the basic properties of string theory.

More generally, I want to say that there are lots of "subfields in maths" that may be completely mastered – in the sense that you may turn a previously seemingly creative activity to a mechanical process – but the number of concepts and algorithms that you sometimes need to learn to make these things mechanical is often very large. Think about all the tricks and rules you need to know to integrate large classes of functions or anything else. Think about the huge amount of code hiding in Mathematica which allows you to solve so many things. Mathematica and Wolfram Alpha are examples of projects in which people actually made a significant progress in their efforts to "automatize" many things. While I find it unlikely that Wolfram or others will ever find a gadget that "automatizes everything" – e.g. the tasks "find the shortest proof of a valid proposition" or "find the fastest algorithm that solves something" etc., I don't really have a proof (and not even legitimate, strong enough evidence) and it's just wrong to promote big ideologies that try to "frame" everything so that it's compatible with the predetermined conclusions even though the truth may be different and the alternatives might be "framed" as well, if you tried a little bit.

I feel that people such as Boaz Barak are extremely narrow-minded in their focus on 3SAT and several other, comparably down-to-Earth, problems; extremely biased against the existence of more sophisticated algorithms and proofs than they know now (partly because they don't really want to learn something difficult and new and they don't want to be shown to have been unable to find out something remarkable that they should have found if they were ingenious enough); and they spend way too much time by rationalizations of the limited progress in the search for much better algorithms to solve various things.

Moreover, their experience with the really nice and important problems that have some "beef" is limited, otherwise they couldn't possibly claim that large prefactors and very long minimal proofs or algorithms can't appear in maths. They surely can and they often do. At least in this sense, "naturalness" (in the sense of the preference for order-of-one coefficients in answers to questions) fails in maths: the number of concepts and the amount of time you need to solve a short problem formulated as a length-\(N\) string is often much larger than \(N\) or a small multiple of its modest power. As Barbie correctly said, math class is tough. But it should still be taught at schools.

Mathematics is a complicated network of things that are easy, things that are hard, things that look hard but they're easy, things that look easy but they are hard, and the hardness has very many aspects. The idea that all problems deserving to be a part of maths may be divided to easy and hard ones in a binary way is a childishly naive concept. Very simple questions often require one to discover and (in the case of others) master very complicated bodies of knowledge to understand the answers. And very rigid and in this sense simple laws – like, in the most extreme case, the laws of string theory – are often able to produce an amazingly rich set of phenomena, implications, and relationships, i.e. pulsating and intelligent worlds with lots of fun. Complexity theorists who effectively assume that solutions and straightforward algorithms have to be as short and easy-to-learn as the questions themselves are effectively assuming that "what comes out had to come in". But this just isn't true in maths and science. It's the whole point of physics, for example, that we can explain diverse things with a limited canon of laws – but these in principle simple laws have many consequences and create many patterns that have to be understood if you want to master the implications of the laws.

The idea that "proofs and solving algorithms are about as short and as deep as the formulation of the problem" may be described as the "garbage in, garbage out (GIGO)" paradigm and this is what the research may easily look like if you insist on similar prejudices.


  1. The permanent is certainly not in the class NP...

  2. Thanks, fixed. But it's stronger than NP-complete, a calculation of the permanent would solve the NP-complete problems.

  3. As you, Lumo, might expect from me, I think you are the clear winner of this argument by how you aptly encroach [on and make use of] psychologists' t theoretical and terminological territory. ;-)

  4. There are also different different notions of complexity which are independent.

    Kolmogorov complexity, for a bit string, is the size of the minimal program which is able to generate the string.
    Charles Bennett logical depth, for a string, is defined as the execution time of its minimal program.

    Practically, if we work with images, for instance, Kolmogorov complexity is the size of the zip file, while Charles Bennett logical depth is the time needed to unzip the file.

    For instance, a completely aleatory image has a big Kolmogorov complexity (the zip file is big), but a little logical depth (it is not long to unzip)
    A completely black image has a very little zip file, and can be quikly unzipped.
    A image with structures has a big Kolmogorov complexity, but also a big logical depth.

    (For french readers, see the article by Jean Paul Delahaye in "Pour la Science", May 2013)

  5. Right, thanks.

    But you must understand that the degree of compression that a ZIP-like compressor achieves is still very primitive. It's efficient for the kind of files we normally want to compress. But they're no deep maths. They're just simple algorithms to compress files by noticing the different frequency of different characters, sequences, or colors in the file.

    An omnipresent compressing algorithm could e.g. take these pictures


    and write the short Mathematica code that was needed to produce the graph of the Ramanujan functions above. Some things may be compressed vastly more efficiently if you have some deep knowledge and make some guesses - that are unlikely to be made completely by chance.

    It's exactly the idea that simple programs such as ZIP compressors are representative of the relationships between "input" and "output" or between "problems" and "solutions/algorithms to solve" that I am fighting against. In the actual maths with everything we have already understood, the relationships between the two sides are much more complex, hierarchical, and technological sophisticated than a real-world ZIP compressing program. Correspondingly, maths often leads to a much vaster asymmetry between the length as well as sophistication of the two sides.

  6. Thanks, Peter, this is always appreciated from you! ;-)

  7. SAT is about satisfiability (is there at least one assignment of values to variables that would make the formula true?), you are describing tautology instead, that is a co-NP-complete problem.

  8. As conventional computers only have a finite number of states, I wonder if one could somehow use the fact that quantum fields have an infinite number of degrees of freedom to devise a completely new class of (quantum) computers. Maybe these could even tackle NP hard problems. Is this idea flawed ? Has anybody already dwelled on that ?

  9. Hi, thanks. I didn't write anything about the precise form of the propositions in this class (and the allowed quantifiers and their locations), and at this unspecified level, the problems "satisfiability" and "tautology" are completely equivalent.

    "Proposition P is satisfiable" is the same thing as "not [not(P) is a tautology]".

  10. Lubos - Of course there are things for which tremendous compression ratios exist or may be found, but for the ordinary run of things there's only so much redundancy to exploit. Are you suggesting that's wrong?

  11. Well, the P vs NP problem is more or less the opposite of the title of this entry, is it? I mean, it is about if any long question (long in the sense that the set of options to answer grows faster that any polynomial) with a short answer (verifiable in polynomial time) has also a short way to find such answer solution.

    P=NP amounts to say that "If a long question has a short answer, the discussion needed to examine the question and find the answer is also short"

  12. I think you're getting hung up on language, for you "[lossless] compression" = "identify redundancies to shrink the size of the data file". Another meaning that I think S. Wolfram and Lubos are ascribing to "compression" is, "try and see if this seemingly chaotic, wild and complex object with very few obvious redundancies is in fact the outcome of a set of rules that can be written on one sheet of paper but that yield amazingly rich structures when executed". But I'll let Lubos answer for himself.

  13. "highly advanced, hierarchically abstract concepts"

    Nice phrase. Of course we find layered abstractions in Kant, Heidegger, and Sartre too. But with them the trail of crumbs isn't enticing enough to lead one on: one doubts that there will be any treat at the end of the journey. Not so in maths and physics we know now. Or rather Lubos knows and I trust that he knows.

    One of the pleasures of reading this blog is following the peregrinations of a superior mind which, one feels at heart, does not belong to a bullshitter.

  14. Sometimes the Tits group, albeit a bit simpler, is counted as the 27th sporadic group.

    Is that what I think it is? :)

  15. Your last paragraph sums up my feeling as well.

  16. "If a long question has a short answer, the discussion needed to examine the question and find the answer is also short"

    Quite right: https://www.univie.ac.at/Anglistik/easyrider/data/WhartonJames.htm

  17. Mr Barak, only people who don't have arguments build their case on bullying others and screaming about a "consensus". This stuff - intimidation and "consensus" - has no place in legitimate science and you should be deeply ashamed for having used it in similar debates. I am saying it despite the fact that I find P ≠ NP to be more likely, too. But most of the other things you wrote in your blog entry is rubbish.

    Well, I may have contributed to the Wikipedia page about string theory but I prefer not to look at such things because chances are high that they would make me upset. I am saying it with the knowledge that most of Wikipedia seems extremely useful and mostly valid to me.

    I am not confusing anything. On the contrary, I am emphasizing that people like you completely overlook all the other aspects than the running time - which is the least interesting quantity in the list from a mathematical viewpoint.

    Totally agreed that the four-color theorem or some particular primality testing algorithms aren't as deep as Wiles' proof of FLT. FLT has closer to RH and other math results for "adult" mathematicians.

  18. I do not believe that P ≠ NP because it's the consensus. I believe it because all the evidence we have points to it being true. But I felt it was too much of a digression to start to list and explain all the evidence, especially if you already agree that most likely this is the case. If you're interested in this question, Scott had a blog post about it which is a good starting point (though even it doesn't attempt to cover all the evidence we have for this question).

    "Depth" is a subjective term, and I do not want to start arguing which results are deeper, since I don't think it's productive. My point was merely that it's possible (and indeed actually happens) that an algorithm for a computational problem requires a lot of ingenuity, creativity, and mathematical depth to invent and/or analyze its correctness, but its running time is still fairly efficient.

    So, the fact that I doubt the existence of an n^{1000} algorithm for SAT is not because I think all algorithms must be simple.

  19. Dear Smoking Frog, I don't think that anything is wrong with compression algorithms we're using. They're doing some work, like coffee machines do.

    I am just protesting against the suggestions that ZIP compressors, an algorithm to semi-solve the traveling salesman, or any other particular algorithm like that describe almost all of human knowledge or something of the sort.

    Just like coffee machines, they're extremely tiny percentages of the human knowledge, know-how, mathematics, and/or science. There's no way how the ZIP compressor that is useful for the "ordinary runs of things" could know much about some completely different questions such as the four-color theorem, Riemann Hypothesis, confinement in QCD, string theory, or the black hole information puzzle. It just doesn't.

    These algorithms or any other toy model algorithms or a union/class of 100 such algorithms that people in a discipline study is still a supertiny fraction of the human knowledge and in no way it contains the rest. From the viewpoint of ZIP compressors, almost all the things we need to solve/explain/compress (into theories) in the rest of maths and science are "extraordinary runs of things" and the compression algorithms for the "ordinary runs of things" are just totally useless and inadequate.

  20. Dear Eugene, if you don't add any extra attribute to the "two definitions" of a compression in between the lines, they're totally equivalent!

    It's really the point. If you're allowed to talk about redundancies that are expressed or encoded in the most general way, you will realize that the removal of all such redundancies must be interpreted as the more generally sounding "write a shorter program that generates this as an output".

    The only difference is between the lines. In the "first definition", you're just assuming that only some "immediately visible", well-known redundancies are taken into account while shortening the file. In the second one, you're imagining that the program is doing something much more sophisticated and intelligent, dependent on the content and its internal correlations etc.

    But there's no "qualitative gap" between the two definitions. As you go from the first understanding to the second, you're just gradually increasing the sophistication of the theories and/or compression algorithms.

  21. LOL, I don't want to claim that I know what you think! ;-) But if you allow me to guess what you think :-), then the answer is No, it's named after mathematician Jacques Tits who found it in the 1960s.

  22. No, Alejandro, the title is about a key argument why P isn't NP, and the argument is really an assumption that if there exists a polynomially speedy algorithm for the NP-complete problems, its running time must scale like a small number times a small power of N which would mean a total revolution in [many things]. I listed reasons to think that this may not be so because the numbers that are not of order one routinely appear in important math problems.

  23. I have been following this blog for a while because I want to hear the other side of the argument in regards to global warming. I thought that a theoretical physicist would have some useful insight into what is really going on.

    However after reading this article I think I will not trust Mr. Motl with anything other than physics. He has a very simple understanding of complexity theory. Even an undergraduate would be able to see the silliness of what his claims are.

    No computer scientists actually thinks that if we find that P=NP we will be able to solve all problems very quickly. I don't know of anyone that would take the claim that someone that understands Mozart would be a Mozart literally. It is obviously an analogy to try to explain the concepts to people with a less technical background.

    Mr. Motl speaks of going against the consensus and that is fine. But I'm afraid that Mr. Motl only goes against the consensus simply to go against the consensus.

    I think this post pushed me a little bit further into believing the claims of the climate scientists because if his analysis of complexity theory is this bad, then I cannot trust his analysis of climate change to be any better.

  24. Well, Scott Aaronson surely said that. One may debate to what extent he meant it literally - probably not literally about the actual Mozart etc. but he surely meant it as an argument that P=NP would make the world a totally different place because it would mean that the boundaries between ingenuity and averageness would be erased. I have demonstrated that this ain't the case. If you don't understand the point, then I am sorry.

    I never go against consensus for the sake of it. And in fact, if you analyzed my opinions carefully, you would see that my opinions are the closest match for the top mainstream experts in almost all the discussed disciplines among virtually all participants of these debates.

  25. I would like to read Mr. Anderson's post do you have a link?

    As you explained polynomial time doesn't imply accessible to us. If the leading coefficient is large or the exponent is large then we will not be using the algorithm. This is known even to undergraduates and it's not something mysterious. I find it hard to believe that Scott Anderson got a PhD in Computer Science without understanding this fact. I find it more likely that you are misrepresenting what he is saying.

    The P=NP problem doesn't deal with ingenuity and creativity so I don't seriously think that anyone is seriously trying to make that connection in complexity theory. Sure we might have dreams and wishes because it's not hard to think of the human brain as a computing devise and hence have a link to creativity, but that's not what is pushing the research ahead.

    Proving P=NP might not have world changing implications in the average person's life, but it would have big implications in complexity theory. I can see that this what Mr. Anderson probably meant. Also depending on the proof, it could have larger implications on the world. We just don't know.

  26. Hi Eric, his name is Aaronson, not Anderson. The Wikipedia quote appears as the "philosophical" argument in this 2006 post of his:


    He wrote a whole book with many similar claims recently


    and got not only PhD but also a tenure at MIT despite his misunderstanding of many facts about the structure and foundations of the human knowledge. After all, many of those who gave him the tenure misunderstand these things as well.

  27. I can see how that could have misunderstood. What he means is not that if we prove P=NP then we would have this equivalence between understanding Mozart and composing like Mozart. What he is saying is that if P=NP then why hasn't evolution figured out how to solve NP problems in P time?

    You could say that it might be because the solutions to those problems have large coefficients or large exponents. However that would mean that it's just conincidence that the NP-complete problems have large coefficients and large exponents in such a way to prevent evolution from finding them useful.

    I did not sense that he is trying to prevent people from proving that P=NP. His language is a little toungue in cheek with the whole allusion to "the faith" but that's it.

    Could I ask why you focus only on his 8th and 9th arguments? I think I would have found your article more useful if you had mentioned that he had 10 arguments and you didn't agree with 2 of them but the rest were ok.

  28. If you have followed then you would realize that you cannot really support your attitude. I think you should assert a specific point and basis. I am sure undergraduates are taught much that is wrong so how shall I know what undergraduates will say is silly and why does that matter ?

  29. "I think this post pushed me a little bit further into believing the
    claims of the climate scientists because if his analysis of complexity
    theory is this bad, then I cannot trust his analysis of climate change
    to be any better."

    Eric, I want you to know that you just made someone's day.

  30. Great Scott! Just in case that would also make your day it is a reference from the Rocky Horror Picture Show and not necessarily an endorsement. More seriously I do think that the dispute with dr. Motl is the most educational stuff available to those of us who never did when young and will otherwise never be able to learn from such great minds. I have now one year followed TRF and have for thirty five years after my undergrad days spent too much time in ratio to dollars earnings too little in ratio to intellectual wealth studying.
    You and Motl are together with some others like dr Strassler and Susskind elevate the Internet beyond all my youthful expectation

  31. It's Dr Motl. I think it's embarrassing for Aaronson to pick this post as the one that made his day.

  32. I can usually understand everything on this blog but this one was above my head.

  33. Dear Dr Barak, thanks for what I asked you for. I am not sure whether Barry Mazur shares your opinion - we should have discussed these matters with him during the Harvard Society dinners - but what you wrote is just wrong for fundamental reasons.

    First, if one decides about the existence of one object, it is not enough to check "five candidates" what the object could be. The hypothetical polynomial algorithm for 3SAT may be completely different (and more sophisticated) than some five random guesses. It's like you're trying to find whether someone is responsible for the Boston Marathon explosions, you check five people in Boston, they're innocent, and you conclude that the explosion occurred spontaneously.

    Second, the "prediction methodology in maths" as you described it is really wrong because if you assume a proposition that is *wrong*, it implies ("predicts") anything because the system including the original axioms is inconsistent! Even if it is consistent, it just doesn't mean that the assumption follows from the axioms. Notoriously, the Axiom of Choice may be used to quickly imply many results and speed up proofs but it does not follow from the common axioms of set theory.

    Third, I disagree that heuristic proofs are much easier than the rigorous proofs. If one really knows and feels how to solve a problem at the heuristic level, he is essentially done, and the job for a mathhematician who writes neat formulae in TeX to match a rigorous conventional format of a formal proof is pretty much a job for a secretary.

    The point is that there is even no heuristic proof that P isn't NP. The comments you're wrote aren't a heuristic proof. They're a sequence of emotions and logical fallacies.

  34. Probably because it's about what Richard Feynman called filosify. I'm sure he would rate this discussion about as highly.

  35. Eugene - No, I'm not getting "hung up on language." Maybe you are - you're using a narrower concept of redundancy than I was. If the graph of a mathematical function is far larger than the function (written in algebraic notation), the graph is redundant. Is there something wrong with that?

  36. Dear Frog, when you put it that way, no, I can't see anything wrong with that.

  37. Well, I didn't know there were people who thought LZW and the like were a major part of human knowledge. :-) I didn't think Aaronson or Barak went that far. Maybe I should reread your posts and disputes.

    (Man, I hate the word "post" for an article or essay written by a blogger! I get mad every time I "have" to use it.)

    I meant that I didn't think major improvements were likely in compressing the kinds of things LZW and the like are used to compress, no matter by how much human knowledge increases - unless we count "physical redundancy," such as the representation of a bit by a huge number of molecules instead of one molecule.

  38. Yes, a false proposition can imply anything including other false propositions. Therefore, if a false proposition A implies a non-trivial proposition B then a-priori there is some chance for B to be false too. Thus, if you check and see that B is actually true, then it makes you somewhat more confident that A was true.

    The term "non-trivial" above is not precisely defined, and this is the part where it takes some expertise and intuition in the subject matter. Mazur's writeup has a discussion of this matter, and notes that the above approach is one of the ways mathematicians gained confidence in the truth of the Reimann Hypothesis.

    I did not claim I have any heuristic proof for P ≠ NP, nor that it is enough to prove this statement by checking a finite number of candidate algorithms.

    What I said is that P ≠ NP implies that very different types of algorithms will fail to solve NP-hard problems, and these amounts to a number of very different and non-trivial mathematical statements that, if we didn't think that P ≠ NP, we would not necessarily have guessed they are correct. So, when we use the assumption that P ≠ NP as a guide to give us a large family of different mathematical conjectures (one for each potential algorithm), which are actually interesting and non-trivial in their own right, and we are able to verify many of these conjectures, it gives us more confidence that our guide is correct.

    Here is one recent example. In the 1980's there was an attempt to show that P=NP via a polynomial-size linear program for the travelling salesperson problem (TSP). In 1988 Yannakakis showed that the particular LP proposed, and in fact even a larger family of such LP's, can never solve the TSP. He also made the stronger conjecture that there does not exist a polytope defined by a polynomial number of constraints that projects to the TSP polytope. This conjecture was proven in this 2012 paper of Fiorini et al, who used a connection between these kinds of questions and classical and quantum communication complexity.

  39. No, Dr Borak. The point is that a false assumption A - along with other axioms - implies both true and false propositions B! It's because the axioms plus A are an inconsistent set. If you can derive a true B from an A, it says absolutely nothing about the validity of A.

    In physics, or with assumptions A of a continuous type, the situation would be different because A could allow one to *calculate* results - predict continuous values. Then a successful prediction is a nontrivial evidence for A. But it's only nontrivial because the right value of B has many (usually infinitely many) alternative, wrong values of B.

    But if A,B are just binary propositions with the a priori assumed equal chances to be true and false, one can't say anything. The induction of truth values of A from the validity of their implications B is just impossible.

  40. Titurbation theory -- we need to work on that.

  41. Do Aaronson and Barak remind anyone else of Brian Cox, or is it just me?
    One could say (stealing a quip) that their arguments run the gamut from A to B. (to C)

  42. "Titurbation" is an anagram of "attribution," so I've been thinking of selling the rights to the AGW alarmists for use in their obfuscatory work. :-)

  43. Why is it that most of your examples are about terrorism?

  44. Thanks for a very interesting post. This comment is close to the level of trivia, but not quite. 9/11 was not the first time an airplane flew into a NY skyscraper. In 1945, a US B25 medium bomber, the kind used in the Dolittle raid of 1942, hit the Empire State building. I've heard, but not verified, that the World Trade towers were designed to survive being hit by a fully fueled Boeing 707, the largest commercially available plane back then. That design counted on asbestos around the girders, but the use of asbestos was prohibited while the towers were still under construction.

  45. Dear Clayton, I don't understand the proof - or the techniques in this section of maths, for that matter - so please don't expect an expert report.

    The proof only has 5 citations at this point, mostly self-citations:


    It's the first one in the list. So based on this data, I wouldn't expect that it's something that all mathematicians have been working on in the latest year.

    I erased the parenthesis from an URL in your comment which was otherwise broken.

  46. The argument for P!=NP is simply that reversing a general computation "should" require exponential search, because irreversible computation can be rewritten as a reversible compuatation, where you are producing random junk bits as you go along. If you have a computation of N steps starting with initial computer memory state I and producing final state F, you can embed it in a reversible computation starting from initial state (I,0) with the initial state expanded to include order N garbage bits set to 0, and end up in final state (F,G) where the final state of the garbage bits is G. Now you can reverse it trivially. But to reverse it without knowing the garbage bit values, you need to reconstruct the garbage bits from F.

    The point of P!=NP is that to construct the garbage bits for an arbitrary computation are not structured, so to go from F to (F,G) requires an exponential search through a sizable fraction of all 2^N possibilities for G, even if you know the algorithm and you know N. It is completely reasonable, it should require an exponential search, because whatever structure is present on G is going to be more complex than any fixed algorithm, because the structure of G grows in complexity with the forward algorithm complexity and with time (or equivalently, with the initial state F and with time). This is not anything like a proof, because the space of G's are structured by the fact that they are generated via computation, but it should be possible to prove that any algorithm for computing G from F quicker than exponentially can be spited appropriately, by choosing the proper algorithm which would produce garbage bits which the backward algorithm would mispredict.

    This argument is not dependent on any nonsense about "if it was possible, it would already be done", or "the consequences would be appaling" or anything like that. It is a heuristic reason to believe the conjecture, and you should (after understanding the argument) just believe that the conjecture is true. The social nonsense is propped up because the intuitions about why it should be true are hard to make precise, and will not be persuasive until the proof is available.