Thursday, November 05, 2020

Counting of SAT assignments: a parametrically less brutal brute force algorithm

I have often returned to my view that \(P=NP\) has been neither proven nor disproven and that both possibilities should be considered "comparably plausible", because any other assumption is just a reflection of someone's prejudice, a prejudice that is both dishonest and dramatically harmful to progress in the case that the belief is incorrect.

Donald Knuth has expressed a similar conclusion with quite some details about the philosophical Ansatz. In particular, he has emphasized that the number of algorithms – including complex, extremely clever algorithms that exploit a degree of cleverness, know-how, and identities that humans don't know and won't know for some time – is huge. Well, the number of algorithms is obviously infinite but the real point is that new clever features and identities (which have the power to simplify the solutions) keep on emerging once you allow "longer" algorithms.

As you will see, Giorgio Camerani has quite some computer science background. But he has posted the following beautiful preprint
The Long, the Short and the Random
as an unaffiliated researcher. You may verify that the paper is beautifully typeset and formatted like a perfectionist's mathematics-style paper in computer science. More importantly, it is a great example of the possibilities that the \(P\neq NP\) religious fundamentalists overlook and want to overlook.



Camerani's paper solves a variation of the Boolean satisfiability problem. That refers to the following task: consider the space of \(n\) Boolean (true or false) variables, connect them into some expression using "and", "or", and "not", and try to decide whether there is an assignment of the \(n\) bits for which the formula is "true". OK, simple enough. Camerani solves a more general problem: count for how many assignments the formula evaluates to "true". That is a stronger problem because you may solve the original decision problem by comparing the result of the counting problem with zero. He must assume something about the "randomness" of the formula and about the length of the clauses.
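To have a concrete toy example in mind (my own, not taken from the paper): for the formula \((x_1\lor x_2)\land \lnot x_3\) with \(n=3\) variables, the decision problem asks whether at least one of the \(2^3=8\) assignments makes it true (yes, e.g. \(x_1\) true, \(x_2\) and \(x_3\) false), while the counting problem asks how many of them do; here the answer is \(3\).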

The Boolean satisfiability problem is known (has been proven) to be NP-complete. It means that any problem in the NP class may be reduced to the Boolean satisfiability problem in polynomial time, so a fast algorithm for satisfiability would yield a comparably fast algorithm for every problem in NP. It is not known whether a polynomial-time algorithm exists for any NP-complete problem. Here, the new algorithm is unlikely to be polynomial (a polynomial algorithm for the counting problem would be an even bolder claim than a polynomial algorithm for the decision problem because counting the assignments is at least as hard as deciding whether one exists). So the number of steps is still closer to an "exponential one" but the exponent may be parametrically smaller than \(n\).



The most naive brute force algorithm simply visits all \(2^n\) assignments of the \(n\) bits, evaluates the Boolean formula for each of them, and if the result is "true", it increments the counter. This is the "truly naive brute force" algorithm. It seems fair to say that the \(P\neq NP\) fundamentalists insist that this "truly uninspired brute force" method of attacking the difficult task is basically the only one, or the optimal one, or, to say the least, that the possible speedup isn't game-changing (and perhaps that it is not even parametric, depending on how hardcore the \(P\neq NP\) believer is).
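Just to be explicit about what the "truly naive brute force" means, here is a minimal Python sketch of it (my illustration, not Camerani's code; the CNF encoding with DIMACS-style signed integers and the example formula are my arbitrary choices):

```python
from itertools import product

def count_satisfying_assignments(num_vars, clauses):
    """Naive brute force: test all 2^n assignments of the n Boolean variables.

    `clauses` is a CNF formula, i.e. a list of clauses; each clause is a list
    of integers where +i means "variable i is true" and -i means "variable i
    is false" (1-indexed, as in the DIMACS convention).
    """
    count = 0
    for assignment in product([False, True], repeat=num_vars):
        # the formula is true iff every clause contains at least one true literal
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count

# The toy formula from above, (x1 OR x2) AND (NOT x3): prints 3
print(count_satisfying_assignments(3, [[1, 2], [-3]]))
```

The loop obviously performs \(2^n\) evaluations of the formula, which is exactly the scaling that Camerani's algorithm is claimed to beat.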

But Camerani shows an algorithm that seems faster. Instead of \(2^n\), it takes \(2^{o(n)}\) units of time, where \(o(n)\) is a function obeying \(o(n)/n\to 0\) for \(n\to\infty\). You might imagine that \(o(n)=n^{1-\epsilon}\) for some small positive \(\epsilon\) but a more precise estimate isn't available at this moment. He has presented both a rigorous proof of the bound (a hopefully correct proof) and some experimental and circumstantial evidence that his algorithm takes parametrically fewer steps than the "naive brute force search". You need to read the paper. But the algorithm uses several layers of clever know-how:

* a new combinatorial formula relating monotone sub-formulae to the required (or, on the contrary, violating) assignments
* the insight that "longer translates to shorter": longer formulae actually do lead to some simplification when it comes to the running time
* a brute force treatment that is nevertheless less brute than the naive one

There is a very broad philosophical sense in which these "heretical loopholes" are disliked by the \(P\neq NP\) fundamentalists for the very same reason why they are likely to be outright hysterical about climate change, the Covid-19 disease, or many other things: they are totally obsessed with the worst-case scenarios and they would like to crucify everyone who isn't!

So now, Camerani still uses the computer to do a rather extensive task for which the computer's brute force is needed. But it seems that he needs much less brute force because of some clever "pre-processing". A large part of valuable algorithms – and even of our civilization – is based on this kind of know-how: it still requires a lot of brute force but less brute force than at the beginning. It is still hard work to pilot an airplane but it is easier than running 5,000 miles (or trying to fly by vigorously waving your arms as if they were wings). But some clever engineering had to go into the design and construction of airplanes.

Similarly, if a search engine wants to offer you the most relevant pages for a given search, it doesn't have to go through all web pages in the database (on the Internet!) and calculate a score for each page before the billions of scores are sorted. Instead, it uses an index plus some database created with the PageRank algorithm (something based on finding the dominant eigenvector of a matrix remembering which pages link to which other pages). This is a nice, clever, multi-million-dollar algorithm that was productively used by a company when it was still useful for the people – and not yet a trillion-dollar computer department of a global radical Marxists' body that wants to conquer the world and abolish human freedom on the planet.
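Just to indicate what kind of linear algebra hides behind that pre-processing, here is a minimal power-iteration sketch of PageRank in Python (a textbook toy version with a made-up four-page web, obviously not Google's production code):

```python
import numpy as np

def pagerank(links, damping=0.85, iterations=100):
    """Power iteration for (a textbook version of) PageRank.

    `links` maps each page to the list of pages it links to. A page's score
    is the stationary probability of a random surfer who follows a random
    outgoing link with probability `damping` and otherwise jumps to a
    uniformly random page.
    """
    pages = sorted(links)
    n = len(pages)
    index = {p: i for i, p in enumerate(pages)}

    # column-stochastic matrix remembering which pages link to which pages
    M = np.zeros((n, n))
    for page, targets in links.items():
        if targets:
            for t in targets:
                M[index[t], index[page]] = 1.0 / len(targets)
        else:
            # a dangling page is treated as linking to every page
            M[:, index[page]] = 1.0 / n

    rank = np.full(n, 1.0 / n)
    for _ in range(iterations):
        rank = (1.0 - damping) / n + damping * (M @ rank)
    return dict(zip(pages, rank))

# A made-up toy web of four pages
print(pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}))
```

The point is the same as with the SAT counting: the index and the dominant eigenvector are computed once, in advance, so that an individual search query no longer requires the "truly naive" scan over billions of pages.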

So brute force is still useful – in computer science, mathematics, physics, natural sciences, and engineering – but lots of clever features, identities, and theorems that are incorporated into algorithms and recipes often make the brute force less back-breaking, sometimes parametrically so. The open question (a question for "engineers") is really quantitative in character: if some percentage of the back-breaking labor may be eliminated, what is the percentage that is still left? Unless there is a real proof, there exists absolutely no basis for claiming that "almost all of the brute force will always be left", just like there exists no basis for the statement that climate change or SARS-CoV-2 will basically kill everybody.

To do something useful in these fields – algorithms counting assignments, medicine, or the energy industry – you just need to get your hands dirty a little bit and work with possible improvements that neither "totally eliminate all the risks and work" nor are "negligible improvements". Our sweat glands make most of us survive 1 °C of global warming; antibodies, T-cells, and other parts of our immune system make some 99.85% of us survive Covid-19 and new drugs may push this number even closer to 100%; and identities linking assignments to sub-formulae allow us to count the assignments in a generic formula without dealing with every assignment.

Incidentally, Mr Trump, I don't believe you can make a "safe deal about your disappearance" with any politicians now, simply because the "Democratic elites" who would negotiate with you are likely to lose power soon, too.
