Thursday, February 22, 2018

Questionable value of inequalities in physics

Bill Zajc brought to my attention a very good talk that Raphael Bousso gave about his recent and older work. Inequalities play a very important role in his work. I am much more willing to appreciate the value of an inequality than I was when I was a kid or a teenager. But much of that old sentiment has survived: I don't really believe that a typical inequality tells us too much about the laws of physics.

My first realization is that inequalities incorporate much less information than identities. Imagine that you're asked how much \(8+9\) is. Many of you will be able to answer\[

8+9 = 17.

\] The percentage of TRF readers who can do it is significantly higher than on almost all other websites in the world. ;-) OK, but some people could also say that they're not quite sure but\[

8+9 \gt 10.

\] Eight plus nine is greater than ten, they figure out. That's nice and it happens to be true. But this truth is much less unique. In fact, someone else could say\[

8+9 \gt 12

\] which is another inequality of the same type – a strictly stronger one, in fact.

So inequalities seem to be "far less unique" than identities. You could ask what is the strongest possible inequality of this kind. The answer would be something like\[

8+9 \gt 16.999\dots

\] Well, there is no single "strongest" inequality of this kind because the set of numbers that are smaller than \(8+9\) has a supremum but not a maximum – the limit, \(17\), is already outside the set. So you may replace the inequality by the statement that "the supremum of the set is \(17\)" but if you do so, the statement becomes an equality or identity. It is no longer an inequality.

Now, if you have competed in mathematical olympiads, you must have solved a large number of problems of the form "prove the inequality [this or that]". There are lots of inequalities you may prove. For positive numbers \(X_i\gt 0\), \(i=1,2,\dots N\), the arithmetic average is at least as large as the geometric one:\[

\frac{1}{N}\sum_{i=1}^N X_i \geq \sqrt[N]{\prod_{i=1}^N X_i}.

\] Whenever \(X_1=X_2=\dots = X_N\) is violated, the sign \(\geq\) may be replaced with the sharp \(\gt\). That's great. As kids, you may have learned some proofs of that inequality – and similar ones. You may have invented your favorite proofs yourself. Some of the fancier, "adult" proofs could involve the search for the minimum using the vanishing derivative. Many of us loved to design such transparent proofs and we were sometimes told that such proofs based on calculus "weren't allowed". But the proofs based on calculus are straightforward: even in the "worst possible case" – at the minimum located by the vanishing derivative – the inequality still holds, so it holds everywhere.
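
Since such an inequality is a purely mathematical fact, even a dumb numerical check illustrates it. Here is a minimal sketch in Python – the random ranges, the sample sizes, and the tolerance are my arbitrary choices, not anything canonical:

```python
# A quick numerical sanity check of the AM-GM inequality (an illustration, not a proof):
# for random positive samples, the arithmetic mean never drops below the geometric mean,
# and the two coincide exactly when all the entries are equal.
import numpy as np

rng = np.random.default_rng(0)

for _ in range(1000):
    N = rng.integers(2, 10)
    X = rng.uniform(0.1, 100.0, size=N)    # X_i > 0 is the only assumption
    arithmetic = X.mean()
    geometric = np.exp(np.log(X).mean())   # exp of the mean log = N-th root of the product
    assert arithmetic >= geometric - 1e-12

X = np.full(5, 7.0)                        # the equality case: all X_i equal
print(X.mean(), np.exp(np.log(X).mean()))  # both are 7.0
```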

I don't want to waste your time with particular proofs. But what I want to emphasize is that the inequalities – such as the ordering of the arithmetic and geometric mean – are purely mathematical results. You may prove them by pure thought. The inequalities have some assumptions, such as \(X_i\gt 0\) here, but everything else follows from the laws of mathematics.

A point you should notice is that no laws of physics are needed to prove a purely mathematical inequality. Equivalently, when you prove such an inequality, you're not learning anything about the laws of physics, either. Imagine that you may hire as many great pure mathematicians as you want. There are many candidates and most of them are unable to look for the right laws of physics – a search that requires some special creativity as well as a comparison with empirical data.

With these employees, it's clear that you're no longer interested in the detailed proofs of the inequalities. There are many ways to prove an inequality. You're not even interested in the inequalities themselves – there are many inequalities you may write down, as the example \(8+9\gt 10\) or \(12\) was supposed to remind you.

Instead, with this team of collaborators, you will be interested in the assumptions that are needed to prove the inequality.

So the statements such as \(X_i\in \RR^+\) may remain important because they're the types of statements that remain relevant in physics. In the context of physics, we have lots of defining identities for physical quantities such as the density of the electromagnetic energy:\[

\rho = \frac{|\vec E|^2 + |\vec B|^2}{2}.

\] By pure mathematics, the real vectors \(\vec E,\vec B\) automatically give you \(\rho \geq 0\). Is that statement important? Is it fundamental? Well, it's important enough because you need the positivity of the energy to make many other, physically important statements. The vacuum is stable. Superluminal signals or tachyons are outlawed. And so on. But I would say that the statement isn't fundamental. It's a derived one, almost by construction.
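
For what it's worth, the positivity here is literally just "a sum of squares is never negative". A trivial numerical illustration – the random field values are my own, the normalization is the one in the formula above:

```python
# rho = (|E|^2 + |B|^2)/2 is a sum of squares, so it cannot be negative for any real E, B.
import numpy as np

rng = np.random.default_rng(1)
E = rng.normal(size=(10_000, 3))    # random real electric field vectors (illustrative samples)
B = rng.normal(size=(10_000, 3))    # random real magnetic field vectors
rho = 0.5 * ((E**2).sum(axis=1) + (B**2).sum(axis=1))
print(rho.min() >= 0.0)             # True: the positivity holds "by construction"
```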

In physics, the energy conditions – some variations of the positivity of the energy density – are an intermediate case. Sometimes, you want to view them as purely derived mathematical statements that follow from others. Sometimes, you want to view them as general axioms that constrain your theories – and these theories' formulae for the energy density \(\rho\) in terms of the more fundamental fields. Only in the second approach may the energy conditions become "fundamental". And I think that the fundamental status of such theories (or axiomatic systems) is unavoidably temporary.

As Bill and I agreed, there are two inequalities linked to important principles in sufficiently old, well-established physics. One of them is\[

\Delta S \geq 0.

\] The entropy never (macroscopically) decreases. It's the second law of thermodynamics. Just like in the case of the energy conditions, it may be either viewed as an axiom or a fundamental principle; or as a derived mathematical statement.

In thermodynamics, the second law is a fundamental principle. Thermodynamics was formulated before statistical physics. People were trying to construct a perpetuum mobile and after some failed attempts, they realized that the efforts were probably futile and their failures could be generalized: the perpetuum mobile is impossible.

Some would-be perpetuum mobile gadgets are impossible because they violate the first law of thermodynamics, the energy conservation law. Others are impossible because they need heat to move from a colder body to a warmer one, and processes like that are also impossible. People tried to think about the various ways to describe what's "wrong" about these apparently impossible processes and they invented the notion of entropy – decades before Ludwig Boltzmann wrote the entropy as the logarithm of the number of macroscopically indistinguishable microstates:\[

S = k_B \ln N

\] Within Boltzmann's and other smart men's statistical physics, the second law becomes a mathematically derived law. The principle may suddenly be given a proof and the proposition along with the proof is usually called the H-theorem. My personal favorite proof – discussed in many TRF blog posts – uses time reversal. The probability of the transition \(A\to B\) between two ensembles of microstates is related to the probability of \(B^* \to A^*\), the time-reversed evolution of the time-reversed states.

The probability for ensembles is calculated as a sum over the final microstates – \(B_i\) or \(A^*_j\) in this case. The summing appears because "OR" in the final state means that we don't care which microstate is obtained and the probabilities in this kind of final "OR" should be summed. But when it comes to the initial state, the probabilities should be averaged over the initial microstates. (The difference between summing and averaging – the operations that take the final and initial microstates into account, respectively – is the ultimate source of all the arrows of time. The past differs from the future already because of the basic calculus of probabilities applied to statements about events in time. Everyone who claims that there's no arrow of time at the level of basic probability and that the asymmetry has to be artificially added by some engineering of the laws of physics – e.g. Sean Carroll – is a plain moron.) The averaging could be arithmetic but it could use some unequal weights, too. "OR" in the assumptions or the initial state means that the initial pie has to be divided into slices and the evolution of the slices has to be computed separately. The factor of \(1/N_{\rm initial}\) arises from the need to divide the initial pie of 100% of the probability.

OK, so the probability \(P(A\to B)\) is a sum over the \(A_i,B_j\) microstates with an extra factor of \(1/N_A\); for \(P(B^*\to A^*)\), the extra factor is \(1/N_{B^*} = 1/N_B\). The numbers \(N_A,N_B\) may be written as the exponentiated entropies, \(N_A = \exp(S_A/k_B)\) etc., and when the entropies of \(A,B\) are macroscopically different, \(N_A,N_B\) differ by a huge number of orders of magnitude. Probabilities cannot exceed one, so at most one of the two probabilities may be comparable to one; the other must be infinitesimal i.e. basically zero. The probability that may be comparable to 100% is the probability of the evolution from the smaller \(N_A\) to the larger \(N_B\) because the fraction \(1/N_A\) isn't suppressing the number so much; the reverse evolution is prohibited! That's a very general proof of the second law. The conclusion is that either the probability \(P(A\to B)\) is basically zero or \(N_A\leq N_B\).
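
To make the counting fully explicit, here is a toy sketch of my own (not something taken from Bousso's talk): the microscopic dynamics is a single random permutation of the microstates – deterministic and reversible – and the macroscopic probabilities are obtained by summing over the final microstates and averaging over the initial ones. The ratio of the forward and time-reversed probabilities comes out as exactly \(N_B/N_A = \exp[(S_B-S_A)/k_B]\):

```python
# Toy model of the time-reversal argument: reversible microscopic dynamics given by a
# fixed permutation of microstates; macro-probabilities sum over final microstates and
# average over initial ones.  (The time reversal of the states themselves is trivial here.)
import numpy as np

rng = np.random.default_rng(42)

N_A, N_B = 10, 100_000           # N_B >> N_A, i.e. S_B - S_A = k_B * ln(N_B/N_A) > 0
N = N_A + N_B
perm = rng.permutation(N)        # one step of the "microscopic dynamics"

A = np.arange(N_A)               # macrostate A: the first N_A microstates
B = np.arange(N_A, N)            # macrostate B: the remaining N_B microstates

# P(A -> B): average over the initial microstates of A, sum over the final ones in B
P_AB = np.isin(perm[A], B).sum() / N_A
# P(B* -> A*): the time-reversed dynamics is the inverse permutation
inv = np.argsort(perm)
P_BA = np.isin(inv[B], A).sum() / N_B

print(P_AB)                      # close to 1: the entropy-increasing step is allowed
print(P_BA)                      # of order N_A/N_B: the entropy-decreasing step is suppressed
print(P_AB / P_BA, N_B / N_A)    # the two ratios agree: exp[(S_B - S_A)/k_B]
```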

That's nice. Statistical physics has allowed us to demystify the principles of thermodynamics. These principles are suddenly mathematical implications of models we have constructed – a huge class of models (the proof is easily generalized to quantum mechanics, too). It's a great story from the history of physics.

With hindsight, was the inequality \(\Delta S\geq 0\) important? And what did it allow us to do? Well, I would say that the thermodynamical version of the second law – when it was an unquestioned principle – was useful mainly practically. It has saved lots of time for sensible practical people who could develop new engines instead of wasting time with the hopeless task of building a perpetuum mobile. Thermodynamics has been praised by Einstein as a principled counterpart of relativity – a theorists' invention par excellence. However, there's an equally good viewpoint that dismisses thermodynamics as a method of ultimate bottom-up phenomenologists if not engineers!

Those people were mostly practical men, not theorists. Did the principle help theorists to build better theories of Nature? I am not so sure. Well, people finally built statistical physics and understood the importance of the atoms and their entropy etc. But that progress didn't directly follow from the principles of thermodynamics. One may verify that the atomic hypothesis and statistical physics allow us to justify lots of previous knowledge from thermodynamics and other branches of physics. But you need to guess that there are atoms and you should count their microstates after an independent divine intervention. The principles of thermodynamics aren't a sufficient guide.

And if you only want to understand the laws of Nature "in principle", one could even make the extreme argument that you don't need the second law of thermodynamics at all. Without some understanding of the law, you would have no idea what you should build as an engineer etc. Well, your ignorance would be embarrassing and hurtful even for some folks who are much more theoretical than inventors building new engines. But it's still true that from an extreme theorist's perspective, the second law of thermodynamics is just one mathematical consequence of your laws of physics for the microstates – a consequence you don't need to know if you want to claim that you understand how Nature works in principle. (Just to be sure, I don't invite young readers to become theorists who are such extreme Fachidiots. Sometimes it's useful to know that there's some world around you.)

The second big inequality of well-established physics I want to mention is the uncertainty principle, e.g. in the form\[

\Delta X \cdot \Delta P \geq\frac{\hbar}{2}.

\] Using the apparatus of wave functions, that inequality may be proven – a more general one may be proven for any pair of Hermitian operators \(F,G\), with the (expectation value of their) commutator appearing on the right hand side in that general case. For \(X,P\), the inequality is saturated by Gaussian packets moved anywhere in the phase space. Again, the inequality may be understood in two historically different mental perspectives:
  • as a principle that tells us something deeply surprising and incompatible with the implicit assumptions of many people who had thought about physics up to that point
  • as a derived mathematical consequence of some laws after those laws become known.
These two stages are analogous to those of the second law of thermodynamics. That law was first found as a "moral guess" explaining the continuing failure of the (not yet?) crackpots who had been working on perpetuum mobile gadgets. Those inventors generally assumed that such a machine was possible but the principle says that it isn't. Here, almost all physicists had assumed that it was always possible, at least in principle, to measure the position and the momentum at the same time – but it isn't possible.

In its second role, the uncertainty principle is a derived mathematical fact that follows from some computations involving wave functions, their inner products, and matrix elements of linear operators. That's analogous to the H-theorem – the inequality is derived from something else. That "something else" is ultimately more important for practicing physicists. In particular, \(XP-PX = i\hbar\) is an identity that replaces the inequality above. This identity, a nonzero commutator, is more specific and useful than the inequality, although by some slightly creative thinking, one could argue that the nonzero commutator "directly" follows from the inequality.
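
To see the derived character of the inequality with your own eyes, here is a minimal numerical sketch (with \(\hbar=1\); the grid and the widths are my arbitrary choices): a Gaussian packet saturates \(\Delta X\cdot\Delta P = \hbar/2\) while a non-Gaussian packet ends up strictly above the bound.

```python
# Numerical check (hbar = 1; the grid and the packet widths are arbitrary choices)
# that a Gaussian wave packet saturates Delta X * Delta P = hbar / 2.
import numpy as np

hbar = 1.0
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

def uncertainty_product(psi):
    """Return Delta X * Delta P for a wave function sampled on the grid."""
    psi = psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)   # normalize
    prob = np.abs(psi)**2
    mean_x = np.sum(x * prob) * dx
    var_x = np.sum((x - mean_x)**2 * prob) * dx
    dpsi = np.gradient(psi, dx)                        # d(psi)/dx
    mean_p = (np.sum(np.conj(psi) * (-1j * hbar) * dpsi) * dx).real
    mean_p2 = np.sum(np.abs(hbar * dpsi)**2) * dx      # <p^2> = hbar^2 * integral of |psi'|^2
    return np.sqrt(var_x * (mean_p2 - mean_p**2))

sigma = 1.3
gaussian = np.exp(-x**2 / (4 * sigma**2))              # Delta X = sigma, Delta P = hbar/(2*sigma)
print(uncertainty_product(gaussian))                   # ~0.5 = hbar/2: the bound is saturated

bumpy = 1.0 / (1.0 + x**2)                             # some non-Gaussian packet
print(uncertainty_product(bumpy))                      # ~0.7 > 0.5: strictly above the bound
```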

Being skeptical about the value of inequalities since my childhood, I have gradually refined my attitude to similar claimed breakthroughs. If someone talks about some important new inequality, I want to know whether it is a postulated principle – that cannot be proven at this moment – or a derived mathematical fact. If it is a derived mathematical fact, I want to know what it is derived from, what are the most nontrivial assumptions that you need to make to complete the proof. These assumptions may be more important than the final inequality.

If it is claimed to be a postulated principle without a proof, I want to know what is the evidence, or at least what problems such an inequality would explain, and whether the inequality is at least partially canonical or unique, or whether it is similar to \(9+8\gt 10\). My general attitude is: Don't get carried away when someone tries to impress you with a new inequality. Inequalities may be cheap and non-unique.

The second law of thermodynamics and the uncertainty principle were examples of "valuable inequalities in well-established physics". The energy conditions arguably belong to that category, too. In the context of general relativity, the Null Energy Condition (NEC) is the most credible one. It makes sense to believe that \(T_{\mu\nu} k^\mu k^\nu \geq 0\) for any null vector \(k^\mu\). When some cosmological constant is nonzero, you probably need to add some terms, and when some entropy flows through some places, you need to fix it, too. Raphael knows that the NEC is the most appropriate one among the possible energy conditions. I think that good physicists generally agree here. The Weak and Strong Energy Conditions may superficially look natural but there are proofs of concept indicating that both may be violated.
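
To see what the NEC says in the simplest concrete case – this is my own illustration, not anything from the talk – contract the perfect-fluid stress tensor with a generic null vector in Minkowski space; the condition reduces to \(\rho+p\geq 0\):

```python
# The Null Energy Condition for a perfect fluid: T_{mu nu} k^mu k^nu = rho + p
# for any null vector k, so the NEC reduces to rho + p >= 0 in this simple case.
import sympy as sp

rho, p, theta, phi = sp.symbols('rho p theta phi', real=True)

g = sp.diag(-1, 1, 1, 1)      # Minkowski metric, signature (-,+,+,+)
T = sp.diag(rho, p, p, p)     # T_{mu nu} in the fluid's rest frame

# a generic null vector k^mu = (1, unit 3-vector)
n = sp.Matrix([sp.sin(theta)*sp.cos(phi), sp.sin(theta)*sp.sin(phi), sp.cos(theta)])
k = sp.Matrix([1, *n])

print(sp.simplify((k.T * g * k)[0]))   # 0: the vector is indeed null
print(sp.simplify((k.T * T * k)[0]))   # rho + p: the NEC demands this to be non-negative
```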

The vague concept of the energy condition is important because that's what may be linked to the stability of the vacuum – if energy density could be negative, clumps of positive- and negative-energy matter could spontaneously be born in the vacuum, without violating the energy conservation law, and that would mean that the vacuum is unstable. Related, almost equivalent, consequences would be traversable wormholes, tachyons, superluminal signals and influences, and so on. One may show the equivalence between these principles – or the pathologies that violate the principles – by thinking about some special backgrounds etc.

One may also think about more complex and more general backgrounds and situations and look for more complicated versions of "the" inequalities. What is "the" generalization of the inequalities for one case or another? Well, I am afraid that there isn't necessarily a good canonical answer. The set of inequalities in mathematics contains lots of useless gibberish such as \(9+8\gt 10\) and it seems that if "being a generalization of some other inequality" is your only criterion, you're still far more likely to find gibberish than something important.

When it comes to holography, we generally agree that the bound on entropy of the form\[

S \leq \frac{A}{4G}

\] is the most general and "deep" insight of that type. Jacob Bekenstein was essential in finding this kind of law. But he also found the other Bekenstein bounds. There were various products of energies and radii on the right hand side. These laws generally applied to static situations only. But were the laws true? And were they fundamental?
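
Just to put a number on the right hand side – the solar-mass example is my own choice and the factors of \(\hbar,c,k_B\) hidden by the natural units are restored – the bound \(A/4G\) for a solar-mass Schwarzschild black hole is an entropy of order \(10^{77}k_B\), vastly above the roughly \(10^{58}k_B\) of the Sun's ordinary thermodynamic entropy:

```python
# S = k_B * A / (4 * l_p^2) for a solar-mass Schwarzschild black hole (rounded SI constants).
import math

G = 6.674e-11      # m^3 kg^-1 s^-2
c = 2.998e8        # m/s
hbar = 1.055e-34   # J s
M_sun = 1.989e30   # kg

r_s = 2 * G * M_sun / c**2        # Schwarzschild radius, ~2.95 km
A = 4 * math.pi * r_s**2          # horizon area
l_p2 = hbar * G / c**3            # Planck length squared
S_over_kB = A / (4 * l_p2)

print(f"r_s = {r_s:.3e} m")       # ~2.95e3 m
print(f"S/k_B = {S_over_kB:.3e}") # ~1.0e77
```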

Well, I don't know the answer to the first question but what I can say is that I don't really care. If it's true that the entropy is never larger than some product of a radius and the energy defined in a certain way, well, then it's true. But I will only believe it if you give me some proof. And the proof will unavoidably be similar to a proof of a purely mathematical inequality, such as the ordering of the arithmetic and geometric means. And when something may be proven purely mathematically, there's just no physical beef. Some proposed inequalities may be proven to be true, others may be proven to be false. But both groups contain infinitely many inequalities and most of them aren't really special or insightful. So why should we care about them? They will remain mathematical technicalities. They can't become deep physical principles – those should be more verbal or philosophical in character.

Some two decades ago, Raphael Bousso began to produce the covariant entropy bounds. They were supposed to hold for any time-dependent background and the maximum entropy crossing a surface \(\Sigma\) could be bounded by some expressions such as \(A/4G\) for some properly, nontrivially chosen area \(A\), assuming that the surface \(\Sigma\) was null. Despite the fact that I think that the choice of the null slices proves Bousso's very good taste and is more likely to be on the right track than spacelike or timelike slices, I still feel that the non-uniqueness of such inequalities may be even more extreme than in the case of the assorted static "Bekenstein bounds", and I haven't ever cared even about those.

In all such cases, I want to know what are the most nontrivial assumptions from which such inequalities, assuming they are true, may be proven – in that case, I am more interested in these "more fundamental" assumptions than the inequalities themselves. And if the inequalities are sold as principles with consequences, I want to know what are the proposed consequences, why it's better to believe in the inequality than in its violation. So I want to know either some assumption or consequences of such inequalities that are already considered important in the physics research, otherwise the whole game seems to be a purely mathematical and redundant addition to physics – not too different from a solution to a particular and randomly chosen exam problem.

That seems important to me because lots of this "unregulated search for new principles" is nothing else than indoctrination. Penrose's Cosmic Censorship Conjecture is an example. It may be a rather interesting – perhaps mathematical – question about solutions to classical general relativity. But Penrose also offered us a potential answer, without a proof, at the same moment. And because he was so famous, people started to prefer his answer over its negation even though there was no truly rational reason for that attitude. With influences by famous people like that, physics may easily deteriorate into a religious cult. And the faith in the Cosmic Censorship Conjecture has been a religious cult of a sort. Even the weakest "flavors" extracted from the Cosmic Censorship Conjecture are considered false in \(D\gt 4\) these days.

The holographic and entropy bounds are supposed to be very important because they should lead us to a deeper, more holographic way to formulate the laws of quantum gravity. But is that hope justifiable? We saw that even in the case of the second law of thermodynamics where the relationship was actually correct, the principle of thermodynamics wasn't a terribly constructive guide in the search for statistical physics and the atomic hypothesis. In the case of the entropy bounds, we may expect that those won't be too helpful guides, either. On top of that, the very "sketch of the network of laws" may be invalid. The fundamental laws of quantum gravity may invalidate the entropy bounds in the Planckian regime and so on.

So it's possible that something comes out of these considerations but one must be careful not to get brainwashed. These covariant entropy bounds and similar ideas were supposed to lead to insights such as "entanglement is glue" – to the entanglement minirevolution in quantum gravity. But the historical fact seems to be that the entanglement minirevolution was started by very different considerations. As guides, the covariant entropy bounds etc. turned out to be rather useless.

One must be equally careful not to get brainwashed by a religious faith in the case of the Weak Gravity Conjecture (WGC), another inequality or a family of inequalities that I co-authored. Gravity is the weakest force, a general principle of quantum gravity seems to say. What it means is that we must be able to find (light enough) particle species whose gravitational attraction (to another copy of the same particle) is weaker than the electromagnetic or similar non-gravitational force between the two particles.
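
As a sanity check of the "gravity is the weakest force" statement in our universe – the electron is my choice of example and the constants are the standard SI values – a pair of electrons satisfies the requirement by a spectacular margin:

```python
# Ratio of the gravitational to the electrostatic force between two electrons.
# The separation r cancels because both forces scale as 1/r^2.
G = 6.674e-11     # m^3 kg^-1 s^-2
k_e = 8.988e9     # Coulomb constant, N m^2 C^-2
m_e = 9.109e-31   # electron mass, kg
e = 1.602e-19     # elementary charge, C

ratio = (G * m_e**2) / (k_e * e**2)
print(f"F_gravity / F_Coulomb = {ratio:.2e}")   # ~2.4e-43, vastly smaller than one
```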

There are reasons to think it's true that are "principled" – and therefore analogous to the "non-existence of the perpetuum mobile of the second kind" or to the "absence of tachyons and traversable wormholes" in the case of the energy conditions. Among them, you find the required non-existence of remnants and the need for extremal black holes to be able to Hawking evaporate, after all. And there are also ways to argue that the Weak Gravity Conjecture is true because it's a derived fact in some stringy vacua – although these proofs are less rigorous at this moment, they're analogous to the derivation of the H-theorem encoding the second law of thermodynamics.

We would like to know the most exact form of the WGC for the most general vacuum of quantum gravity. And we would also like to find the theory (probably a new description of string theory) that makes the validity of this inequality manifest. So far, the proofs of WGC may exist within families of string vacua (or descriptions) but the proof heavily depends on the description.

I think it's fair to say that – unless I missed something – there is no solid reason to think that there exists "the" canonical form of the most general WGC-style inequality. The existence of a unique inequality is wishful thinking. Lots of inequalities may still be true but they may resemble \(8+9\gt 10\). So people must be warned. All of it looks very interesting but you may end up looking for a holy grail that doesn't exist. Well, it may exist but I can't guarantee (prove) it for you.

And even if we understood the most general form of WGC and what it implies for many vacua, would it help us to find the deeper formulation of string/M-theory? This statement is also uncertain. String theory seems to be more predictive than effective quantum field theories where WGC may apparently be easily violated. But effective QFTs probably mistreat the black hole interior and other things. Maybe if you just require some higher spacetime consistency, the WGC may follow – directly from some refined pseudo-local spacetime treatment. There have been lots of interesting papers linking the validity of WGC to other, seemingly totally different inequalities – well, including the aforementioned Cosmic Censorship Conjecture.

Many of us still feel that some very deep insights could be waiting for those who "play" with very similar ideas but this belief isn't demonstrated and it may turn out to be false, too. I want people to think hard about it but only if they realize that no one can promise them that such a search will lead to a breakthrough. Even if someone found a real breakthrough while playing with the WGC, I wouldn't take credit for it because it could be a coincidence what the person happens to be playing with "right before" she makes the big new discovery.

In the end, even if the thinking about the WGC could help you to think about the "right type of questions" – how is it possible that string theory imposes this constraint that effective QFTs seem to be indifferent to – there are probably other and perhaps more direct ways, completely avoiding the WGC, to get to the deeper principles. There have been ways to formulate quantum mechanics without thinking about the Heisenberg inequality, too. After all, Heisenberg wrote down quantum mechanics in 1925 and the inequality was only pointed out in 1927 – two hours later! (OK, without the warp speed, it was two years, thanks to Paul.) So by basic chronology, the inequality couldn't have been too useful in the search for the new laws of modern physics – quantum mechanics. At the level of chronology, the example of the uncertainty principle is different from the example of the second law.

When we generalize these thoughts a little bit more, it seems reasonable that bright people who play with these and similar ideas are more likely to make a breakthrough. But the inequalities such as generalized energy conditions, generalized holographic bounds, and weak gravity and similar conditions are just players that you may use for orientation, so as not to get lost in the spacetimeless realm without any fixed point. And there's no "really strong evidence" supporting the belief that playing with such inequalities will be very helpful. It might be that most of the work spent on games like that will be analogous to the purely mathematical efforts designed to prove mathematical inequalities such as the inequality between the arithmetic and geometric means.

In the end, what we really want are the truly fundamental new principles and I think that inequalities can't be the new principles of full-blown (e.g. constructive) theories.

And that's the memo.
