Saturday, May 23, 2009

Shocks in mathematics and physics

Edward Witten recently posted a written version of his talk at the 2008 Raoul Bott celebration.

Its goal is to explain to mathematical audiences why the Langlands duality becomes a pretty much inevitable - if not obvious - fact in the context of field-theoretical limits of string theory.

That's a good opportunity to say a few words about the large differences between mathematicians and physicists in their appraisal of surprise.

The Langlands duality relates, among other things, properties of the root lattice of one group and the weight lattice of another group. It implies the existence of lots of "independent" relationships between various mathematical structures whose "a priori" probability to hold would be essentially zero. So they're very shocking.

But are they? Well, it depends on how you calculate the probabilities. Once you show that all these statements are consequences of one unifying principle, it follows that they're not really independent. Because they're not independent, the probability that they hold simultaneously is not a large power of a small number: it's just one small number - which is not too small. We have encountered this issue in a recent discussion about confidence levels in physics initiated by David Berenstein.
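The difference between the two ways of counting can be sketched numerically. The values below (a probability of 0.1 per statement, 20 statements) are purely illustrative assumptions, not numbers from the Langlands program, and the function names are made up for this sketch.

```python
def joint_probability_if_independent(p_single: float, n_statements: int) -> float:
    """Probability that n statements all hold if each is an independent 'miracle'."""
    return p_single ** n_statements


def joint_probability_if_unified(p_principle: float) -> float:
    """Probability that all statements hold if they all follow from one principle.

    Once the principle holds, every consequence holds, so the count of
    statements no longer matters.
    """
    return p_principle


# Illustrative (assumed) numbers only.
p = 0.1
n = 20
print(joint_probability_if_independent(p, n))  # a large power of a small number
print(joint_probability_if_unified(p))         # just one small number
```

With these assumed inputs, the "independent" estimate is smaller than the "unified" one by nineteen orders of magnitude, which is the whole source of the apparent shock.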

And indeed, the Langlands duality is a reflection of one principle, the S-duality in field theory.

One can give all the mathematical quantities in the Langlands duality a physical interpretation. Once it is done, all the statements follow from the so-called S-duality, the statement that one gauge theory with the coupling constant "g" is equivalent to another gauge theory (or the same one) with the coupling "1/g". They are really one gauge theory, in the physical sense.

The S-duality is a generalization of the electromagnetic duality in four dimensions. Equations of electromagnetism can be written in the relativistic form, using the antisymmetric tensor "F_{mn}". In four dimensions, it is possible to reshuffle this tensor into its dual, "F*_{mn}", by contracting it with the antisymmetric tensor "epsilon_{mnpq}". If we allow the magnetic monopole density and currents to be nonzero, the equations of electromagnetism exhibit a symmetry between electricity and magnetism: the form of the equations is pretty much identical if "F_{mn}" and "F*_{mn}" are interchanged, together with the electric and magnetic currents.
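For concreteness, the duality can be written out as follows. This is a sketch in one common convention: the overall signs and normalizations differ between textbooks, and the current labels "e" and "m" are notation introduced here, not taken from the text.

```latex
% Dual field strength: a contraction with the epsilon tensor
F^{*}_{mn} = \tfrac{1}{2}\,\epsilon_{mnpq}\,F^{pq}

% Maxwell's equations with electric and magnetic currents
\partial^{m} F_{mn} = j^{\mathrm{e}}_{n}, \qquad
\partial^{m} F^{*}_{mn} = j^{\mathrm{m}}_{n}

% Duality transformation (in Lorentzian signature, dualizing twice gives a minus sign)
F_{mn} \to F^{*}_{mn}, \quad F^{*}_{mn} \to -F_{mn}, \quad
j^{\mathrm{e}}_{n} \to j^{\mathrm{m}}_{n}, \quad j^{\mathrm{m}}_{n} \to -j^{\mathrm{e}}_{n}
```

Applying the transformation to the first equation produces the second one, and vice versa, which is the sense in which the form of the equations is unchanged.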

To show this symmetry for ordinary electromagnetism is trivial: we are only reshuffling the components of a tensor, after all. However, this symmetry has generalizations in the non-Abelian, Yang-Mills case, where it must be accompanied by an appropriate modification of the gauge group. While "SU(N)" is pretty much mapped to the same "SU(N)" (the "A_{l}" algebras are self-dual), up to discrete subtleties, "O(2N+1)" is mapped to "USp(2N)" ("B_{l}" is switched with "C_{l}"), and so on.

These observations can be argued to hold, and one can arguably construct a full proof using one of many available strategies. S-duality reduces the huge number of a priori independent statements about the Langlands duality to a much smaller set of independent statements. This smaller set may still look too large. Can we keep on going and unify it?

Yes, we can. S-duality of 4D gauge theories results from the very existence of some 6D local quantum field theories.

In the middle-to-late 1990s, string theorists realized that local, finite, interacting quantum field theories exist above four dimensions. Before that, it was generally believed that because no superficially renormalizable interactions can be added to the Lagrangians above "d=4", interacting theories above this threshold must inevitably be sick.

It turned out that the assumption was incorrect. If we want to obtain local field theories from string theory, we must get rid of gravity because gravity always brings some degree of non-locality (think e.g. about the tunneling of information from the black hole interiors). We say that "gravity must be decoupled".

If you study dynamics of D3-branes in type IIB string theory, you may take the limit in which gravity and massive string states are decoupled. What you're left with is N=4 supersymmetric Yang-Mills theory in four dimensions. We can write it in the Lagrangian form, too.

However, M-theory contains M5-branes and one can still take a similar limit of "N" coincident M5-branes (complicated dynamics!) and decouple gravity. Because analogous proofs still exist, assuming well-known general properties of objects in string theory, the result of this limiting procedure must be a 6-dimensional local quantum field theory.

It has a stress-energy tensor and other local operators that have correlators much like the N=4 d=4 Yang-Mills theory's operators do. But the six-dimensional theory has no classical, Lorentz-invariant Lagrangian. Despite this difference, much like the N=4 gauge theory in d=4, it exists and it has 16 supercharges. The minimal spinor in 5+1 dimensions has 8 real components, so this supersymmetry has 2 minimal units of supersymmetry. In 5+1 dimensions, the left-handed and right-handed components of SUSY can be added independently, not being the CPT conjugates of each other, so with 16 supercharges, we must still distinguish (1,1) and (2,0) supersymmetry. The new field theories in "d=6" happen to be (2,0) theories. That's also the SUSY algebra of type IIA (i.e. (1,1) SUSY in d=10) NS5-branes - the M5-branes can be obtained as a limit of these IIA NS5-branes - while the SUSY algebra of the NS5-branes in type IIB (i.e. (2,0) SUSY in d=10) is of the (1,1) form.

These exotic six-dimensional theories have their holographic, AdS_{7} duals, they appear as matrix models for M-theory on T^5, and they have many other important applications. Clearly, we want to discuss another important application - their ability to explain S-duality.

If you compactify a (2,0) theory on a tiny two-torus with shape (complex structure) given by a complex number, "tau", you obtain a "d=4" supersymmetric gauge theory whose complexified coupling constant is "tau". Because different values of "tau", the shape of the torus, are physically identical if they are related by SL(2,Z) modular transformations, you can immediately see that four-dimensional maximally supersymmetric gauge theories have an SL(2,Z) S-duality - a duality identifying theories with different values of the coupling "tau".
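In formulas, the statement can be sketched as follows. The normalization of the complexified coupling (the factors of 2 pi and 4 pi) is a common convention assumed here, not fixed by the text, and the full SL(2,Z) acts in this simple way for simply-laced gauge groups.

```latex
% Complexified gauge coupling (theta angle and coupling constant g)
\tau = \frac{\theta}{2\pi} + \frac{4\pi i}{g^{2}}

% Modular transformations that leave the shape of the torus unchanged
\tau \to \frac{a\tau + b}{c\tau + d}, \qquad
a, b, c, d \in \mathbb{Z}, \quad ad - bc = 1

% Generators: T shifts the theta angle, S inverts the coupling
T:\ \tau \to \tau + 1, \qquad S:\ \tau \to -\frac{1}{\tau}

% At theta = 0, S acts as strong-weak duality
\frac{g^{2}}{4\pi} \to \frac{4\pi}{g^{2}}
```

The last line is the precise version of the schematic exchange of the coupling "g" with "1/g".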

With the existence of the (2,0) theory taken as a fact, the S-duality of gauge theories becomes trivial. Now, you may obviously keep on climbing "up" the explanatory ladder. The (2,0) theories exist because string/M-theory does and because the latter has certain basic properties. Even in mathematics, string/M-theory is the ultimate primordial explanatory "idea" behind a majority of important propositions.

By the way, you could also say that D3-branes in type IIB string theory are objects in a 12-dimensional "F-theory" (in the 12D sense) that is compactified on a similar two-torus, and the SL(2,Z) action on the two-torus would be equally manifest. However, F-theory, while a more "fundamental" starting point, is more formal and probably harder to prove to exist than the (2,0) theory is. It also contains many objects and phenomena that are unnecessary if you're only interested in S-dualities of d=4 gauge theories.

More generally, we can ask: Does string/M-theory with the basic desired properties exist? Can you prove it? Well, everything indicates that it does exist but there is no "fully rigorous proof" of the existence of the entire structure with all the desired properties. On the other hand, its existence looks virtually unquestionable, given the available circumstantial evidence. It is not "much less likely" than the validity of any of the partial, modest statements about the Langlands duality. And it is vastly more powerful.

I find it obvious that if two facts seem equally likely, or maybe even equally provable, but one of them is incomparably more powerful, then the more powerful one is more interesting and should be studied more intensely.

Mathematicians often disagree with this viewpoint. The primary reason is that they really distinguish "p=0", "0<p<1", and "p=1". They don't divide the intermediate group finely.

What do I mean? If mathematicians prove or disprove something, things become settled. But if they don't, the issue is completely uncertain for them. In other words, mathematicians don't like to collect partial evidence. They hate it. They only love complete proofs. It also means that all "unifying structures" remain "heuristic speculations" for them until they are proven or disproven.

Scientists can't possibly operate in this way because almost every proposition they have ever studied has the probability to hold somewhere between 0 and 1 - and it is very important how close to 0 or 1 it actually is. If they decide that the probability of XY is "p=0.9999", arguments based on the assumption that XY holds are not called "heuristic". Instead, they're "very solid scientific arguments".

I find it obvious that the scientific treatment - with different values of "p" which actually influence which things we study and how - is the superior operational strategy to reach the right answers to many questions. Mathematicians must know this. But for some reason, their goal is really not to find the right opinions about as many questions as possible. They are only interested in finding answers they can be 100% certain about. It's a somewhat dogmatic attitude, especially if you realize that mathematicians are fallible, too.

Note that this dogmatic attitude is largely independent of the fact that mathematicians study an artificial world of invented concepts rather than the real world. Even if you study artificially invented concepts, you could still adopt the scientists' approach and carefully distinguish different degrees of probability. In the context of mathematics, this approach is called "heuristic mathematics". For some reason, it is pursued by a very small number of professionals.

There are two "major" groups - scientists who carefully distinguish different confidence levels, and mathematicians who don't. In the previous paragraph, we mentioned a third group, mathematicians who do distinguish different confidence levels. Finally, there also exists a fourth group, scientists who only care about rigorous results. That's a somewhat inconsistent combination because statements about the real, messy world can't ever be "quite certain". The people calling themselves "mathematical physicists" are perhaps the closest thinkers to this description. However, the price they pay for their rigor is that the objects they study are often just inspired by physics: they're not full-fledged physical systems per se.

There is one more fact that should be mentioned: probabilities that hugely general, powerful propositions hold are nonzero in general.

If you find four positive integers "x,y,z,n" such that
x^n + y^n = z^n,
you will notice that "n=1" or "n=2". It will hold for the first 500 examples you find. Will you try to look for more examples? You may also lose your patience and decide that it is hopeless to look for any counterexamples where "n" is greater than two.

In fact, you had better lose your patience before it's too late. ;-)
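The experiment described above can be sketched as a brute-force search. The search bounds (bases up to 60, exponents up to 6) are arbitrary choices made for this sketch, and the function name is hypothetical.

```python
def fermat_solutions(max_base: int, max_exponent: int):
    """Find all (x, y, z, n) with x^n + y^n = z^n in a small search box.

    Only x <= y is listed, and x, y, z all range over 1..max_base.
    """
    solutions = []
    for n in range(1, max_exponent + 1):
        # Map each n-th power back to its base for a fast lookup of z.
        nth_powers = {z ** n: z for z in range(1, max_base + 1)}
        for x in range(1, max_base + 1):
            for y in range(x, max_base + 1):
                z = nth_powers.get(x ** n + y ** n)
                if z is not None:
                    solutions.append((x, y, z, n))
    return solutions


found = fermat_solutions(max_base=60, max_exponent=6)
exponents_seen = sorted({n for (_, _, _, n) in found})
print(len(found), "solutions; exponents that occur:", exponents_seen)
# The exponents that occur are [1, 2], as Fermat's Last Theorem guarantees.
```

Enlarging the box makes the search slower but never adds a new exponent, which is exactly why the searcher eventually loses patience.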

As you know very well, there are no counterexamples with "n" greater than two. This bold claim is nothing other than Fermat's Last Theorem, which has already been proven. Let's psychologically return to those years before it was proven. How would we estimate the probability that a counterexample would be found?

The theorem had been proved for individual exponents (and, automatically, for all of their multiples, simply because y^{pq} = (y^p)^{q}) for several centuries. It had been known that no counterexample could have existed for "n=3", "n=4" (the simplest proof!), and so on. The statement was known to hold for "n" equal to hundreds. But couldn't there exist a counterexample?
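The remark about multiples can be spelled out in one line: a hypothetical solution with a composite exponent would automatically give a solution with each of its factors as the exponent.

```latex
% A solution for the exponent pq ...
x^{pq} + y^{pq} = z^{pq}
% ... is also a solution for the exponent p, with the bases x^q, y^q, z^q:
\quad\Longrightarrow\quad
(x^{q})^{p} + (y^{q})^{p} = (z^{q})^{p}
```

So once the exponent "p" is excluded, all of its multiples are excluded as well, and it is enough to consider "n=4" and the odd prime exponents.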

If you decided that the validity of the partial Fermat's Last Theorem for each prime exponent were an independent question, it would be clear that you would have to view its validity for "n=3", "n=4", "n=5", "n=7" etc. as unlikely and independent miracles. They wouldn't change anything about the possibility that the theorem could break down e.g. for "n=2^{43,112,609}-1", the largest known prime. A counterexample could exist.

You could also take the attitude that if 100 exponents were verified to obey Fermat's Last Theorem, the probability that a random prime exponent violates it is comparable to 1%. You could think that about 100 additional exponents would obey the theorem and then, a counterexample would probably be found.

But you must realize that among those 100 exponents, there had been no counterexample, so the naively calculated probability was still 0/100=0. The "measured" probability that a random exponent violates the theorem was between 0% and 1%, if you wish.
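The naive estimate can be made slightly more quantitative. The sketch below treats each verified exponent as an independent trial with a common violation probability, which is exactly the naive assumption under discussion. Laplace's rule of succession and the 95% confidence level are standard choices brought in here for illustration, not something taken from the text.

```python
def rule_of_succession(violations: int, trials: int) -> float:
    """Laplace's estimate of the violation probability after `trials` checks."""
    return (violations + 1) / (trials + 2)


def upper_bound_zero_violations(trials: int, confidence: float = 0.95) -> float:
    """Largest violation probability p still compatible with seeing no violations.

    Solves (1 - p) ** trials = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / trials)


checked = 100  # number of exponents verified, as in the text
print(rule_of_succession(0, checked))        # close to 1%
print(upper_bound_zero_violations(checked))  # close to 3%
```

Both numbers are of order 1%, but neither estimate can ever return exactly zero. That limitation comes from the assumed smooth prior, not from the data.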

Now, imagine that Fermat's Last Theorem is reinterpreted as a question in natural sciences - or that you find an analogous situation in science. 

Many people around physics who like to ignore mathematics often believe that the probability must strictly be greater than zero because no fully certain, "p=0" or "p=1" propositions exist in science.

Except that they do. It is always possible that a finite but large number of "hints" that something holds generally actually implies that the thing holds generally. If you wish, the probability distributions for probabilities of scientific propositions (and it's no mistake that there are two probabilistic words embedded into each other here!) can always have a "Bose-Einstein condensate" or a "delta-function component" located at "p=0" or "p=1". So "p=0" or "p=1" can appear with a nonzero probability.
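A minimal Bayesian sketch shows how such a delta-function component behaves. Here "p" is the probability that a random exponent violates the theorem. The prior is assumed to be a mixture of a point mass at "p=0" and a uniform distribution on the interval from 0 to 1. The prior weight of 0.5 and the uniform shape are illustrative assumptions only.

```python
def posterior_mass_at_zero(prior_mass_at_zero: float, n_verified: int) -> float:
    """Posterior weight of 'p = 0 exactly' after n_verified checks with no violation.

    Prior: a point mass at p = 0 plus (1 - prior_mass_at_zero) spread
    uniformly over 0 <= p <= 1 (assumed shape).
    """
    # If p = 0 exactly, observing no violations is certain.
    likelihood_delta = 1.0
    # For the uniform part, the integral of (1 - p)^n over [0, 1] is 1 / (n + 1).
    likelihood_smooth = 1.0 / (n_verified + 1)
    numerator = prior_mass_at_zero * likelihood_delta
    denominator = numerator + (1.0 - prior_mass_at_zero) * likelihood_smooth
    return numerator / denominator


for n in (0, 10, 100, 1000):
    print(n, posterior_mass_at_zero(0.5, n))

# Without a delta-function component, no amount of evidence creates one.
print(posterior_mass_at_zero(0.0, 1000))
```

With the assumed prior weight of 0.5, a hundred verified exponents push the weight of "the theorem holds exactly" to 101/102, about 99%. With a prior weight of zero it stays at zero forever, which is the dogma criticized here.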

That's a consequence of the fact that no related or adjacent objects in mathematics or science can ever be thought of as being "completely independent" or "completely unrelated". Everyone and everything has a "common ancestor", in some very general sense. Consequently, probabilities of assertions about them can never be thought of as "quite independent", either. It is a priori completely unknown whether two confusing propositions are "completely unrelated" or "relatively unrelated" or "virtually equivalent".

If a general scientific or mathematical statement seems to be supported by very many pieces of evidence, we should always admit the possibility that the statement actually holds in the full generality and accuracy - unless we have strong rational evidence (referring to the technical features of the given situation, not some completely general or vague philosophical preconceptions) that it is not the case.

Theorems often exist. General propositions often hold. And if they do, it is often (or always) the case that a proof must exist. And if a proof exists, humans have a chance to find it at some point. They have already found many proofs in the past.

I feel that this basic ability to generalize - and the modest requirement that people accept at least the possibility that certain principles and propositions can hold generally and completely accurately - is largely absent among the public, including parts of the scientific public.

This sociological observation seems to be demonstrated by the unwillingness of so many people to accept the very possibility that the quantum mechanical postulates hold exactly, much like the local rules of special relativity. That's just too bad because these people, in effect, believe in the dogma saying that "there are no laws in Nature" and "no amount of evidence can ever change that". I am sure that they are profoundly wrong.

And that's the memo.
