Tuesday, February 16, 2010

Unitarity, absence of anomalies, locality, causality, Lorentz invariance: consistency conditions

Nature depends on several paramount conditions that are needed for Her to operate at the basic level. We may call them "consistency conditions."

They have to be satisfied by all remotely realistic physical theories describing our world - or any world that at least qualitatively resembles ours - otherwise the theory instantly runs into serious problems.

Although the list is not really complete, I will be talking about the inevitability of several of them, namely
  • unitarity
  • absence of anomalies
  • locality
  • causality
  • Lorentz invariance,
as well as various relationships between them and their implications.


Unitarity

In linear algebra, unitarity is a property of a transformation, or a matrix, "U", that generalizes the "orthogonality" of real transformations to the case of vector spaces with complex coordinates. Such a matrix has to satisfy
$$U^\dagger U = 1.$$
Why is it needed in any quantum mechanical theory? It's because the squared complex amplitudes - coefficients in front of individual basis vectors, or states - determine the probabilities that the particular state will be seen. The sum of these probabilities is simply
$$\mathrm{Prob} = v^\dagger v$$
where "v" is the final state vector. But the total probability of all mutually exclusive possibilities must be equal to one. And it indeed does. Assuming that the initial vector "u" is normalized and the initial and final vectors are linked by the unitary evolution operator "U", the total probability will be equal to one:
$$\mathrm{Prob} = v^\dagger v = (Uu)^\dagger Uu = u^\dagger U^\dagger U u = u^\dagger 1\, u = u^\dagger u = 1.$$
A unitary evolution operator is really the only way to get such a general result for a generic initial state. So the S-matrix (describing scattering) has to be unitary. And because the evolution operators for very short time intervals have to be unitary as well, we may deduce that the Hamiltonian has to be Hermitean whenever it exists (because the logarithm of a unitary matrix has to be "i" times Hermitean).
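A quick numerical sanity check of this argument may help. The sketch below is only an illustration: the random 4x4 Hamiltonian, the time `t`, and the damping `gamma` are arbitrary choices made for this example. It builds the evolution operator from a Hermitean generator, confirms that the total probability stays equal to one, and shows what happens when an anti-Hermitean piece is added.

```python
import numpy as np

def evolution_operator(H, t):
    """Return U = exp(-i H t) for a Hermitian matrix H, built from its eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    return vecs @ np.diag(np.exp(-1j * energies * t)) @ vecs.conj().T

rng = np.random.default_rng(0)
dim, t, gamma = 4, 0.7, 0.3  # illustrative values only

# Random Hermitian "Hamiltonian" (a toy, not a physical model).
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (A + A.conj().T) / 2

U = evolution_operator(H, t)
unitarity_defect = np.linalg.norm(U.conj().T @ U - np.eye(dim))

# Normalized initial state u and evolved state v = U u.
u = rng.normal(size=dim) + 1j * rng.normal(size=dim)
u = u / np.linalg.norm(u)
v = U @ u
total_probability = np.vdot(v, v).real

# Non-Hermitian generator H - i*gamma*1. The identity commutes with H,
# so exp(-i (H - i gamma) t) = exp(-i H t) * exp(-gamma t).
U_bad = U * np.exp(-gamma * t)
v_bad = U_bad @ u
bad_probability = np.vdot(v_bad, v_bad).real

print("unitarity defect:", unitarity_defect)
print("total probability (Hermitian H):", total_probability)
print("total probability (non-Hermitian H):", bad_probability)
```

With the Hermitean generator the total probability stays at one up to rounding errors; with the extra anti-Hermitean term it decays as exp(-2 gamma t).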

The energy's being a real number is linked to the conservation of the total probabilities.
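To see this link in the simplest possible case, here is a short worked example. The symbols $E_0$ and $\Gamma$ are real numbers assumed only for this sketch, and $\hbar$ is kept explicit. Suppose an energy eigenstate had a complex eigenvalue $E = E_0 - i\Gamma/2$. Then

$$\psi(t) = e^{-iEt/\hbar}\,\psi(0) = e^{-iE_0 t/\hbar}\, e^{-\Gamma t/(2\hbar)}\,\psi(0),$$

$$\|\psi(t)\|^2 = e^{-\Gamma t/\hbar}\,\|\psi(0)\|^2.$$

The total probability stays equal to one for all times only if $\Gamma = 0$, i.e. only if the energy is real.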

This setup is necessary for any quantum theory to be compatible with the rules of logic. This unitarity requirement has two main aspects:

  • the Hilbert space must be positive definite (or at least semidefinite): no negative-normed vectors ("bad ghosts") are allowed in the physical Hilbert space because the probabilities of ordinary observable events can't be negative
  • it must be possible to prove that the S-matrix, or evolution operators, are unitary, so e.g. the Hamiltonian and similar evolution operators have to be linear and Hermitean whenever they exist
There are many reasons why something can go wrong. Various modifications of the basic framework of physics immediately lead to lethal problems - although the "speed of the doom" may differ. For example, if one denies that the evolution operators have to be linear Hermitean operators, the squared norms won't be preserved.

You might think that such a problem can be fixed in an ad hoc way. Can't one modify the formulae for calculating the probabilities from the quantum equations? In particular, can't you divide all the predicted probabilities by their sum, so that their "renormalized" sum is equal to one?

Well, you might define such a theory. But all theories of this kind will be nonlocal, and the probabilities of observations that will actually be made will depend not only on observations in faraway places, but even on the possible observations that have actually never materialized.
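Here is a toy sketch of how such a nonlocality can show up. Everything in it is an assumption made for illustration: two qubits labeled "Alice" and "Bob", a maximally entangled initial state, and a particular non-unitary operator `K` acting on Bob's qubit only. With an honest unitary operation on Bob's side, Alice's probabilities don't change. With a non-unitary operation followed by the ad hoc division by the total, they do.

```python
import numpy as np

# psi[a, b] is the amplitude for Alice's qubit in state a and Bob's in state b.
# Start from the entangled state (|00> + |11>) / sqrt(2).
psi = np.zeros((2, 2), dtype=complex)
psi[0, 0] = psi[1, 1] = 1 / np.sqrt(2)

def apply_on_bob(state, K):
    """Apply a 2x2 operator K to Bob's qubit only: new[a, b'] = sum_b K[b', b] * state[a, b]."""
    return state @ K.T

def alice_prob_zero(state, renormalize=False):
    """Probability that Alice finds her qubit in state 0, optionally after dividing by the total."""
    probs = np.abs(state) ** 2
    if renormalize:
        probs = probs / probs.sum()
    return probs[0, :].sum().real

# A unitary rotation on Bob's side (the angle is arbitrary).
theta = 0.9
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# A non-unitary operator on Bob's side (hypothetical, chosen for illustration).
K = np.diag([1.0, 0.5])

p_before = alice_prob_zero(psi)
p_unitary = alice_prob_zero(apply_on_bob(psi, R))
p_nonunitary = alice_prob_zero(apply_on_bob(psi, K), renormalize=True)

print("Alice P(0), nothing done by Bob:          ", p_before)
print("Alice P(0), Bob applies a unitary:        ", p_unitary)
print("Alice P(0), Bob applies K + renormalizing:", p_nonunitary)
```

In this toy model, Alice's probability jumps from 0.5 to 0.8 just because of what was done to Bob's qubit, however far away he is.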

For locality, whose importance will be discussed later, it's important that the effective "field operators" evolve according to local equations, so there can't be any ad hoc extra rescalings of the wave function that would depend on the value of the wave function in the whole Universe. While such a global method could be acceptable for the Big Government advocates, it is unacceptable for Nature.

The probabilities of an observation must only depend on your knowledge of the things inside the past light cone of an event - we will discuss these matters later - which prevents you from tampering with the basic framework of "unitary operators determining the evolution" in any way.

Absence of gauge anomalies

One of the diseases that can threaten unitarity is a violation of gauge symmetries. By gauge symmetries, I mean any transformations whose parameters depend on the position in time. If they do depend on time in a relativistic theory, they usually have to depend on space, too - because in relativity, what holds for time has to hold for space, too.

The most important examples of gauge symmetries are the U(1) electromagnetic symmetry, its non-Abelian or Yang-Mills generalizations, and the diffeomorphism (coordinate reparameterization) symmetry of general relativity.

While they may have seemed "optional" in the classical theory, they become paramount in the quantum theory. Why? It's because they're needed to get rid of bad ghosts - states with negative norms (that would lead to predictions of negative probabilities in most processes). How do these bad ghosts arise?

Well, in a Lorentz-invariant theory, spacelike polarizations of the photons, gauge bosons, or gravitons are inevitably accompanied by timelike polarizations, too. However, the norm of a polarization depends on the sign of the entry in the metric tensor.

And because the sign of the time-time component in the metric tensor differs from the sign of the space-space component (which of them is positive and which of them is negative i.e. whether it's -+++ or +--- is a convention), it's clear that only one of the types of polarizations (spacelike or timelike) may have a positive norm (which is my convention for the correct sign that leads to positive probabilities).

If the spatial ones were the wrong ones, there would be "too many ghostly polarizations" and you couldn't ever get rid of them. Clearly, if the number of spatial dimensions exceeds the number of time directions, the "wrong" polarizations must be the time-like ones.
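The sign argument can be made explicit in a schematic single-mode sketch. It assumes the covariant (Gupta-Bleuler-style) quantization, the mostly-plus convention $\eta_{\mu\nu} = \mathrm{diag}(-1,+1,+1,+1)$, and it suppresses the momentum labels and delta functions. The oscillators satisfy

$$[a_\mu, a_\nu^\dagger] = \eta_{\mu\nu}, \qquad a_\mu |0\rangle = 0,$$

so the one-quantum states have the norms (no sum over $\mu$)

$$\big\| a_\mu^\dagger |0\rangle \big\|^2 = \langle 0 | a_\mu a_\mu^\dagger | 0 \rangle = \eta_{\mu\mu} = \begin{cases} -1 & \mu = 0, \\ +1 & \mu = 1,2,3. \end{cases}$$

The three space-like polarizations come out with the positive norm and the single time-like one is the bad ghost.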

Wrong time-like polarizations

They - quanta generated by the time-like creation operator composed of the Fourier modes of "A_0" in electromagnetism, or its generalizations in Yang-Mills theory or Einstein's gravity - can be eliminated from the spectrum by requiring that all physical states have to be annihilated by all the generators of gauge symmetries: they have to be invariant under the gauge transformations.

This requirement means that the ghosts are just declared unphysical. But why do the generators that annihilate them have to be symmetries? Well, it's because if you guarantee that the ghosts are absent in the initial state, dynamics must also be forbidden from creating them by the evolution. Ghost-free states must lead to ghost-free states. In other words, the generators defining the absence of ghosts must commute with the Hamiltonian which means that they're symmetries of the system.
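The logic of "ghost-free states must lead to ghost-free states" can be checked in a finite-dimensional toy model. The 4x4 matrices below are made up for this sketch: `G` stands in for a gauge generator whose kernel is the "physical" subspace, `H_good` commutes with `G`, and `H_bad` doesn't.

```python
import numpy as np

def evolve(H, psi, t):
    """Evolve psi by exp(-i H t) for a Hermitian matrix H."""
    energies, vecs = np.linalg.eigh(H)
    U = vecs @ np.diag(np.exp(-1j * energies * t)) @ vecs.conj().T
    return U @ psi

# Toy "gauge generator": physical states are those annihilated by G,
# i.e. vectors supported on the first two basis states only.
G = np.diag([0.0, 0.0, 1.0, 1.0])

# Block-diagonal Hamiltonian: commutes with G.
H_good = np.array([[1.0, 0.5, 0.0, 0.0],
                   [0.5, 2.0, 0.0, 0.0],
                   [0.0, 0.0, 3.0, 0.2],
                   [0.0, 0.0, 0.2, 1.5]])

# Same Hamiltonian with an extra term mixing the two blocks: no longer commutes with G.
H_bad = H_good.copy()
H_bad[1, 2] = H_bad[2, 1] = 0.4

def commutator_norm(A, B):
    return np.linalg.norm(A @ B - B @ A)

def unphysical_part(psi):
    """Norm of G psi; zero exactly when psi is annihilated by G."""
    return np.linalg.norm(G @ psi)

psi0 = np.array([1.0, 0.0, 0.0, 0.0], dtype=complex)  # physical initial state
t = 0.5

leak_good = unphysical_part(evolve(H_good, psi0, t))
leak_bad = unphysical_part(evolve(H_bad, psi0, t))

print("[G, H_good] norm:", commutator_norm(G, H_good), " leak:", leak_good)
print("[G, H_bad]  norm:", commutator_norm(G, H_bad), " leak:", leak_bad)
```

When the generator commutes with the Hamiltonian, the evolved state is still annihilated by `G`. When it doesn't, the evolution pumps amplitude into the states that were supposed to be absent.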

For these reasons, the cancellation of gauge anomalies is essential for the consistency of any quantum theory. Just to be sure, global symmetries (whose parameters are not functions of time - there are not infinitely many of them) are allowed to be anomalous. If such global anomalies exist, the global symmetry de facto disappears, physics changes, but it is no inconsistency because the global symmetries are not needed for the elimination of any ghosts.

If you have a theory with many more fields that carry vector indices and that could produce negative-norm states (if the number of time-like indices is odd), you have to guarantee a corresponding number of new gauge symmetries that keep your system ghost-free. For example, open string field theory may be formulated as a Lorentz-covariant quantum field theory with infinitely many fields and infinitely many kinds of a gauge symmetry that are unified in a grandiose stringy gauge symmetry (which also employs the massive string fields as the primary "gauge bosons" or "gauge fermions").

These constructions may sound impressive, and they surely are pretty impressive, but you should understand that the mathematical apparatus of string field theory is just another reinterpretation or reorganization of the same perturbative results in string theory (the gauge symmetries are linked to the ordinary BRST transformations from a single-string Hilbert space) and it seems likely that the "full" string theory can't be non-perturbatively defined in terms of the string fields.

Gauge symmetries are pretty but they're just a tool to write the laws of physics in a way that keeps the Lorentz symmetry manifest, and deals with its bad side-effects (the emergence of bad ghosts) indirectly, by eliminating a part of the spectrum in a nice consistent way. If there are no gauge symmetries in a theory, it doesn't mean that the theory is bad. You can formulate these theories - and calculate all the cross sections between the physical states - in a way that avoids all bad ghosts and all gauge symmetries at all times. However, such a "manifestly unitary" formulation of the theory obscures the Lorentz symmetry, too.

Gauge symmetries are features of a description of physics, not a physical theory itself

Gauge symmetries are just redundancies of a particular description of a physical system that may be more or less convenient but that is not "objectively true" in any sense. After all, gauge symmetries are gadgets that have removed some ghosts from the set of physical possibilities. But because the ghosts are completely removed, they're no longer parts of the observable physics. They're gone, even in principle, so you can't ask what they have been in the first place.

Also, it's known that there often exist many descriptions of the very same physical system that have different gauge symmetries. For example, in the most famous AdS/CFT dual pair, the boundary description has an SU(N) gauge symmetry on the boundary. You won't find this symmetry in the bulk.

Instead, the bulk type IIB string theory may be conveniently formulated in terms of the diffeomorphism gauge symmetry - and perhaps its Ramond-Ramond p-form and stringy cousins - which are invisible on the boundary. The boundary and the bulk only agree about things that can be measured rather than unphysical things that you may add as intermediate steps to simplify the calculations of what can be measured.

I can tell you many other examples - the S-duality in gauge theories (and stringy vacua) that switches from the electric gauge group to the magnetic one, which was completely invisible to start with; commutative vs non-commutative gauge symmetries in Yang-Mills theories on noncommutative geometries which may be equivalent to the commutative ones; U(N) in Matrix theory; mundane examples such as the electroweak theory in the unitary gauge; and many others.

While gauge symmetries surely sound very natural and useful in the Standard Model, they're not "canonical" in general. They're just about a way to describe various physical systems in a way that is convenient for some theories in some regimes but less convenient for others.


Locality, causality, and Lorentz invariance

Nature wants "local structures" to evolve kind of independently. It wants to avoid things such as voodoo, telepathy, or the global government at all costs. Also, science dramatically depends on our ability to separate physical phenomena and study them in isolation: we must always be able to correctly assume that a particular phenomenon is de facto if not de iure unaffected by trillions of phenomena that are taking place elsewhere in the world.

It seems that the only way to achieve the "ban" on proliferation of superstition and left-wing politics in the laws of physics - which is needed both for Nature and science to work - is for Nature to require strict locality. ;-)

Locality follows from causality and the Lorentz invariance. Causality, which I have discussed many times, means that there must exist at most one causal relationship between two events A,B that take place somewhere in spacetime. Either A causes or caused or will cause B, or B causes or caused or will cause A, or neither. It can't ever happen that the relationship goes in both directions because that would be equivalent to closed time-like curves with all the shot-grandfather paradoxes (which would be genuine logical contradictions).

However, if A influences/causes B and B influences/causes C, then A influences/causes C. The relationship is transitive. The only way to prevent the paradoxical double-sided relationships between pairs of events that are highly separated in time is to guarantee that the causal relationships only go in one direction even for shorter intervals. Moreover, they must be determined by a uniform arrow of time.

To summarize, the full principle of causality says that the events can only influence the events that happen later in time. In locally Lorentz-invariant theories, the previous sentence has to hold in every inertial frame. If B is "after A" according to all inertial frames, it means that B must be in the future light cone of A. So in Lorentz-invariant theories, events can only be influenced by other events in their past light cone.

Note that this conclusion, which we deduced from causality (that has something to do with the separation in time and time's arrow), automatically implies locality (that has something to do with space): signals can only spread at most at the speed of light. After time "t", you can't influence places that are more distant than "x=ct" from you.
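The statement that "B is after A in all inertial frames" singles out the future light cone can be checked in a small 1+1-dimensional sketch. The events and velocities below are arbitrary illustrative choices, in units where c = 1, and the function names are made up for this example.

```python
import numpy as np

C = 1.0  # speed of light in the units used here

def boost_x(event, v):
    """Lorentz boost along x with velocity v (|v| < C); event is a pair (t, x)."""
    t, x = event
    gamma = 1.0 / np.sqrt(1.0 - (v / C) ** 2)
    return (gamma * (t - v * x / C**2), gamma * (x - v * t))

def causal_relation(a, b):
    """Classify event b relative to event a using the invariant interval."""
    dt = b[0] - a[0]
    dx = b[1] - a[1]
    interval = (C * dt) ** 2 - dx ** 2
    if interval < 0:
        return "spacelike"  # no causal influence possible in either direction
    if dt > 0:
        return "future"     # b is in or on the future light cone of a
    if dt < 0:
        return "past"       # b is in or on the past light cone of a
    return "same event"

A = (0.0, 0.0)   # the origin is mapped to itself by every boost
B = (2.0, 1.0)   # time-like separated from A
S = (1.0, 3.0)   # space-like separated from A

velocities = np.linspace(-0.99, 0.99, 199)

# Time ordering of a time-like separated pair is the same in every frame.
b_always_later = all(boost_x(B, v)[0] > 0 for v in velocities)

# Time ordering of a space-like separated pair depends on the frame.
s_time_in_fast_frame = boost_x(S, 0.9)[0]
s_time_in_opposite_frame = boost_x(S, -0.9)[0]

print(causal_relation(A, B), b_always_later)
print(causal_relation(A, S), s_time_in_fast_frame, s_time_in_opposite_frame)
```

For the space-like separated pair, some observers say that S happened after A and others say it happened before A, which is why no influence can be allowed between them.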

Lorentz symmetry therefore unifies causality and locality - in the same way as it unifies time and space in the first place. These conditions are tightly linked. The more you think about the possible forms of the laws of physics, the more you see that they're linked with the other conditions, too.

The equivalence of inertial frames, known since the times of Galileo, has been confirmed by all experiments that have ever been done. Once we learn that the geometry at high speeds is Lorentzian rather than Galilean, the Galilean symmetry has to be replaced by the Lorentz symmetry. But the latter plays a role similar to the one the Galilean symmetry played in Newtonian physics. In particular, they have the same number of generators.

However, the Lorentz symmetry is "nicer" or "more generally deformed" in the sense that it treats time and space more equally. It allows them to be expressed naturally in the same units - with the speed of light being the conversion factor. Because of this "tighter" unification of space and time, it is able to link previously unrelated concepts and phenomena more tightly than the Galilean symmetry.

The conditions are not independent

It's important to realize that these conditions are fully compatible with each other. I have mentioned that sometimes we need to do some extra work and introduce new concepts to see how things work. For example, if we respect the Lorentz symmetry, there will be fields with vector indices. Their components may be time-like or space-like. And the two types create states with the different signs of their norm: one of which would lead to negative predicted probabilities.

So one had to introduce gauge symmetries that allowed us to protect the physical states - and the evolution in between them - from the bad ghosts (and negative probabilities). Gauge symmetries were an additional construction we were forced to discover because of the combined requirements of Lorentz invariance and unitarity (in quantum mechanics). It may have been hard to learn some extra thing - gauge symmetries - but it was surely not impossible. And the resulting physics can be described without them, too.

However, "having to learn something new" is a completely different situation from "seeing a physical incompatibility". There's no physical incompatibility between the consistency criteria above. Quite on the contrary, they actually reinforce each other physically. For example, you have to adopt the probabilistic interpretation of the amplitudes - and other postulates of quantum mechanics - in order to explain the observations of entanglement in agreement with the Lorentz invariance. Assuming any hidden variables or other non-quantum devices would lead to physical violations of locality and the Lorentz invariance.

With the correct probabilistic interpretation, this problem doesn't occur because Nature doesn't have to propagate any real superluminal signals from one place to another. It does what it wants and it only obeys the probabilistic prediction in a statistical sense - and all the laws relating the probabilities are exactly Lorentz-invariant and local because they can be described in terms of Lorentz-covariant equations of motion for the Heisenberg operators (the classical equations with the hats added, roughly speaking), and the probabilities of all observations can always be expressed as expectation values of projection operators which are functionals of these Heisenberg operators (the only objects that change in the Heisenberg picture).

Quantum field theory and beyond

The principles enumerated and clarified above are considered "standard" whenever people work with quantum field theory as a framework. They sometimes try to relax one or two of the requirements but they do understand that they're immediately on thin ice whenever they do so.

On the other hand, we know that the full theory of everything, including quantum gravity, is not exactly equivalent to a local field theory in the (bulk) spacetime. That's nice, and there's a lot of powerful evidence that quantum gravity is not strictly local or causal (holography, getting the information out of the black hole etc.). However, one must appreciate how extremely tiny the room for maneuver is.

String theory only differs from quantum field theories in extremely subtle ways, and it has to. In some sense, it is still possible to imagine that it is a unitary, (nearly) local, (exactly) Lorentz-invariant quantum field theory with infinitely many fields. There are situations in which the reasoning in terms of a quantum field theory is not quite appropriate but the deviations from the quantum field theory framework are always very delicate. And they have to be.

All the self-described geniuses who think that they can seriously deviate from this framework are just deluded. They misunderstand some very basic points about the 20th century and 21st century physics. One would pay dearly if he wanted to fundamentally sacrifice the Lorentz symmetry, locality, causality (which includes some inevitably asymmetric arrow of time!), or even unitarity. These people often assume that the problems would only be contradictions with some direct observations - and they may hope that some observations (of Lorentz invariance) haven't been done "completely accurately" etc.

However, for mathematical reasons, the violation of any of the principles would actually launch a cascade of breakdowns of the other principles I mentioned because they depend on each other in very subtle ways. In the end, the "revolutionary physicist" would have to start all of physics from scratch, and offer completely different and independent explanations for pretty much any observations that have ever been made.

The question why such a new theory gives predictions almost identical to those of the established 20th century theories would be extremely pressing. In the end, the "revolutionary physicist" would have to offer a general explanation of why her new theory agrees with the quantum-field-theory-based predictions so extremely accurately. Effectively, she would need to prove a new kind of a duality or a new kind of an unexpected expansion.

We may say that this is exactly what's happening in string theory whenever it's formulated by completely different equations than those that look like a quantum field theory in the bulk - e.g. by the BFSS matrix model or the boundary CFTs. But what's important is that we actually have super-powerful arguments, if not proofs, that show that these unusual ways to describe the quantum gravitational physics in spacetime are equivalent to the theories that look like quantum field theories at long distances.

However, the "revolutionary physicists" usually don't care about any physics - i.e. about the agreement of their theories with the empirical data - which really means the (approximate) equivalence of their theories with the well-established theories at long distances (or otherwise accessible regimes). They just emit completely unphysical verbal crap which is almost universally less sophisticated, not more sophisticated, than quantum field theory. And they're able to defend the preposterous idea that their medieval theories represent progress only because of sheer populism - because most people (and maybe even most physics PhDs) hate quantum mechanics and relativity, anyway.

This should stop and spades should be called spades once again, especially the spades that are full of sh*t.

And that's the memo.

Bonus: double sexy victory for quantum probabilistic physics

One more argument that should settle the statement that the wave function has to be interpreted probabilistically.

Monster and Critics has told us the 10 sexiest songs ever, according to Billboard.

The sexiest video clip for a song ever is called "Physical" (Physics won!) and the song was sung by the niece of the probabilistic interpretation of quantum mechanics, Olivia Newton-John.

If you can't understand why she's its niece, recall that Max Born was the father of the probabilistic interpretation of the wave function and Olivia Newton-John is his granddaughter. But the probabilistic interpretation has never had sex with Newton, who died long before quantum mechanics was born, so this interpretation can't be the mother of Olivia Newton-John. The interpretation can't be her father, either, because it has no balls, except for the fuzzy ones in the phase space.

So it has to be an uncle or an auntie. It's that simple. ;-)

1 comment:

  1. Dear Lubos,

    Let me first congratulate you on this excellent report on consistency conditions underlying QFT.

    Having said that, I wish to caution that discussing the postulates and the very fate of QFT beyond the scale of the Standard Model brings to the table many “not-so-obvious” subtleties.

    Let’s take for instance non-equilibrium dynamics in QFT and condensed matter physics (abbreviated here as NEFT). In NEFT, non-unitary evolution at the microscopic level of description takes center stage. Over the past two decades, NEFT has attracted a lot of interest as new experimental techniques have probed novel non-equilibrium quantum phenomena that require field-theoretical description. Special examples of theoretical interest involving non-equilibrium dynamics of quantum fields include: inflationary dynamics in the early Universe, electroweak baryogenesis, the chiral phase transition and quark-gluon plasma in ultrarelativistic heavy ion collisions, dynamics of phase transition in Bose-Einstein condensation, ultrafast spectroscopy of semiconductors, non-extensive statistics in particle physics and inflationary cosmology, models of the dark sector, non-equilibrium phase transitions in strongly correlated compounds, condensed matter phenomena with long range correlations, spin glasses and so on.

    Consider now how Lorentz invariance and locality may be impacted once we advance beyond SM and deep in the TeV sector. Defining luminal signals of finite speed of propagation implicitly assumes the continuity and differentiability of underlying space-time. One surprising speculation of NEFT is that space-time on very high energy scales becomes a multi-fractal as a result of scale invariance. If this turns out to be true (and, of course, this is a big if), differentiability of space-time breaks down and the concept of “speed” is ill-defined. Once this happens, Lorentz invariance and locality may require a careful redefinition of terms.

    In closing, I am fully aware that these topics are controversial and you might very well disagree with my observations.

    Ervin Goldfain