Saturday, November 13, 2010

Quantum field theory has no problems

Many people face irrational problems when they attempt to understand special relativity or quantum mechanics and these problems have been discussed many times on this blog.

But in this text, I will look at somewhat more "technical" myths about the hypothetical problems of quantum field theory. Relativistic quantum field theory unifies the principles of quantum mechanics with those of special relativity - so all the difficult features of quantum mechanics and special relativity reappear in quantum field theory, too.

However, there exist additional hard features that are characteristic of quantum field theory and that may prevent folks from appreciating that quantum field theory is the correct theory of all non-gravitational phenomena ever observed, or a "theory of nearly everything" (TONE).

Infrared divergences

First, the Feynman diagrams - pictures that are translated to integrals whose values encode all predictions of quantum field theories - are known to produce various divergent (infinite) expressions. What do they mean? What do they tell us about the validity of a quantum field theory? Here, it is important to note that there are different sources of divergences and their meaning is very different for the different groups.

"Infrared divergences" is the name for the infinities that emerge because we have to integrate over arbitrarily long-wavelength (or low-energy) virtual particles (or quanta). They are produced when we try to send the minimum allowed momentum or energy of virtual particles to zero. When we do so, the loop diagrams are infinite. What does it mean? Does it mean that the theory sucks?

Not at all.

Let me say in advance that the opposite, ultraviolet (short-distance or high-energy) divergences - to be discussed later - usually show that a quantum field theory is incomplete and should be thought of as a limit of a more accurate theory, in the optimistic case. However, the infrared divergences don't imply anything of the sort.

The asymmetry between the two kinds of divergences arises from a simple fact: the physics at long distances is derived from the physics at short distances. You can build big houses out of small bricks, and you can deduce a useful approximate theory for long distances from the fundamental theory at short distances. But the converse doesn't hold: in quantum field theory, you can't deduce short-distance physics from the approximate laws for long-distance phenomena, just like you can't extract the bricks from a photograph of a building's facade.

So what do you do about the infrared divergences? It is important to know that quantum field theory - or any science - admits two kinds of questions (with a whole continuum in between):
  1. questions that are directly linked to the results of measurements - i.e. questions that are easily interpreted experimentally
  2. questions that are natural and simple from a theoretical viewpoint - i.e. questions that are easily connected with the fundamental concepts and quantities in the theory
The point is that these two categories are not identical. If someone likes straightforward theoretical calculations, he may be better at calculating the answers to the second kind of questions. And it is the second type of questions for which calculations may produce infrared divergences.

However, we never observe infinite quantities, so divergent answers are unacceptable for the first class of questions. Still, I claim that the theory with infrared-divergent Feynman diagrams is 100% correct. How can that be? Well, it's because to answer the observational questions from the first group, we must realize that the questions of the second type did not properly account for some practical limitations of our devices etc. We must add a few more steps to connect the answers to the questions from both groups.

Let's look at the issues somewhat more technically.

Take Quantum Electrodynamics and calculate the cross section for the scattering of two charged particles. One of them may be a light one, one of them may be a heavy one. Perturbative quantum field theory will lead you to a Taylor expansion in the fine-structure constant (or the electric charge).

The leading graph is a tree diagram; it has no loops. In this diagram, the particles simply exchange one virtual photon. This diagram will coincide with the predictions you could have made using classical physics. However, there are also loop corrections. Already the one-loop corrections - the first contribution of quantum effects that modifies the classical predictions - will include infrared divergences.

You obtain a term in the amplitude that is schematically proportional to ln(E_min) - where E_min is the minimum allowed energy of a virtual photon in the loop. Full rules of quantum field theory dictate that you should set this limit, E_min, to zero which produces a divergence.

Will this divergence be canceled? You bet. The quantity that will be compared with the experiments is the cross section of an observable process. The squared amplitude, [finite + ln(E_min)]^2, will also produce terms such as (finite^2 + 2 finite ln(E_min) + ...). The dots contain higher powers of the fine-structure constant. But will the term proportional to ln(E_min) be canceled?

It will be, as soon as you realize that a real experiment can't observe photons of arbitrarily low energies. No experiment can do so. The correct question of the first, "observational" type is: what is the inclusive cross section in which you allow an arbitrary number of photons that are invisible to your apparatus?

So you will have to compute not only the diagrams whose external lines are the two charged particles whose repulsion you want to calculate. You also have to include the diagrams with an extra external low-energy (soft) photon - one so soft that your device cannot see it.

Because we have only considered the ln(E_min) subleading term - which was suppressed by a single power of the fine-structure constant - we also only need one extra power of the fine-structure constant to accompany the soft photons when we compute the cross section. At this level of accuracy, it means that we are only interested in the tree-level diagrams with one extra soft photon.

If we calculate the inclusive cross section in which it is allowed - but not required - to produce an additional soft photon of energy below E_resolution, then we get another ln(E_resolution) term in the cross section from the tree-level diagram with the extra soft photon. Setting E_resolution=E_min is enough to see that the infrared divergence cancels (if you see the formulae). However, if we really want to obtain the finite remainder accurately, we can't make this assumption. Instead, it is useful to infrared-regulate the theory and e.g. make the photon massive. The infrared divergence in the one-loop diagram goes away and all calculations may proceed controllably. We must only be sure to correctly set a few parameters equal to their observed values.
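Schematically, the cancellation fits into a few lines of code. The snippet below is a toy model only - the coefficients and logarithms are illustrative stand-ins for the actual one-loop QED expressions, not the real formulae:

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant
Q = 1.0               # characteristic hard scale of the process
SIGMA_TREE = 1.0      # tree-level cross section, arbitrary units

def sigma_virtual(e_min):
    """Tree + one-loop virtual correction: contains the ln(E_min) divergence."""
    return SIGMA_TREE * (1 + ALPHA * math.log(e_min / Q))

def sigma_soft(e_resolution, e_min):
    """One real soft photon with energy between E_min and E_resolution."""
    return SIGMA_TREE * ALPHA * math.log(e_resolution / e_min)

def sigma_inclusive(e_resolution, e_min):
    """The observable, inclusive cross section: the ln(E_min) terms cancel."""
    return sigma_virtual(e_min) + sigma_soft(e_resolution, e_min)

# The inclusive answer depends on the detector resolution, not on E_min:
for e_min in (1e-3, 1e-6, 1e-9):
    print(sigma_inclusive(0.01, e_min))   # the same finite number each time
```

The infrared regulator E_min drops out of the sum, exactly as in the real calculation; only the physical resolution E_resolution survives in the answer.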

The relevant expressions require lots of mathematical equations but the point is that if you actually do all the job needed to compute the answers to the full-fledged "observational" questions and if you take the limited resolution of your gadgets into account, quantum field theory produces results that are finite and that agree with the experiments.

There is one important general message about the divergences: all the divergences ultimately cancel as long as you use a viable theory and as long as you use it correctly (you properly calculate quantities that can be observed). So it would be entirely incorrect to assume that the one-loop terms "dominate" just because the expression "diverges". In quantum field theory, you can never become a king by being divergent; on the contrary, you become a loser if you don't know how to get rid of the divergence. ;-)

The divergent part of any term is always unphysical and a complete enough calculation will show you that it is. On the other hand, the one-loop term is suppressed by an extra power of the fine-structure constant, 1/137.036 or so, which means that it is smaller than the tree-level contribution! In the same way, the tree-level cross section with an extra production of a soft photon has an extra power of the fine-structure constant relative to the diagram without a soft photon. It means that the process with an extra photon is less likely than the process with the minimum number of external particles (e.g. with no soft photons). It doesn't matter that it was divergent.

In the previous statement, we had to redefine a "process with an extra photon" to mean a "process with an extra photon above E_resolution" while any number of soft photons below E_resolution is always allowed. This choice - regardless of the chosen E_resolution - is necessary for the questions to be observationally meaningful.

In other words, the small coupling constant decides which terms are leading and which terms are small corrections. The divergent character of some terms makes no difference whatsoever because the "infinite value" is ultimately unphysical when treated properly. It is extremely bad if someone gets so confused by the divergences that he can no longer correctly divide the processes into "more likely" and "less likely" (and the terms into "leading" and "subleading") according to the powers of the small couplings, which is what always decides.

Ultraviolet divergences

The ultraviolet divergences are different. Loop diagrams also produce integrals that are infinite because they produce expressions such as ln(E_max) or E_max^n which diverge if the maximum energy allowed in the loops, E_max, is sent to infinity as it naively should be.

Now, the right interpretation and recipe is different. Here you should appreciate that at least a priori, your quantum field theory was just an effective field theory that is valid for sufficiently long distances but that really doesn't know what is happening at extremely short distances. It may break down if you extrapolate it to too short distances; but it doesn't have to break down.

So in this case, it is really legitimate to imagine that E_max (also known as Lambda) is finite. The whole theory must be supplemented with extra terms that depend on Lambda so that all the required symmetries hold (and e.g. the longitudinal gauge bosons decouple) and the observed masses and couplings match the full-fledged calculation from the theory with all the counterterms.

The machinery of the renormalization group due to Ken Wilson and others reveals that the long-distance limits of big classes of short-distance fundamental theories are pretty universal and are only determined by a few parameters - such as the fine-structure constant for QED. So there exists an objective long-distance effective theory whose predictions are valid long-distance predictions for the whole class of the detailed short-distance theories.

The cancellation of the divergences may look extreme but it is completely natural if you imagine that you are actually calculating things with a theory relevant for much shorter distances. The effective theory that applies to longer distances - in which you are no longer expected to believe in the existence of excessively high-energy virtual particles (and you don't integrate over them) - is often similar to the short-distance theory but the parameters are different. The coupling constants are "running" as the function of E_max - and we also say that the whole theories "flow" to other theories as you reduce E_max.

Now, it's been known since the 1940s that the divergences can be nicely renormalized away in QED - and similarly other field theories. The physical predictions of the "observational" type (infinitely many of them!) will only depend on the fine-structure constant and the electron mass in QED - and similarly on a finite number of parameters in other theories that are renormalizable. The theory is immensely predictive. It still needs an input, e.g. the fine-structure constant, but it can predict "everything else".

Non-renormalizable theories - usually those with too complicated interactions (whose coupling constants have units of a negative power of mass) - produce infinitely many types of divergences and the renormalization cure doesn't work. In the language of the renormalization group, you may also say that the non-renormalizable theories can't be extrapolated to high energies. They only work as effective theories and the calculation of loops in non-renormalizable theories is never going to be too useful.

The infinitely many types of divergences in non-renormalizable theories mean that these theories are not predictive. The infinite number of unknown parameters may be linked to an infinite number of unknown couplings at a higher scale where the theory breaks down. This contrasts with the renormalizable theories. Those can be extrapolated to infinite (or at least much higher) energies - but such an extrapolation only works if all the other quantities are arranged properly.

In the opposite language, if you "flow" a short-distance theory to long distances, to predict something at accessible i.e. long distances, you will see that the physics only depends on a few parameters and all the details of the short-distance theory only influence the long-distance predictions by tiny terms of order "(short_distance/long_distance)^k" where "k" is a positive exponent.

The renormalizable theories may be extrapolated to much shorter distances ("short_distance") than the distances where we want to calculate the predictions ("long_distance") which is exactly the reason why the details of the short-distance physics have a negligible impact on our predictions. For non-renormalizable theories, "short_distance" cannot be too much smaller than "long_distance" because the theory breaks down (and has to be fixed by new terms or fields) too early. That's why those corrections from the unknown physics are of order 100% and the effective theory is not predictive.

If you want a "truly complete" theory that works to arbitrarily short distances, the ultraviolet divergences are signalling a genuine problem with your theory (unlike the infrared ones). To find a theory that works at arbitrarily shorter distances i.e. higher energies (or at least much higher energies than your UV-divergent, non-renormalizable theory), you need to "UV-complete" your sick theory. It means to find a more refined theory that reduces to the sick one at long distances but that is not sick at short distance by itself.

If you only care about perturbative expansions for the cross sections, most useful theories whose couplings are either dimensionless or that have units of a positive power of mass are renormalizable and OK. That includes all the types of theories in the Standard Model - gauge theories with dimensionless couplings, whether they're Abelian or not; Higgs field with a quartic self-coupling; Dirac or Weyl or Majorana fermions with arbitrary charges under the gauge group (i.e. gauge couplings) and arbitrary Yukawa couplings (interactions with the scalar fields).

However, if you care about the complete consistency - of the full amplitudes and not just their "up to all orders" perturbative approximations - up to arbitrarily high energies, you will encounter a problem e.g. with QED, too. The fine-structure constant is still running, although its magnitude is only "drifting" with the logarithm of the energy scale. At some exponentially high energy scale - that you may morally visualize as "exp(137.036) times electron mass", although there are extra coefficients everywhere - QED breaks down, anyway. It has a Landau pole: the fine-structure constant diverges if extrapolated to the huge energy scale. The same occurs with theories with quartic self-couplings of scalar fields etc.

In practice (I mean practician's practice), this Landau pole is not a real problem because the scale "exp(137.036) times electron mass" is much higher than the Planck scale, so obviously many things such as quantum gravity (and even much more mundane phenomena such as the electroweak unification) will modify QED long before you get to the Landau pole. But from a theoretical viewpoint, not thinking about the extra modifications of physics that exist in the real world, the Landau pole is a problem, too.
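The numbers can be checked with the standard one-loop running of the QED coupling. The snippet below is a rough sketch that keeps only the electron loop and ignores thresholds and higher-loop terms, so the scales are only "morally" correct:

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant at low energies

def alpha_running(log_mu_over_me):
    """One-loop QED running with a single charged lepton; argument is ln(mu/m_e)."""
    return ALPHA / (1 - (2 * ALPHA / (3 * math.pi)) * log_mu_over_me)

# The Landau pole: the denominator vanishes at
log_landau = 3 * math.pi / (2 * ALPHA)       # ln(mu_Landau / m_e), about 646
print(log_landau)

# Compare with the Planck scale: ln(M_Planck / m_e) is only about 51.5
log_planck = math.log(1.22e19 / 0.511e-3)    # both masses in GeV
print(log_planck)

# The coupling has barely drifted even at the Planck scale:
print(alpha_running(log_planck))             # still close to 1/137
```

So the one-loop pole sits at a scale vastly above the Planck mass, which is the quantitative content of the "practical irrelevance" of the Landau pole.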

While it doesn't affect the consistency of the perturbative expansions, not even at the 100th order in the expansion, it is an inconsistency if you want your theory to be accurate. In some qualitative sense, it is the same inconsistency as the theory's being non-renormalizable: just the energy scale where the theory breaks down is much higher.

The latter analogy may also be explained by the dimensional analysis. I said that theories with dimensionless couplings are marginally OK but theories with couplings whose units are negative powers of mass are non-renormalizable (because the interactions grow stronger and out of control at higher energies). The fine-structure constant is classically dimensionless. But the logarithmic running I have mentioned implies that the fine-structure constant is actually "slightly" dimensionful - you should imagine that its actual units, when the running is taken into account, is something like mass^{-1/137.036}. So it's slightly on the "wrong side" - the same side as couplings of non-renormalizable theories.

On the contrary, QCD's strong coupling constant is getting weaker at shorter distances. That's why quarks move almost freely inside protons - they experience the asymptotic freedom, the main physical feature of QCD that made Gross, Wilczek, and Politzer so certain that they were right and that could have earned them their Nobel prize 3 decades later.

The fact that the QCD coupling is running weaker also means that the coupling has a dimension that you may visualize as mass^{+1/30} where 1/30 is just a placeholder for the "strong fine-structure constant" - its exact value is scale-dependent because unlike the QED fine-structure constant, it never stops running. (The QED coupling or the fine-structure constant remains almost constant for energy scales below the mass of the lightest charged particle, the electron/positron, because all the loops with charged particles inside become negligible.) At any rate, the dimension is similar to the dimension of mass terms etc. and it keeps the theory renormalizable.
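For comparison, here is the standard one-loop formula for the running QCD coupling - again a sketch that ignores quark-mass thresholds and higher loops; the reference value alpha_s(M_Z) = 0.118 and the choice n_f = 5 are just convenient illustrative inputs:

```python
import math

N_F = 5                    # active quark flavours (an assumption)
B0 = 11 - 2 * N_F / 3      # one-loop beta-function coefficient, positive

def alpha_s(mu, alpha_ref=0.118, mu_ref=91.2):
    """One-loop strong coupling at the scale mu (GeV), run from alpha_s(M_Z)."""
    return alpha_ref / (1 + alpha_ref * B0 / (2 * math.pi) * math.log(mu / mu_ref))

# Asymptotic freedom: the coupling shrinks at higher energies (shorter distances)
print(alpha_s(5.0))      # larger at low energies
print(alpha_s(91.2))     # 0.118 at the Z mass by construction
print(alpha_s(1000.0))   # smaller at 1 TeV
```

The positive B0 is what flips the sign of the running relative to QED and makes the coupling weaker at short distances.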

QCD is, in fact, fully consistent at arbitrarily short distances. If there were no gravity, QCD could be the "ultimate theory of everything" because it has no inconsistencies or physical limitations. Pure QCD has no dimensionless parameters, either.

In fact, in some sense, we have known for 13 years that the statement above is morally true even for the world that does contain gravity. It's because QCD-like gauge theories actually also describe the full theory of quantum gravity - particular vacua of string theory - as the AdS/CFT correspondence shows. Well, the space with gravity is curved and has an extra dimension but it is "continuously" connected with the backgrounds with the right number of dimensions, too.

So even ordinary quantum field theories may be perfectly finite, perfectly consistent, and able to include quantum gravity in the framework. How is it possible that mundane old-fashioned quantum field theories may achieve such a dream? Well, it's because one can show that those "gravity containing, perfectly consistent QFTs" are actually 100% equivalent to string theory's description of particular sectors of string/M-theory.

Once your theory is shown to be exactly equivalent to a glimpse of string theory, it immediately receives the 100% A1*+ perfectionist quality stamp. If that occurs, your theory is not yet telling us everything - everything about all of string theory - but you may forget about all conceivable consistency problems when it comes to your theory's own domain of validity. It turns out that quantum field theories are equivalent to sectors of string/M-theory - either because of AdS/CFT or because of Matrix theory.


The previous section about the short-distance (ultraviolet) divergences has already sketched how and why renormalization works. But let me repeat the story from a different, more historical angle.

In the 1930s, people started to notice that there were divergences in the loop diagrams - higher-order contributions to observable quantities, when expanded in the fine-structure constant. In the 1940s, they learned how to deal with the divergences. They have figured out how to cancel the divergences by counterterms, or how to assign finite values to divergent integrals according to a perfectly working dictionary.

(In the 1960s, they also added the FP ghosts that are needed if your theory has gauge symmetries, but we won't go into similar newer technical developments here.)

It remained mysterious why the prescription worked because it looked counterintuitive to many physicists. For example, Paul Dirac had never accepted renormalization. He would complain that things could only be neglected if they were small, not because they were too big to fail (divergent).

However, Dirac and many others were completely wrong. From a proper physical perspective, the integrals just looked big. The full observables for which these integrals were the most nontrivial contributions were actually fully finite, and when multiplied by a higher power of the fine-structure (or similar) constant, they were negligible, indeed.

Dirac and others - and many of us before we learned QFT properly - assumed that things had to be simple and the right theory needed to produce finite results for the "observational" questions while the parameters in the "fundamental" theoretical formulation had to be finite, too. So a valid theory had to connect these two sets of finite numbers. That's how it worked in all theories of physics up to non-relativistic quantum mechanics.

However, as in the case of many other incorrect dogmas, one half of this assumption is simply incorrect and unjustifiable.

It is not true that the "theoretical parameters" that enter the basic equations of QFTs must be finite. Quite on the contrary, they are "divergent" (the total coefficients in front of various terms in the Lagrangian become divergent if you send E_max to infinity). But the divergences that are "inserted" into the very classical Lagrangian perfectly cancel against the divergences that are produced by loop diagrams even from finite couplings. And the resulting observable cross sections and probabilities are finite.

And indeed, the finiteness of the observable quantities is the only condition that must genuinely hold. Again, the finiteness of the parameters in the bare Lagrangian doesn't have to hold - and in fact, it doesn't hold.

There exist various ways to describe the "divergent nature" of the parameters of the theoretical, bare Lagrangian. You can make them finite but dependent on a high-energy scale E_max that you would like to send to infinity - but you can't if you want to keep things finite. So the bare Lagrangian will depend on an arbitrary scale E_max. That's something you could have thought shouldn't occur because E_max is unphysical - there is no universal measurable value of E_max for QED.

But even if no "minimum length scale" is observable in QED, there's actually no reason why the parameters in the Lagrangian should be independent of a quantity named E_max. The only real condition that is logically justifiable is that the observed quantities are independent of E_max - and indeed, they are. The renormalization group explains why the ultimate, sufficiently long-distance predictions are independent of E_max - as well as of many other detailed properties of the extreme short-distance theory that completes QED. That's true despite the fact that we have to make the fundamental parameters in the Lagrangian (bare couplings) E_max-dependent.
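A toy version of this statement fits into a few lines. The bare coupling below is chosen to depend on E_max in exactly the way that makes the low-energy prediction cutoff-independent; the coefficient and the one-loop-like form are purely illustrative, not a real theory:

```python
import math

B = 0.01           # toy beta-function coefficient (illustrative)
G_OBSERVED = 0.1   # the measured coupling at the low scale MU
MU = 1.0

def g_bare(e_max):
    """Cutoff-dependent bare coupling, tuned so that observations stay fixed."""
    return G_OBSERVED / (1 - B * G_OBSERVED * math.log(e_max / MU))

def g_predicted(e_max):
    """Flow the bare coupling back down to MU: the E_max dependence cancels."""
    g = g_bare(e_max)
    return g / (1 + B * g * math.log(e_max / MU))

for e_max in (1e2, 1e6, 1e12):
    print(g_bare(e_max), g_predicted(e_max))  # bare drifts; the prediction doesn't
```

The bare coupling changes as you move the cutoff, but the observable quantity computed from it does not - which is the whole point.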

People often make many naive assumptions and Nature is teaching us that many of those assumptions are actually unnecessary - especially because they are just plain wrong. People who want to understand the logic of Nature must always be able to jettison the excess baggage of their previous incorrect assumptions that were never justified by anything except bad habits, delusions, and groupthink.

(Yes, I borrowed the words from Murray Gell-Mann because they are so crisp.)

People are naturally so naive and they have been formed by millions of years of life in conditions that are so extremely constrained and so isolated from so many important fundamental phenomena in Nature that they often make profoundly incorrect assumptions.

But one can only use advanced science properly if he is able to separate irrational dogmas, emotions, and preconceptions from statements that can actually be justified by rigorous mathematics whose only extra axioms are empirical observations. And be sure that none of the complaints against quantum mechanics, relativity, or quantum field theory can be justified in this scientific way.

Haag's theorem

Another technical "tool" that critics of quantum field theory often incorrectly use to argue that there's something wrong about quantum field theory is Haag's theorem. This is a theorem about the ambiguous identity of the Hilbert space - an issue that only arises in quantum field theory because it has infinitely many degrees of freedom.

I want to genuinely explain to you why QFT is inevitably different from ordinary quantum mechanical models for a finite number of particles when it comes to the "uniqueness" of the Hilbert space. Needless to say, the difference arises because QFTs have an infinite number of degrees of freedom. The state vector is not a "wave function" anymore; it is a "wave functional".

Let's start with a quantum mechanical model whose wave function is "psi(x1, x2, ... xN)". Take any prescription for your Hamiltonian that is sufficiently non-singular. And take any L2-normalizable wave function psi (the square of the absolute value of psi has a finite integral). Unless you prepare your "psi" and "H" pathologically enough - and be sure that I know how to do it (e.g. psi(x)=x/(x^2+1) in the harmonic oscillator) - the expectation value of "H" in "psi" will be finite, too.

So for quantum mechanics with a finite number of degrees of freedom, the space of finite-energy states is pretty much independent of the Hamiltonian.
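You can check both halves of this claim numerically. The little script below (with hbar = m = k = 1, so H = p^2/2 + x^2/2) evaluates the energy expectation value by a crude Riemann sum; the displaced Gaussian and the cutoff values L are arbitrary choices for illustration:

```python
import math

def energy(psi, L, n=100_000):
    """<H> for H = p^2/2 + x^2/2 on [-L, L], via finite differences."""
    dx = 2 * L / n
    xs = [-L + (i + 0.5) * dx for i in range(n)]
    p = [psi(x) for x in xs]
    norm = sum(v * v for v in p) * dx
    # central finite differences approximate psi'
    dp = [(p[i + 1] - p[i - 1]) / (2 * dx) for i in range(1, n - 1)]
    kinetic = sum(v * v for v in dp) * dx / (2 * norm)
    potential = sum(x * x * v * v for x, v in zip(xs, p)) * dx / (2 * norm)
    return kinetic + potential

nice = lambda x: math.exp(-(x - 1.0) ** 2)   # a generic normalizable state
bad = lambda x: x / (x ** 2 + 1.0)           # the pathological example above

print(energy(nice, 30), energy(nice, 60))    # converges to a finite number
print(energy(bad, 30), energy(bad, 60))      # keeps growing with the cutoff L
```

The second wave function is L2-normalizable, but its potential-energy integral grows linearly with the cutoff L - exactly the kind of pathological preparation mentioned above.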

However, for theories with infinitely many degrees of freedom, this is not the case: the Hilbert space of finite-energy states (or "wave functionals") depends on the Hamiltonian. Haag's theorem says pretty much nothing else than that - and it's easy to see why it's true.

Instead of quantum field theory, think about a simple model of infinitely many decoupled harmonic oscillators:
H = sum (i=1...infinity) H_i
H_i = p_i^2 / 2m + k x_i^2 / 2
A free quantum field theory Hamiltonian may be written in a similar way. Now, prepare your wave functional as the tensor product of the ground state of each Hamiltonian H_i. The total energy will be the sum of the zero-point energies. It's infinite but universal additive shifts to the energy are unphysical (in the absence of gravity) and you may subtract the infinite term.

However, if you compare this ground state of the infinite-dimensional harmonic oscillator with other states, you can easily see that the energy differences will be infinite. For example, if you take the wave functional to be the tensor product of Gaussians with a wrong width (the same wrong width for each partial oscillator), the expectation value of the energy of the i-th oscillator, E_i, will differ from the correct zero-point energy by a fixed positive amount, delta. Because there are infinitely many oscillators, the total energy difference will be infinite (infinity times delta).

So unless you prepare your wave functional very carefully, you will obtain a state whose energy is "infinitely higher" than the ground state energy. Clearly, such states are experimentally inaccessible from another state of the Universe with a finite energy (above the true ground state). Also, you may show that the (normalized) "right width" infinite-dimensional Gaussian wave functional is orthogonal to the (normalized) "wrong width" Gaussian because the inner product is the product of infinitely many copies of a number smaller than one.
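Both claims - the infinite energy excess and the orthogonality - follow from elementary Gaussian integrals. A quick check, with hbar = 1 and H_i = p^2/2 + x^2/2 so that the true ground state has position width 1/sqrt(2) and energy exactly 1/2; the "wrong" width, 1.3 times the right one, is an arbitrary choice:

```python
import math

def overlap(s1, s2):
    """Inner product of two normalized Gaussians with position widths s1, s2."""
    return math.sqrt(2 * s1 * s2 / (s1 ** 2 + s2 ** 2))

def mean_energy(s):
    """<H> in a Gaussian of width s for H = p^2/2 + x^2/2 (hbar = 1)."""
    return 1 / (8 * s ** 2) + s ** 2 / 2

S_RIGHT = 1 / math.sqrt(2)     # true ground-state width; energy exactly 1/2
S_WRONG = 1.3 * S_RIGHT        # the same "wrong" width for every oscillator

delta = mean_energy(S_WRONG) - mean_energy(S_RIGHT)
print(delta)                   # a fixed positive excess per oscillator

c = overlap(S_RIGHT, S_WRONG)
print(c)                       # a single-oscillator overlap slightly below 1
print(c ** 1000)               # the N-oscillator overlap: essentially zero
```

With N oscillators the energy excess is N times delta (divergent as N goes to infinity) and the overlap is c^N (vanishing) - which is the entire content of the "inequivalent representations".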

The lesson of this trivial talk is simple: if you have infinitely many degrees of freedom, you must be very careful which states actually have a finite energy (above the true ground state) - and only those states are physically relevant. So the relevant Hilbert space - or the relevant "representation of the commutation relations", as Haag's theorem fundamentalists will say - also depends on the Hamiltonian, i.e. your prescription for the energy. It wasn't the case in typical applications of finite-dimensional quantum mechanics.

There's nothing shocking about it and nothing about it makes QFT internally inconsistent, inconsistent with observations, or ambiguous. It's just a lesson that one must be careful that even the very identity of the relevant Hilbert space depends on the dynamics - on the Hamiltonian or anything that plays its role (the action etc.).

(By the way, the "natural basis" of the Hilbert space that you may measure (in the sense of the "interpretation of quantum mechanics") also depends on the Hamiltonian: the Hamiltonian determines which states decohere from others. That's a fundamental reason why a predetermined set of "beables" that can be observed - like the "beables" in any kind of Bohmian mechanics - is just a fundamentally misguided approach to quantum mechanics. All such things are determined by dynamics, i.e. by the Hamiltonian if it exists. You must leave all such decisions to Nature and the potentially complex processes determined by the Hamiltonian instead of trying to dictate the right bases and other choices to Nature.)

There are closely related issues in quantum field theory - or any theory with infinitely many degrees of freedom. One of them is the existence of superselection sectors. Even if you consider states with a finite energy, you can still find many Hilbert spaces - or subspaces of the full Hilbert space, depending how inclusively you define the latter - that are completely decoupled from each other.

Again, it's because the system has infinitely many degrees of freedom. For example, there may exist scalar quantum fields PHI(x,y,z,t) whose potential energy is exactly zero. So any value of PHI is as good a point for expansions as any other value. In particular, the value of PHI at infinity (in the (x,y,z) space) may converge to any allowed value you want. For each of them, there will exist a different Fock space including excitations of all the other particles - whose masses (and widths of the wave functionals) will depend on PHI in general.

Those spaces will be physically disconnected because if you start in one of those spaces, the asymptotic value of PHI is a given number. A finite object will never be able to change the value of PHI in an infinite region of space - the asymptotic region at infinity - so you may be sure that no evolution can ever change a state in one superselection sector (subspace of the Hilbert space built assuming a particular asymptotic value of PHI) to another superselection sector. Again, the inner products between any pair of states from two different superselection sectors will vanish for the same reason as I discussed in the context of the infinite-dimensional oscillator.

There is absolutely nothing mysterious about these things - they directly follow from the fact that the space is infinite but finite objects can't change infinitely many things at once. There is absolutely no need to "fix" quantum field theory because of this simple observation. Of course, if there are inequivalent superselection sectors, in order to describe your initial conditions, you also have to specify the superselection sector into which your state belongs.

But the existence of superselection sectors really makes your calculations simpler, not harder. It's because they tell you that the evolution will never drive you into another superselection sector. You may still imagine that the Hilbert space is the direct sum of all superselection sectors - except that the Hilbert space relevant for all your predictions is much smaller. It's just one superselection sector.

I think that the people who see a problem with all quantum field theories because of Haag's theorem are just demagogically using a theorem that has an anti-QFT "flavor" when interpreted incorrectly - and they do so because this theorem is a stick used to beat a dog that those people dislike because of many other stupid preconceptions (or because they want to boost their crackpot pet theories - greetings to Vladimir and others). Again, there is absolutely nothing inconsistent, paradoxical, or ambiguous about the QFT predictions that would be implied by Haag's theorem.

Divergence of the perturbative expansion

The last point doesn't fully belong to this article because the divergence of the perturbative expansion may indicate genuine problems with a quantum field theory - even though it usually doesn't.

Imagine that you have a quantum field theory and you calculate its predictions perturbatively - i.e. as a Taylor expansion in the coupling constant(s). You learn how to deal with the infrared and ultraviolet divergences, in an order-by-order fashion, and you learn how to do the renormalization etc.

Your cross sections etc. will then be expressed as a Taylor expansion in the coupling constant:
Sum(n=0...infinity) c_n g^n
Here, "n" is the summation index and "g" is the coupling constant. By some combinatorics, one can see that the number of diagrams grows "factorially" with "n", so that "c_n" behaves like "n!" (times more slowly varying functions of "n") for extremely large values of "n". Note that for very large "n", "n!" grows faster than any expression of the type "u^n" (let alone the much smaller "n^u") for any fixed "u".

This means that the radius of convergence of the Taylor expansion above is zero. That's true both in field theory and string theory - and the parametric dependence is pretty much universal as long as you map "g_{closed}" in string theory to "g^2" in the field theory etc.
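The zero radius of convergence is easy to see numerically. In this schematic sketch (assuming simply "c_n = n!", ignoring the slower-varying factors), the terms "n! g^n" first shrink but then, around "n ~ 1/g", start growing without bound for any nonzero "g" - so the series cannot converge.

```python
import math

g = 0.05  # an illustrative weak coupling

# schematic n-th term of the series: c_n ~ n!, times g^n
terms = [math.factorial(n) * g ** n for n in range(40)]

# the terms shrink while (n+1)*g < 1, i.e. up to roughly n ~ 1/g,
# and the factorial growth takes over afterwards
n_min = min(range(40), key=lambda n: terms[n])
print("smallest term at n =", n_min)
print("terms grow again afterwards:", terms[39] > terms[n_min])
```

Making "g" smaller only postpones the turnaround to larger "n"; it never eliminates it.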

The vanishing of the radius of convergence may be seen in various intuitive ways, too. For example, QED would produce an unstable vacuum if the fine-structure constant were negative. Like charges would attract and one could lower the energy of the vacuum by dividing it into two big, separated regions with huge positive and negative charges, respectively. The electrostatic energy in each clump would be large and negative.

The inconsistency of QED for infinitesimally small but negative values of the fine-structure constant actually implies the existence of non-analyticities in various physical observables as a function of "alpha" that are arbitrarily close to "alpha=0" which means that the radius of convergence has to be zero.

This is pretty much the case of all quantum field theories except for the finite ones - those for which all the terms of order "n!" exactly cancel (usually because of a sufficiently extended supersymmetry).

However, the divergence of the perturbative expansions doesn't imply that there's no "exact" answer. After all, e.g. pure QCD may be defined on the lattice and all results are finite. At the same time, one can prove that up to any order in the perturbative expansion, the perturbative prescription exactly reproduces the exact (e.g. lattice) result.

In practice, we are often uninterested in the non-perturbative physics - the physical phenomena that can't be seen by making the Taylor expansions in "g". For example, the fantastic 14-digit agreement of QED's prediction with the measured magnetic moment of the electron is a purely perturbative result. QED's perturbative expansion works so well because "alpha" is so small.

In theory, you should be interested in non-perturbative physics but non-perturbative physics only goes "beyond" the perturbative expansions: it can't simply contradict them. Like in other cases, non-perturbative physics is a "more accurate and complete" theory that must swallow the previous approximations, in this case the perturbative ones, as a limit. There are many new things one has to learn in the context of the non-perturbative physics of quantum field theory - and string theory. In particular, string theory has made amazing progress in the understanding of its non-perturbative structure since the second superstring revolution ignited around 1995. Mentally (or morally) defective critics of string theory who still talk about string theory's being "just perturbative" apparently haven't noticed, but the bulk of the important advances since 1995 was all about becoming able to prove what's going on non-perturbatively.

However, when you swallow all the surprises, you will still see that the non-perturbative insights are fully compatible with all the perturbative conclusions as long as you approximate the exact physics by the Taylor expansion in the standard way. For example, one has to understand that there are various functions such as
f(g) = exp(-1/g^2)
with the extra comment that "f(0)=0" to make it continuous. It's not hard to see that this function is nonzero for any nonzero "g". On the other hand, any n-th derivative of "f(g)" evaluated at "g=0" vanishes. It's because by taking new derivatives, you only produce new ratios of polynomials in "g" in front of the exponential. But the exponential beats all the ratios of polynomials in the "g goes to zero" limit.

So the perturbative expansion of "f(g)" around "g=0" is zero even though the function is not zero! This is a typical form of non-perturbative contributions to physical quantities - you get similar terms from the instantons.
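This claim about "f(g) = exp(-1/g^2)" can be checked directly: the function is positive for every nonzero "g", yet near "g = 0" it is smaller than any power of "g", which is exactly why all of its Taylor coefficients at the origin vanish. A small numerical check:

```python
import math

def f(g):
    # f(g) = exp(-1/g^2), extended by f(0) = 0 to make it continuous
    return math.exp(-1.0 / g ** 2) if g != 0 else 0.0

print(f(0.5))  # nonzero for any nonzero g
# yet f vanishes faster than ANY power of g as g -> 0,
# which forces every Taylor coefficient at g = 0 to be zero:
for n in (1, 5, 20):
    print(n, f(0.05) / 0.05 ** n)  # still essentially zero
```

So a term like this is invisible to the perturbative expansion even though it is a perfectly finite contribution at any finite coupling.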

Quite generally, it's true that the "leading instanton" contributions of this kind are parametrically of the same order as the minimum uncertainty you get from resumming the perturbative expansions. Recall that the Taylor expansions diverge. For a small value of "g", the terms initially get smaller because of the higher powers of "g", but eventually the factorial growth of the "c_n" becomes more important than the decrease of "g^n" with "n" and the terms start to increase again.

If you sum the Taylor expansion up to the minimum term, you will be uncertain about the result because you don't know whether you should include one more term "after the minimal one" or not. In fact, the minimum term - the estimate of the uncertainty - is of the same order as the leading non-perturbative instanton-like corrections.

This finding may be interpreted by saying that the divergence of the perturbative expansion is just an artifact of the negligence of the non-perturbative physics. If you did a more complete calculation that takes all the non-perturbative physics into account, it would also fix all the conceivable ambiguities that you may see in the (divergent) perturbative expansion. To say the least, the previous paragraph shows that the non-perturbative terms are big enough to cancel the ambiguities.

Once again, a nicely organized resummation of the perturbative expansion will make the result depend on how you truncate (or otherwise regulate) the perturbative expansions. But there will also be analogous choices in the quantification of the non-perturbative, instanton-like terms. When you sum both of them, the ambiguities just go away. This cancellation is somewhat similar to the cancellation we encountered in the case of the infrared divergences - except that here, you don't have to modify your question because of the divergences of the perturbative series.

For any quantum field theory that may be, at least morally, visualized as a limit of a lattice-like description or another regulated definition, all such hypothetical problems and divergences are guaranteed to cancel. A major consequence is that if "g" is much smaller than one, the perturbative expansions may always be trusted as an excellent approximation - and the instanton-like terms set the magnitude of the error we may be making in this way.


To summarize, all the general criticisms of quantum field theory as a framework are artifacts of a deep misunderstanding by the critics.

Aside from gauge anomalies that I haven't discussed in this article (and that can be interpreted either as a UV problem or an IR problem that only depends on the massless/light spectrum), the only genuinely physical problem of some quantum field theories is that they're not UV-complete - they're just effective theories for long enough distances. However, their UV-completion may be either another quantum field theory or a superselection sector of string theory. Those UV-completions have no inconsistencies whatsoever.

And that's the memo.


  1. On a more mundane note, you wrote "...over arbitrarily short-wavelength (or low-energy)..." Did you mean long-wavelength?

  2. Hi Lubos.

    After all of that text, I wonder, when has QFT recovered
    F = q E ?
    Should this not be recovered then I do not believe that QFT is valid. Should this be relied upon within QFT then a fundamental conflict within the theory exists.

    I make these statements firmly, but am open to falsification. If you have some thoughts on this I'd greatly appreciate your feedback.

    - Tim

  3. Dear Timothy, your concern is almost amusing. Even in classical field theory, F = qE follows from E = -gradient(phi) and the q*phi term in the energy.

    The very same term appears in the Hamiltonian of Quantum Electrodynamics, so of course the same physics results from it, too. You know, the correct classical limit is completely manifest.

    If you find the classical limit important, QFT is just the very same field theory with extra "hats" added above all observables, and the proof that the quantum theory reduces to the classical theory without hats in the classical limit - i.e. in macroscopically large electromagnetic fields - is totally straightforward.

    Best wishes