Thursday, March 29, 2012

All interactions in the Universe

This essay is a natural continuation of Masses of all objects in the Universe. However, to classify all forces in the Universe isn't quite as easy and straightforward as to enumerate all allowed eigenvalues of \(m^2\) – and even the latter task had a lot of subtleties, as we have seen.

Ancient history

It seems to me that almost all thinkers in the ancient civilizations thought that pairs of objects may only interact with one another if they're in contact. That's also why they were inventing many kinds of fictitious materials where the influences propagated whenever it looked like the interaction occurred via action at a distance.

For example, many ancient civilizations would invent various breeds of aether. Later, this would be refined as the luminiferous aether. Some not-so-modern physicists were imagining "phlogiston", a material that is equivalent to heat and that needs to spread when heat does, and so on.

From some perspective, their viewpoint was modern because "locality" is indeed a fundamental property of our Universe according to all theories respecting special relativity. However, their approach was also very non-modern because the assumed properties of the quasi-materials that mediated the influences were too similar to common materials. Physics had to get rid of the idea of locality of interactions for a while in order to make progress.

Isaac Newton

And of course, it was Isaac Newton who invented and exploited the concept of action at a distance. He was able to unify celestial and terrestrial gravity by postulating a gravitational force,\[

\vec F = \frac{Gm_1 m_2}{r^2}\hat r

\] which acts immediately and contributes some acceleration to any object depending on the positions of other objects, according to \[

\vec F = m\vec a.

\] Newton, convinced that the space is filled with the holy ghost or spirit and all transactions have to involve Him or Her or It, had psychological problems with the action at a distance but he was able to see that it's a remarkably useful concept, anyway.
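As a rough numerical illustration of that unification (a sketch with rounded textbook constants that are not part of this essay), the same inverse-square law can be evaluated at the Earth's surface and at the Moon's distance, and the latter compared with the centripetal acceleration implied by the Moon's orbital period:

```python
import math

# Rounded textbook values (assumptions for illustration, not from the essay).
GM_EARTH = 3.986e14        # G * (mass of the Earth), in m^3 / s^2
R_EARTH = 6.371e6          # mean radius of the Earth, in m
R_MOON_ORBIT = 3.844e8     # mean Earth-Moon distance, in m
T_MOON = 27.32 * 86400.0   # sidereal month, in s


def gravitational_acceleration(gm: float, r: float) -> float:
    """Magnitude of a = F/m for the inverse-square force F = G m1 m2 / r^2."""
    return gm / r**2


def centripetal_acceleration(r: float, period: float) -> float:
    """Acceleration needed to keep a body on a circular orbit of radius r."""
    omega = 2.0 * math.pi / period
    return omega**2 * r


g_surface = gravitational_acceleration(GM_EARTH, R_EARTH)
a_moon_newton = gravitational_acceleration(GM_EARTH, R_MOON_ORBIT)
a_moon_orbit = centripetal_acceleration(R_MOON_ORBIT, T_MOON)

print(f"falling apple:          {g_surface:.2f} m/s^2")
print(f"Moon, inverse square:   {a_moon_newton:.5f} m/s^2")
print(f"Moon, from its orbit:   {a_moon_orbit:.5f} m/s^2")
```

The two lunar numbers agree to roughly one percent; the residual mostly reflects the simplifications made here (a circular orbit and a neglected mass of the Moon).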

And be sure about it: it was very important. One of the advantages of the action at a distance is that it is mathematically very clean. When interactions have to occur through direct contact, they tend to be messy, dirty, and sometimes even sinful. But a force from distant bodies may influence a local body very predictably; the body doesn't even get dirty by such Platonic relationships.

People figured out many other forces with similar or different formulae, from the analogous Coulomb electrostatic force to forces between bar magnets. But they also began to analyze forces that do require some contact more accurately. Hydrodynamics, aerodynamics, mechanics of solids, and so on. All these things exploded and this explosion would be impossible without Newton.


As the description of materials through partial differential equations was getting more realistic and more sophisticated, people became able to reinvent the concept of the field, a refined mathematical version of the old aether, holy spirit, or whatever invisible mess is filling the space. The electrostatic or magnetostatic interaction wasn't "direct" anymore; one charge or magnet influences the field and then the field influences the other charge or magnet.

One needs two steps and the intermediate player, the electromagnetic field, became a "real entity" despite its apparent invisibility. Its properties were understood ever more accurately. James Clerk Maxwell was lucky enough to be the man who wrote down the final and complete version of the equations obeyed by the electromagnetic field. His equations allowed the electromagnetic field to wiggle and have waves in it that can survive independently of the charged "material" sources.

People still believed that the field had to be a set of mechanical properties of a material similar to water or the air; light had to be fully analogous to sound, they thought. Because they realized that light was a type of an electromagnetic wave, they coined the term "luminiferous [light-carrying] aether" for the fictitious material. Its mechanical properties had to be unusual but most physicists didn't care and believed that this counterpart of the air was necessary. Some famous physicists were even constructing a model of the aether out of wheels and gears.

However, it wasn't a material. In 1905, Einstein's special relativity finally established that the aether wasn't any real material. In particular, the electromagnetic fields and waves could exist in the vacuum – a "material" that differs from other materials at least in one paramount respect. The vacuum doesn't pick a preferred reference frame. You can't say what your speed is relative to the vacuum even though you can always discuss an aircraft's speed relative to the air or a swimmer's speed relative to the water.

In general relativity, Einstein showed that the vacuum was also able to carry the gravitational field. The dynamical variables describing the gravitational field are encoded into the local curvature of the spacetime etc.

Quantum revolution

Around 1900, Max Planck and others also realized that the electromagnetic field had particle-like properties. The energy carried by the electromagnetic waves of frequency \(f\) has to be a multiple of\[

E = hf = \hbar\omega,

\] the energy of one photon, using the simple modern terminology. So even though fields were introduced just as convenient messengers and auxiliary entities to exchange the influences between charged and other material objects – those that surely matter in physics – the field allowing these influences may be thought of as a collection of particles. In this sense, physics returned to the idea of a "spirit" or "aether" that must be in between.
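To get a feeling for the size of one quantum, here is a minimal sketch that evaluates \(E=hf\) for visible light. The frequency chosen for "green" is an illustrative assumption; the constants are the exact SI values.

```python
import math

# Exact values in the 2019 SI definitions.
H_PLANCK = 6.62607015e-34      # Planck constant h, in J*s
E_CHARGE = 1.602176634e-19     # joules per electronvolt


def photon_energy_joules(frequency_hz: float) -> float:
    """Energy of one quantum of frequency f, E = h f."""
    return H_PLANCK * frequency_hz


def photon_energy_from_omega(omega: float) -> float:
    """The same energy written as E = hbar * omega with hbar = h / (2 pi)."""
    hbar = H_PLANCK / (2.0 * math.pi)
    return hbar * omega


f_green = 5.5e14  # Hz, roughly green light (illustrative choice)
e_joules = photon_energy_joules(f_green)
e_ev = e_joules / E_CHARGE
print(f"E = {e_joules:.3e} J = {e_ev:.2f} eV")
```

A single photon of visible light carries a couple of electronvolts, which is why the graininess of the electromagnetic field is invisible in everyday situations where astronomically many quanta are involved.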

However, it's important that electromagnetic or gravitational waves may propagate through the empty space. One doesn't have to start with a messy collection of atoms. Even if you imagine that the vacuum – the material that allows or carries electromagnetic waves (or photons) or gravitational waves (or gravitons) – is a material, it's a very clean material. It carries no detailed messy information and its entropy density is zero. In fact, it doesn't even carry any information about a preferred reference frame because the vacuum is a Lorentz-invariant state. If the "messiness" (high entropy) and the "rich options to choose a microstate" or a "preferred rest frame" belong to a necessary defining condition of a material, then the vacuum isn't a material.

The modern picture paints the vacuum as "something in between" the ancient pictures in which the objects may influence others "directly"; and the picture that requires a "material in between" for the interactions to occur. Moreover, quantum field theory shows that every force field has a particle (quantum of the field such as the photon) and every particle is associated with a field that influences the interactions between other particles (e.g. because of virtual electrons, quanta of the Dirac field).

Lagrangian of quantum field theories

Quantum field theory has unified bosons and fermions; in some rough sense, it has unified particles and forces. Fields transmit forces but they always have their quanta, particles. And vice versa: particles always arise from some fields. In this picture, the free particles are fully described by the quadratic part of the quantum field theory Lagrangians such as \[

\begin{aligned}
\LL_{\rm free} &= -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} +\\
&+ \frac{1}{2}\partial_\mu h \partial^\mu h -\frac{m^2}{2}h^2 +\\
&+ \bar\Psi (i\partial^\mu \gamma_\mu - m)\Psi + \dots
\end{aligned}

\] and so on. This Lagrangian is bilinear in the fields such as the electromagnetic potential \(A_\mu\), the God field \(h\), or the Dirac field \(\Psi\). Consequently, the field equations are linear in these fields; their solutions obey the superposition principle and allow the particles to propagate along straight paths.
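To make the linearity explicit, here is a sketch of the Euler-Lagrange equations that follow from the three displayed terms. It assumes the standard definition \(F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu\) and the mostly-minus metric signature suggested by the signs of the kinetic terms, neither of which is spelled out above:\[

\begin{aligned}
\partial_\mu F^{\mu\nu} &= 0,\\
(\partial_\mu\partial^\mu + m^2)\,h &= 0,\\
(i\gamma^\mu\partial_\mu - m)\,\Psi &= 0.
\end{aligned}

\] Each equation is linear in its field, so any sum of solutions is again a solution; plane waves \(e^{-ip\cdot x}\) with \(p^2=m^2\) (and \(p^2=0\) for the photon) are the simplest examples.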

It's not mathematically "inevitable" to separate the bilinear terms in the Lagrangian from others. The third or fourth power isn't "qualitatively different" from the second power. Nevertheless, it's still very useful to separate the terms into the bilinear ones and others whenever it's possible because the bilinear terms describe the "free particles and fields" that just propagate through empty space, paying no attention to others.

We have to add the interacting terms as well.\[

\LL_{\rm total} = \LL_{\rm free} + \LL_{\rm interactions}

\] The interaction terms contain all the products of fields, their powers, and their derivatives that are not bilinear in the fields. When you treat the quantum field theory by the method of Feynman diagrams – which Feynman originally derived from his path-integral approach to quantum field theory although they may be derived from the operator approach as well – the free Lagrangian produces the propagators, the links in the Feynman diagrams, while the interaction Lagrangian allows the Feynman diagrams to include the vertices in which at least three legs meet.

(When doing renormalization that is needed to get rid of infinities from the loop diagrams, one also needs counterterms. Counterterms in the Lagrangian may be quadratic or linear or absolute but they're still treated as interaction terms.)

In a recent article, I described all particles and fields of the Standard Model and the Minimal Supersymmetric Standard Model, two of the minimal models that are compatible with certain symmetries and a certain field content whose existence has already been established. But I've never discussed interactions.

First, let me mention that the quadratic part of the Lagrangian of theories similar to the SM or MSSM may be brought to the canonical form in which the Lagrangian for every \(j\in\{0,1/2,1\}\) field is pretty much the canonical Klein-Gordon, Dirac/Weyl, or Maxwell/Proca Lagrangian. We often want to know how some important symmetries (e.g. gauge symmetries) act so we need to know the relationship between the fields for which we can write down the simple canonical free Lagrangian on one hand; and the fields that are the "partners related by symmetries" to some other simple canonical particles and fields. To preserve this information, people need to talk about the "mixing matrices".

In the neutrino sector, it's the PMNS matrix; in the quark sector, it's the CKM matrix. Weinberg has figured out how to mix the neutral spin-one gauge fields into the photon field and the Z-boson field; the MSSM requires us to discuss lots of mixing and redefine the superpartner fields as neutralinos, charginos, and others. So the quadratic part of the Lagrangian isn't quite trivial conceptually; we need to know not only the masses of the particles but also the matrices defining various "mixings" whenever particles and fields may mix with each other.

Renormalizable interactions in \(d=4\) QFT

But the number of possibilities typically increases when we discuss true interactions, the terms in the Lagrangian which are not quadratic (bilinear). For many variables, there are many more monomials of higher degrees than the number of bilinear monomials.

The "renormalizable" Lagrangians – those that lead to calculations in which the infinities may be cancelled by specifying at most a finite number of parameters – have to be power expansions and the powers can't grow too high. How does it work?

Imagine that you really consider the free Lagrangian to be the "defining part" and the interactions are a "cherry on the pie". It's not inevitable but it's allowed to treat the theory in this way and this treatment is very useful. It means that we may use the free Lagrangian to determine the units of all the fields.

The action is a dimensionless quantity e.g. because it appears in Feynman's path integral,\[

{\mathcal A}_{i\to f} = \int {\mathcal D}\phi~\exp[iS(\phi)].

\] I have written \(iS\) and not \(iS/\hbar\) in the exponent because I assume that the dear reader is already an adult physicist who uses units with \(\hbar=1\). However, the action is written as an integral of the Lagrangian density,\[

S = \int d^4x~\LL

\] for our usual \(d=4\) spacetime, so you see, if we also set \(c=1\), that the Lagrangian density \(\LL\) has units of \({\rm length}^{-4}={\rm mass}^4\). When we know the units of the Lagrangian density, we may deduce the dimensions of the fields from their kinetic terms. Bosonic fields have kinetic terms of the schematic form\[

\LL \sim (\partial h)^2 + \dots + (\partial A)^2

\] so you see that both \(h\) and \(A_\mu\) are naturally assigned the units of \({\rm mass}\). You combined two factors of masses from the fields and two factors from the two derivatives – one needs two derivatives to be contracted in order to get a Lorentz-invariant quantity – to get the \({\rm mass}^4\) dimension of the Lagrangian density.

For fermions, the kinetic terms look like\[

\LL \sim \Psi \partial \Psi

\] which only contains one derivative. For spinors, it's possible (and easier) to get the Lorentz-invariant scalars with one derivative only. This makes some difference; \(\Psi^2\) must have the units of \({\rm mass}^3\) which means that the Dirac field itself – and similarly for other fermionic fields – has the dimension of \({\rm mass}^{3/2}\).
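The bookkeeping can be summarized in one line per field, writing \([X]\) for the mass dimension of \(X\) (a notation introduced here only for convenience) and using \([\partial]=1\), \([\LL]=4\):\[

\begin{aligned}
2[h] + 2 &= 4 \quad\Rightarrow\quad [h]=[A_\mu]=1,\\
2[\Psi] + 1 &= 4 \quad\Rightarrow\quad [\Psi]=\tfrac32.
\end{aligned}

\] The same counting in a general spacetime dimension \(d\) would give \([h]=(d-2)/2\) and \([\Psi]=(d-1)/2\), which is why the list of renormalizable terms depends on \(d\).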

That's great. Now assume that the Lagrangian has some term schematically of the form\[

\LL = C \partial^K h^L A^M \Psi^N

\] with some (probably integer-valued) exponents \(K,L,M,N\). The coefficient \(C\) has to have the right units that cancel the units of all the other factors – bosonic and fermionic fields and derivatives – for the product to be \({\rm mass}^4\). In other words, the units of \(C\) are \({\rm mass}^{4-K-L-M-3N/2}\). An essential point I want to say is that this is not allowed to be a negative dimension of mass i.e.\[

4-K-L-M-\frac{3N}2 \geq 0.

\] Why? If \(C\) had the units of a negative power of mass, the 6,000-loop diagrams would be proportional to something like \(C^{6000}\). This would have the dimension of a huge negative power of the mass and would have to be multiplied by some huge positive powers of mass, such as \(E_{\rm loop}^{9000}\) where the base is the energy of the particles that run in the loop. In effect, the loop diagrams would inevitably be integrals containing huge positive powers of the energies running in the loop. Such integrals would diverge in the "ultraviolet" i.e. the region near very high values of the energy. They would diverge more brutally whenever you would add loops. You would produce new kinds of divergences and you would have to define a new value for each of these infinitely many types of divergences.

That's why the dimensions of the coefficients aren't allowed to be negative powers of the mass if you want the theory to be renormalizable – i.e. if you want its predictions to be completely universal and well-defined once you decide about the value of a finite number of parameters (related to tree-level and/or a few-loops quantities).
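The power counting is easy to automate. The following is a minimal sketch; the function names and the list of sample terms are illustrative choices, and only the dimensions of the fields derived above are used. Symmetries are not checked.

```python
from fractions import Fraction


def coefficient_mass_dimension(K: int, L: int, M: int, N: int) -> Fraction:
    """Mass dimension of C in the term C * d^K h^L A^M Psi^N for d=4.

    Uses [derivative] = [h] = [A] = 1 and [Psi] = 3/2, as derived in the text.
    """
    return 4 - K - L - M - Fraction(3 * N, 2)


def classify(K: int, L: int, M: int, N: int) -> str:
    """Power-counting label only; Lorentz and gauge symmetry are not checked."""
    dim = coefficient_mass_dimension(K, L, M, N)
    if dim > 0:
        return "super-renormalizable (relevant)"
    if dim == 0:
        return "renormalizable (marginal)"
    return "non-renormalizable (irrelevant)"


# Sample monomials, given as exponents (K, L, M, N).
examples = {
    "h^3": (0, 3, 0, 0),
    "h^4": (0, 4, 0, 0),
    "Psi h Psi (Yukawa)": (0, 1, 0, 2),
    "Psi^4 (four-fermion)": (0, 0, 0, 4),
    "h^6": (0, 6, 0, 0),
}
for name, exponents in examples.items():
    dim = coefficient_mass_dimension(*exponents)
    print(f"{name:22s} [C] = mass^{dim}  ->  {classify(*exponents)}")
```

The four-fermion entry is the familiar example of a non-renormalizable term: its coefficient has the units of \({\rm mass}^{-2}\), just like the Fermi constant of the old contact theory of weak interactions.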

What are the allowed values of \(K,L,M,N\) for which we satisfy the inequality? Note that \(N\) is always even because the Lagrangian density is bosonic and the number of fermionic factors therefore can't be odd. That's why \(3N/2\), the term in the inequality, is actually integer-valued.

We only want to discuss possible values of \(K,L,M,N\) which yield interacting i.e. at least cubic terms, i.e. \(L+M+N\geq 3\). It makes it clear that the difference \(4-K-L-M-3N/2\) is at most equal to one. And because it has to be a non-negative integer, it can only be one or zero.

When it's one, the coefficient \(C\) has the units of mass which combine with the cubed mass to get \({\rm mass}^4\) for the Lagrangian density. You get terms like \(h^3\), cubic interactions of scalars, or \(A^3\), cubic interactions of gauge fields (prohibited by gauge symmetry), or \(h \partial A\) which is bilinear but not included among the usual free terms.

All the terms that contain the gauge fields \(A\) are heavily constrained by the gauge symmetry which is needed to get rid of the bad ghosts (negative-norm time-like polarizations of the excitations which would produce negative probabilities for some processes). So they're typically fully determined by the charges of the other fields that appear in the product with \(A\). In Yang-Mills theory, the gauge fields themselves are charged so these cubic terms have to occur with coefficients fully dictated by the structure constants of the Lie algebra.

Finally, you may have \(4-K-L-M-3N/2=0\). In this case, the coefficient of the term is (classically) dimensionless; we talk about "marginal" interactions or operators. These are the "most important" types of interactions, in a sense. You get the possibility \(L=4\) and the rest is zero, i.e. the quartic terms \(h^4\) for the scalar fields. Then you get \(h^2 A^2\) and \(A^4\) and \(\partial A^3\) and \(\Psi A \Psi\) which are gauge couplings pretty much uniquely determined by the charges (or representations) carried by the fields in your theory. Finally, there is the Yukawa coupling \( \Psi h \Psi\) with two fermions and one scalar which isn't too constrained. There are many fields in the Standard Model (especially the Dirac-like fermions) and they carry various internal indices (flavor, color) so in this way, you may get lots of independent Yukawa coupling constants.
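The whole list of candidates can be generated by brute force. This is a sketch of pure power counting under the assumptions stated in the text (\(N\) even, \(L+M+N\geq 3\), non-negative dimension of the coefficient); the function names are hypothetical and neither Lorentz nor gauge invariance is imposed.

```python
from fractions import Fraction
from itertools import product


def coefficient_mass_dimension(K, L, M, N):
    """[C] for the schematic term C * d^K h^L A^M Psi^N in d=4."""
    return 4 - K - L - M - Fraction(3 * N, 2)


def renormalizable_interactions(max_power=4):
    """All (K, L, M, N) with N even, L+M+N >= 3 and [C] >= 0.

    This is pure power counting: Lorentz and gauge invariance, which
    remove or fix many of these candidates, are not imposed.
    Exponents above 4 always give a negative [C], so max_power=4 suffices.
    """
    found = []
    for K, L, M, N in product(range(max_power + 1), repeat=4):
        if N % 2:            # the Lagrangian density must be bosonic
            continue
        if L + M + N < 3:    # keep only genuine (at least cubic) interactions
            continue
        dim = coefficient_mass_dimension(K, L, M, N)
        if dim >= 0:
            found.append(((K, L, M, N), int(dim)))
    return found


terms = renormalizable_interactions()
for (K, L, M, N), dim in sorted(terms, key=lambda t: -t[1]):
    print(f"d^{K} h^{L} A^{M} Psi^{N}   [C] = mass^{dim}")
```

The output contains every term named above, e.g. `(0, 4, 0, 0)` for \(h^4\), `(1, 0, 3, 0)` for \(\partial A^3\) and `(0, 0, 1, 2)` for \(\Psi A\Psi\), together with candidates such as `(1, 3, 0, 0)` that power counting allows but Lorentz invariance forbids because one derivative index is left uncontracted.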

A similar analysis applies when you add many more fields. The number of independent renormalizable interactions is finite and it's not hard to list all of them. Each of the terms that preserves the required symmetries – especially the gauge symmetries and the Lorentz symmetry – can have an a priori arbitrary coefficient. When the dimensionless coefficients get close to one or higher, the multiloop diagrams become very important and the free or tree-level approximation isn't a good enough approximation of the exact physics. The theory becomes "strongly coupled" and may even become inconsistent for processes at slightly higher energies because the coupling constants "run" (have a secret, slow, logarithmic dependence on the renormalization scale that follows from quantum mechanics) and when they're large enough, they may quickly run to infinity and cause a new kind of divergence (the Landau poles).

Renormalizability from a modern perspective

In the text above, I motivated the truncation of the interactions by our desire to avoid high positive powers of energy in loop diagrams that imply bad divergences. We only need terms whose coefficients are non-negative powers of the mass if we want to obtain a predictive theory in which the multiloop diagrams just "quantitatively" affect the results but don't require us to redefine the theory from scratch every time we increase the accuracy by another loop.

That was the justification of the renormalizability in the 1940s, 1950s, and 1960s. But since the 1970s, thanks to Ken Wilson and others who have discovered the Renormalization Group, we may explain what's special about the renormalizable interactions in a better way.

In fact, we may write down non-renormalizable Lagrangians as well. However, if you study whether some processes are calculable, you will find out that only processes with external particles' energies smaller than a bound, a "cutoff", are well-behaved according to the theory. It's clear that some new physics has to be specified near or above the cutoff.

I don't want to explain this in detail here but the conclusion is that theories with renormalizable interactions allow the cutoff to be arbitrarily high. (Theories with a Landau pole like QED only allow the cutoff to be exponentially high, e.g. \(\exp(C/e^2)m_e\), but that's enough for the perturbative series to be totally well-behaved to all orders.) For any cutoff \(\Lambda\) – energy scale where we must assume that some unknown new physics kicks in – we produce corrections to the low energy processes that go like \(1/\Lambda^n\) where \(n\) is positive. Because the renormalizable theories allow us to shift this cutoff scale – with new physics and dragons – very far (to very high energies, \(\Lambda\to\infty\)), these corrections may become arbitrarily small. The higher predictivity of the renormalizable theories boils down to our ability to separate the energy scale we're interested in (some low enough energies that may be accessed by colliders) from the high energy scales where new unknown physics almost certainly kicks in (the GUT scale, the Planck scale, or some lower but still high scale).
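To see how "exponentially high" the QED cutoff is, here is a rough estimate. It is a sketch that assumes one-loop running with the electron as the only charged particle, in which case \(C=6\pi^2\) for \(\alpha=e^2/4\pi\); including the other charged particles would lower the number.

```python
import math

ALPHA = 1 / 137.035999   # fine-structure constant at low energies
M_E_GEV = 0.000511       # electron mass in GeV


def log10_landau_pole_gev(alpha: float = ALPHA, m_gev: float = M_E_GEV) -> float:
    """log10 of the one-loop Landau pole scale (in GeV), electron loop only.

    One-loop running: 1/alpha(mu) = 1/alpha - (2 / (3 pi)) * ln(mu / m),
    which reaches zero at mu = m * exp(3 pi / (2 alpha)).
    The base-10 logarithm is returned to keep the number readable.
    """
    exponent = 3.0 * math.pi / (2.0 * alpha)
    return math.log10(m_gev) + exponent / math.log(10.0)


print(f"Landau pole  ~ 10^{log10_landau_pole_gev():.0f} GeV")
print(f"Planck scale ~ 10^{math.log10(1.22e19):.0f} GeV")
```

The pole sits hundreds of orders of magnitude above the Planck scale, so gravity and other new physics take over long before this problem of QED could matter.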

Renormalizability allows us to ignore the dragons and unknown new physical phenomena at the high, inaccessible energy scales. If you may ignore something, it doesn't mean that there's no new physics over there! There surely is something at the high energies and if we can't experiment with it or constrain it by solid enough general principles, we can't know what it is.

The decoupling of the scales is both good news and bad news. It's good news because we're not affected by the dragons so our routine breakfast predictions for the experiments may be done despite our ignorance about the distant, inhuman, evil, and irrelevant dragons. It's bad news because we're not affected by the dragons so during our routine breakfast experimental tests and experiments, we're not learning anything about those fascinating distant dragons we would love to meet and befriend. ;-)

It's up to you which of the interpretations is more important for you and whether you are interested in the dragons. They surely exist and they're surely inconsequential for most of your everyday life.

String theory: unique predictions at all energies

So the renormalizability seems to be a matter of "convenience". If we believe that Nature is described by theories that may be uniquely extrapolated to substantially higher energies than those where we measure them, then we may focus on renormalizable theories i.e. theories that allow the cutoff to be this high. But there's nothing "truly fundamental" about the renormalizability of the theories. If we were totally unbiased, we should also admit that our theories do need to be respecified immediately behind the corner and they may be supplemented by new stuff at slightly higher energies than we can access. And there's no reason for renormalizability in these circumstances.

Well, I personally believe that the number of new dragons between the electroweak scale that has been tested and the Planck scale is very limited and finite. There is some GUT scale or similar scale, supersymmetry breaking scale, the Kaluza-Klein scale for extra dimensions, the moduli scale, and perhaps a few others (which may coincide with other entries in the list) but most of the 15+ orders of magnitude in energy difference is a "big desert". One can't prove this assumption. It's an extrapolation of the history in which many theories often applied up to much higher energies than the typical physicists assumed; and it's some wishful thinking about "minimality" because too many new dragons seem unnecessary and we would like to cut their throat by Occam's razor.

Still, at the end, you want a theory that is valid at all energy scales. Renormalizable theories such as QCD may be extrapolated to arbitrarily high energy scales but they can't include gravity. Gravity inevitably looks non-renormalizable at low energies. String theory is the only known theory – and quite likely, the only mathematically possible theory – that defines physics including gravity within a consistent quantum framework that doesn't break for any values of energies.

In string theory, the fields and particles are associated with some particular energy (vibration) eigenstates of a string (or some "braney" or "bootstrapy" nonperturbative generalization of a string, although this generalization seems much less explicit but not completely unknown at this moment). And the polynomial interactions among the fields – which are approximations of the true stringy reality – arise from "pants diagrams" i.e. world sheets in which many tubes or rubber bands are merging or splitting i.e. rubber bands of more complicated topologies.

A funny feature of interactions in string theory is that the character of the interactions among the strings is pretty much uniquely determined once you decide what your free theory is! And the resulting theory works up to all energies.

Why is it so? Free particles in quantum field theories were described by a quadratic Lagrangian and yielded propagators in Feynman diagrams. There were no vertices; to get interactions, one has to manually add cubic and higher-order terms into the Lagrangian. And there were many options. In fact, we only get theories with a finite number of parameters (renormalizable theories) because we want to assume that there's no new physics up to some very high energy scale but this assumption seems to boil down to wishful thinking or convenience, not a fundamental principle.

In string theory, it's different. The propagation of free particles – free strings in this case – is described by cylindrical and similar simple world sheets. A funny thing is that one may produce pants or sweaters etc. out of the same "clothes" just by allowing the topology of the clothes (world sheet) to be arbitrary. A local neighborhood of a point on the world sheet is always the same, satisfying the same local dynamics. So the very freedom we give the world sheets (clothes) that they may change the topology means that there will be processes in which the number of strings increases or decreases and all these processes will have calculable probabilities. We don't need to specify any new information about the "allowed interactions of strings". We just need to admit that of course, the strings are allowed to end up with any topology that their dynamics allows.

(The only uncertain parameter is a multiplicative constant, the string coupling constant, we must add for every extra "handle" i.e. complication of the world sheet topology. But one may prove that its value is related to a dynamical field arising from strings in a particular spin-0 vibrational pattern, the dilaton, and realistic stringy vacua have a nontrivial potential for the dilaton that picks a preferred dilaton value that minimizes the potential.)
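In formulas, and as a sketch that assumes oriented closed strings and uses symbols not introduced above (\(g_s\) for the string coupling, \(g\) for the number of handles, \(\Phi\) for the dilaton), the perturbative amplitude is a sum over world sheet topologies,\[

{\mathcal A} = \sum_{g=0}^\infty g_s^{2g-2}\,{\mathcal A}_g, \qquad g_s = e^{\langle\Phi\rangle},

\] so every extra handle costs a factor of \(g_s^2\) and the "constant" \(g_s\) is fixed by the expectation value of the dilaton.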

This is one of the ways to explain the remarkable uniqueness of string theory and the absence of continuously adjustable dimensionless parameters in string theory. It's also true that string theory has many solutions, including many inequivalent solutions that look like an empty Minkowski or AdS or dS space but that differ by different shapes of the extra dimensions, generalized electromagnetic fluxes through them, including some discrete ones, and the numbers of various branes wrapped on various "cycles" in these hidden dimensions.

But all these possibilities are solutions to the same laws of physics. If a theory or a set of equations has many solutions, it doesn't mean that there's something ill-defined or unscientific about it. Molecular physics allows billions of different molecules which doesn't make molecular physics or quantum physics unpredictive. Quite on the contrary, the large number of solutions is a prediction, a fact we are learning from the theory regardless of our ability to create all these molecules. This multiplicity isn't an assumption or complication; it's an outcome. So we can't rightfully say that the theory predicting this outcome is "contrived" because we're not making any artificial assumptions. Some of the solutions may still be more relevant for our world and its cosmological history; but the others do exist or "are allowed to exist" as well.

One may have prejudices but a scientist should avoid prejudices that he's not willing to abandon even if a sufficient amount of evidence accumulates. And physics has accumulated a huge amount of evidence that the right and accurate theory underlying particle physics and quantum gravity actually admits many solutions. A troll may complain against this fact, spread lies, and transform other people into similarly worthless and brainwashed trolls – but that's the only thing that such a troll may do against the facts of Nature.

Meanwhile, string/M-theory tells us the deepest available – and probably deepest mathematically possible – story about the origin of particles with particular spectra and their interactions. Nevertheless, quantum field theory seems to be a universally useful approximation of string theory for all long-distance experiments we can actually perform as of today.

And that's the memo.
