The story will probably require the reader to understand physics at an advanced undergraduate level but anyone is welcome to try to penetrate it, anyway. On the other hand, I don't expect professional physicists to learn anything materially new here but it is plausible that the organization of the ideas differs from the way they normally think about these matters.
Pre-scientific image of the world
Before the era of Galileo Galilei, philosophers would talk about infinite things quite often. The world seems to be larger than our houses. In fact, it is much larger. It makes sense to approximate the world by something that is larger than anything you can visually imagine in a compact region of your brain: the world is (almost) infinite. The same concept has led to the Flat Earth paradigm and many other notions.
If we imagine that the Universe is infinite, we avoid a lot of inconvenient, a priori meaningful questions such as "what is there at the end of the world". For this reason, many people couldn't even imagine that the Universe could have been finite (spatially or temporally). Their dogma about an infinite world was wrong but let's admit that it could have been difficult to realize this fact without Riemannian geometry and related insights.
Also, people have always been thinking about the infinitely small things. But they didn't know how physics works at distances shorter than what their eyes (or magnifying glasses) could see. So they couldn't say too many interesting things about short distance physics and they mostly realized this ignorance (although some philosophers tried to discover what atoms looked like by pure thought: without QFT and string theory, their dream was doomed from the beginning). In their picture, the world at even shorter distances was uninteresting: nothing new was happening there.
The practically oriented philosophers realized that all physically meaningful questions talk about finite numbers and additional finite numbers can be calculated by various mathematical rules (such as the Pythagorean theorem). The infinite ones are unnecessary. Nevertheless, people found it useful to construct mathematical concepts for many kinds of infinite and infinitesimal numbers. These "infinite" notions were often promoted by philosophers (who were mathematicians or amateur mathematicians at the same time).
The infinitesimal numbers became very useful in calculus and they were mathematically formalized in several ways. The infinite numbers found their place in mathematics, too. They were later formalized as cardinals, ordinals, the signless infinity in the complex plane, and others. The different types of infinities in mathematics would deserve a very long text of their own: different types of mathematical questions lead you to think about the "infinity" in very different ways.
But in physics, we're normally interested in the "continuous" types of infinities because the fundamental building blocks of physics seem to be continuous (the most useful gadgets to measure things generate continuous numbers - and probabilities are quantities that will probably remain continuous forever). Here, the infinite values are not terribly interesting because once you say that something is infinite, all other numbers that depend on this something turn out to be infinite (or otherwise singular), too. It's much more interesting to check the number 137.036 predicted by your theory than the dull and universal "infinity".
With infinities, all the nontrivial, arbitrarily accurate information is lost. Unlike philosophers, physicists prefer finite numbers because they correspond to things that can actually be measured. Thinkers mostly realized this fact before the first dynamical laws were written down. But because the discussion of physics without any dynamical equations sounds a bit vacuous to me, let's quickly switch to the following epoch.
In classical mechanics, objects were described as point masses. They had positions and velocities. All these numbers had to be finite. The equations of motion - the first ordinary differential equations that were written by a human - evolved them into other finite numbers after any finite time. That was easy. There were no infinities and divergences in dynamics of point masses. That's why we must quickly look at a more interesting era from this viewpoint.
Classical field theory
Many point masses may combine into something that is almost indistinguishable from a continuum. The continuum behaves according to new, "effective" equations that can be derived from the laws for many point masses, assuming that the latter were complete and correct and that you know how to deal with statistics, but you don't need to know about the "atomic" origin of the laws that govern the continuum. At any rate, the ordinary differential equations from mechanics, pioneered by Newton, are promoted to partial differential equations that govern the continuum.
In the case of gases, fluids, and solids, the laws governing the continuum can be derived as long as you believe that the materials are made out of atoms. But electromagnetic and related fields also obey partial differential equations that are mathematically analogous to the equations of hydrodynamics. However, as was fully appreciated only with the birth of special relativity in 1905, the partial differential equations governing the electromagnetic field should actually be thought of as fundamental equations unrelated to any point masses. The luminiferous aether was not only unnecessary: it was actually wrong.
The partial differential equations that governed the electromagnetic field - and similar equations people could have imagined - typically evolved finite numbers (in the initial conditions) into finite numbers again. Some subtle phenomena related to turbulence may count as exceptions but these exceptions belong to the 20th century, anyway: we will discuss another major 20th century exception later. So whenever people chose "well-behaved" partial differential equations before 1900, they found no infinities. Again, nothing to talk about. We must combine the concepts:
Point masses in fields
Finally, we find some infinities here. If you describe reality as a combination of classical fields and classical point masses - a hybrid framework that may look contrived, that is only good as a phenomenological description, and that we will later replace by the more "unified" picture of quantum field theory - it is possible to find objects that are infinite. For example, if you calculate the self-energy of a classical point-like electron, you obtain an infinite amount of energy. The closer you squeeze the different parts of the elementary electric charge, the higher the interaction energy you obtain. For a mass point, it is infinite.
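To see the divergence quantitatively, here is a minimal numerical sketch (the cutoff radii and the integration parameters are purely illustrative): the energy stored in the Coulomb field outside a radius "a" behaves like 1/a, so the point-particle limit a → 0 gives an infinite self-energy.

```python
import math

EPS0 = 8.854e-12   # vacuum permittivity (F/m)
Q = 1.602e-19      # elementary charge (C)

def self_energy(a, r_max=1e3, steps=20000):
    """Energy stored in the Coulomb field between radius a and r_max,
    integrating the density (eps0/2) E(r)^2 over spherical shells."""
    total = 0.0
    # log-spaced grid because the integrand spans many decades
    lo, hi = math.log(a), math.log(r_max)
    dlog = (hi - lo) / steps
    for i in range(steps):
        r = math.exp(lo + (i + 0.5) * dlog)
        E = Q / (4 * math.pi * EPS0 * r * r)          # Coulomb field
        u = 0.5 * EPS0 * E * E                        # energy density
        total += u * 4 * math.pi * r * r * (r * dlog) # shell volume, dr = r*dlog
    return total

# Halving the cutoff radius doubles the energy: the a -> 0 limit diverges.
for a in (1e-10, 1e-12, 1e-14):
    print(a, self_energy(a))
```

The integral has the closed form q^2/(8 pi eps0 a); the numerical sum just confirms that the self-energy scales like 1/a as the charge is squeezed into a smaller ball.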
That's terrible because the total energy of a system (including one electron) shouldn't be infinite. I guess that most physicists who started with physics as kids struggled with this problem at some point. The energy of an ordinary electron in its own electric field is infinite. It's terrible! The electrons are everywhere so this looming catastrophe surrounds us completely. The most obvious way to get rid of this problem is to try to modify the fundamental equations describing the electromagnetic field and its interactions with the charged particle.
Such a modification should disappear at long distances and it should only cure the short distance problem: it should make the electrons (or the fields around them) smoother. Indeed, it can be done. But if you think about what you have actually achieved, there is a lot of uncertainty about how exactly you should modify the physics at short distances. None of the "improvements" is really canonical, i.e. exceptional, qualitatively better and more sensible than the competitors. Did you achieve anything by these modifications? We will see that the newer answer to this question, based on quantum field theory, is essentially No. You shouldn't try to play these games: they're analogous to the attempts to find the best building blocks of the aether.
But there exists a similar short-distance divergence and singularity where the answer is Yes. If you study how an accelerating charged object interacts with the electromagnetic field, you find out that it emits electromagnetic waves and loses its energy. An electron should eventually fall to the nucleus, after a fraction of a nanosecond. It is also a terrible problem. Again, it's natural to try to change the laws of physics. In this case, the attempt is going to be successful: if you think properly, you will discover not only a small modification of the laws governing how atomic particles emit light but you will actually discover quantum mechanics, the most profound conceptual revolution of 20th century science.
There exists another type of divergence - the ultraviolet catastrophe - that will lead you to quantum theory, too (the term "quantum theory" refers to the application of quantum principles to fields rather than point masses). At a fixed temperature T, each quadratic degree of freedom in the Lagrangian contributes kT/2 to the total energy. The electromagnetic field - even in a finite box - has an infinite number of degrees of freedom, namely the Fourier components of the field that behave as a set of infinitely many harmonic oscillators. That's why the electromagnetic field should carry an infinite energy at any nonzero temperature.
Once again, you might try to modify this catastrophically absurd conclusion. If you're lucky, your name will be Max Planck and you will discover the correct quantum black body formula that will lead to the concept of a photon, after many confusing years. ;-)
The divergences and infinities that were obtained by extrapolations of the physical phenomena we could have directly observed at long distances were annoying. It was natural that physicists tried to modify the laws of physics. And indeed, some of these attempts were successful. However, it's interesting to note that the modification of the equations hasn't really changed the functions that appear in the fundamental equations - e.g. the "1/r" electrostatic potential. Instead, it has replaced the whole conceptual heart of physics, replacing the old classical quantities by operators on a Hilbert space (or one history by all histories combined by a path integral).
The equations controlling these operators remained pretty much unchanged. Who could have thought?
The infinite self-energy of an electron was another example of a disturbing divergence of the classical field theory combined with point masses. I told you that the attempts to modify the rules of physics are not too successful in this case. Quantization itself doesn't remove the problem: it gets translated into divergent one-loop Feynman diagrams we will discuss below. These divergences will be removed by renormalization - which is a process nearly independent of the detailed modification of the short-distance rules. You can imagine that some new short-distance structure calms down the infinities but all the details of such a structure will be inconsequential for all the predictions that you can actually extract from the theory.
When we discuss string theory at the very end, the details of the short-distance modifications of physics will matter again because the theory will predict the exact physical results rather than any kind of long-distance approximation. Before we get there, we must discuss singularities in general relativity and the new insights of quantum field theory.
Singularities in general relativity
When I said that well-behaved forms of equations of classical field theory normally preserve the finite character of the numbers, I neglected one important counter-example: the general theory of relativity, Einstein's theory of gravity.
In this theory, you can start with perfectly smooth and finite initial conditions, e.g. ones describing a star, and the evolution will inevitably end with a geometry (a field configuration) where certain quantities diverge at certain points in space and time: they diverge at singularities. Penrose and Hawking wrote down famous (singularity) theorems that imply that such a dramatic outcome is often inevitable.
This new aspect of gravity is related to the tendency of gravity to "clump" things. The natural final state approached in the far future is not the uniform gas/liquid/solid that you would expect in non-gravitational physics or thermodynamics but rather one black hole that can eat everything. In non-gravitational systems, uniform configurations tend to maximize the entropy. In systems that gravitate, the entropy is actually maximized by non-uniform configurations. That's why galaxies could have been born. The maximally non-uniform "bound" system with the highest entropy is a black hole whenever the gravitational force is a part of your physical canon.
In the text above, I have described several annoying short-distance singularities in classical physics:
- infinite self-energy of the electron
- atoms collapsing to a point
- ultraviolet catastrophe (thermal radiation)
- gravitational singularities
Quantum field theory
As we mentioned, the quantization itself works and quickly solves some of the problems. If we quantize point-like particles, the atoms immediately become stable. The "orbits" of the electron can't be arbitrary. The electron can't fall too close to the nucleus, essentially because of the uncertainty principle. The energy is bounded from below. There is a ground state. The atom in the ground state no longer radiates.
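The stabilization can be caricatured by the standard uncertainty-principle estimate (a back-of-the-envelope sketch, not a solution of the Schrödinger equation): localizing the electron within a radius r costs a kinetic energy of order hbar^2/(2 m r^2), which eventually beats the Coulomb attraction -e^2/(4 pi eps0 r), so the total energy is bounded from below.

```python
import math

HBAR = 1.0546e-34   # reduced Planck constant (J s)
ME = 9.109e-31      # electron mass (kg)
E2 = (1.602e-19)**2 / (4 * math.pi * 8.854e-12)  # e^2/(4 pi eps0), in J m

def energy(r):
    """Uncertainty-principle estimate of an electron localized within r:
    kinetic cost ~ hbar^2/(2 m r^2) plus the Coulomb energy -e^2/(4 pi eps0 r)."""
    return HBAR**2 / (2 * ME * r**2) - E2 / r

# Setting the derivative to zero gives r0 = hbar^2/(m e^2/(4 pi eps0)):
# this is nothing else than the Bohr radius.
r0 = HBAR**2 / (ME * E2)
print(r0)                        # roughly 5.3e-11 m
print(energy(r0) / 1.602e-19)    # roughly -13.6 eV: bounded from below
```

Squeezing the electron much closer than r0 makes the energy grow, not drop: that's why the atom cannot collapse and why a ground state exists.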
This quantum mechanical description of point masses can be seen to be a limit of quantum field theory (with Dirac fields added) so this success is reproduced by quantum field theory, too.
Quantum field theory is obtained by the application of the same rules of quantization - that worked for point masses - to the case of fields such as the electromagnetic fields. The fields can be bosonic and fermionic. And the massive fermionic fields (such as the Dirac field for the electron) will have a very convenient, realistic description in terms of individual point-like particles. Quantum field theory naturally unifies waves and particles. Each particle is a manifestation of a field and each field allows one to create and destroy a particle of some type. The incoherent mixture of point masses and fields that we used in one epoch of classical physics is replaced by a unified, coherent picture that may be surprising and hard to understand for beginners but once you understand it, it makes sense.
Quantum field theory also solves the ultraviolet catastrophe. Only the oscillators (Fourier modes) of the electromagnetic field with frequencies for which hf doesn't exceed kT by too much contribute (almost) kT to the total energy. The high-energy ones contribute much less and the total energy stored in the field converges. Problem solved. And we only needed to "add hats" i.e. to quantize the same equations.
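Here is a toy numerical comparison of the two answers (a sketch with a crude f^2 counting of modes in a box; overall prefactors are ignored): with equipartition, the total energy grows like f_max^3 as you include higher frequencies, while with Planck's mean energy per mode the sum saturates.

```python
import math

K_B = 1.381e-23   # Boltzmann constant (J/K)
H = 6.626e-34     # Planck constant (J s)
T = 300.0         # temperature (K)

def mode_energy_classical(f):
    return K_B * T                  # equipartition: kT per oscillator

def mode_energy_planck(f):
    x = H * f / (K_B * T)
    return H * f / math.expm1(x)    # Planck's mean energy per oscillator

def total_energy(mode_energy, f_max, n=50000):
    """Sum mode_energy over the field's modes up to frequency f_max;
    the number of modes per unit frequency grows like f^2."""
    df = f_max / n
    total = 0.0
    for i in range(n):
        f = (i + 0.5) * df
        total += f * f * mode_energy(f) * df   # ~ f^2 density of modes
    return total

# Equipartition: total grows like f_max^3 - the ultraviolet catastrophe.
# Planck: modes with h*f >> k*T are exponentially suppressed; the sum converges.
for f_max in (1e13, 1e14, 1e15):
    print(f_max, total_energy(mode_energy_classical, f_max),
          total_energy(mode_energy_planck, f_max))
```

Raising the frequency cutoff by a factor of ten multiplies the classical answer by a thousand while the Planck answer barely moves - the quantized field stores a finite energy at any temperature.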
The remaining two problems, the infinite self-energy and the gravitational singularities, are different. The infinite self-energy of the electron survives in quantum field theory: it is one of the "loop divergences". You might try to modify the rules of quantum field theories but in the end, you will find out that theories such as Quantum Electrodynamics should be used as effective field theories, i.e. theories that are only meant to predict phenomena with a limited accuracy - an accuracy that improves as a power law as you go to longer distances, i.e. lower energies (but you may still demand an arbitrary accuracy as a function of your coupling constants, at least in the form of a Taylor expansion in the fine-structure constant etc.).
Once you know it and you have some idea how the modifications influence the predictions, you will realize that there is no point in studying the modifications in too much detail. You may encode them as irrelevant interactions. The physics you care about is encoded in a finite number of relevant and marginal interactions. If you want to improve the accuracy at shorter distances (more than you normally need), you can include a finite number of additional nonrenormalizable interactions, starting from the low-dimension operators. So you don't really answer the question from your childhood, "how is the electron exactly smoothed to become finite", but the old paradox goes away, anyway.
The divergences themselves can be cured by a regularization - e.g. by a cutoff "Lambda", the maximum allowed energy in the integrals, or by dimensional regularization with its parameter "epsilon".
Still, you see that the answers for the measurable quantities - probabilities of collisions etc. - would be infinite if you substituted the physically required limiting values for Lambda or epsilon (infinity and zero, respectively). Infinite answers are bad so you should not hurry with announcing the answers. ;-) Instead, you should realize that the fundamental constants that appear in your Lagrangian - masses, coupling constants, and the overall normalizations of various terms - are allowed to be infinite themselves.
It may sound counterintuitive for the fundamental constants in the Lagrangian to be infinite but there's actually nothing wrong about it because these numbers are not directly measurable. So they can be whatever they need to be in order for others to be happy. For renormalizable theories, it can be seen that it is always possible to choose the fundamental constants in the Lagrangian to be appropriate infinite numbers - such as "1/137+ln(Lambda)" or "1/137+1/epsilon" - so that the numbers that can be measured, such as the probabilities of collisions, are actually finite: the terms like ln(Lambda) or 1/epsilon cancel between the "infinite parts" of the fundamental constants and the "infinite parts" of the integrals that represent the loop diagrams - such as the self-energy of the electron.
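The cancellation can be illustrated by a toy model (everything here - the coefficient, the scales, the form of the "loop" - is made up for illustration): a loop that diverges like ln(Lambda) is combined with a cutoff-dependent bare coupling chosen to reproduce a single measured number, and every other prediction then comes out independent of Lambda.

```python
import math

A_COEF = 0.01   # toy "loop" coefficient (plays the role of alpha/pi)
MU = 1.0        # reference energy where the coupling is measured
G_PHYS = 1/137  # the measured, finite value of the coupling at E = MU

def loop(E, cutoff):
    """Toy one-loop integral int_E^cutoff dk/k = ln(cutoff/E):
    it diverges logarithmically as the cutoff is sent to infinity."""
    return A_COEF * math.log(cutoff / E)

def bare_coupling(cutoff):
    """Cutoff-dependent 'infinite' constant in the Lagrangian, chosen so
    that the amplitude at E = MU reproduces the measured G_PHYS."""
    return G_PHYS - loop(MU, cutoff)

def amplitude(E, cutoff):
    """Measurable toy amplitude: bare coupling plus the loop correction."""
    return bare_coupling(cutoff) + loop(E, cutoff)

# The ln(cutoff) pieces cancel: the prediction at any energy E is
# cutoff-independent even though both ingredients blow up separately.
for cutoff in (1e3, 1e6, 1e12):
    print(cutoff, bare_coupling(cutoff), amplitude(0.01, cutoff))
```

Note that the bare coupling runs off to minus infinity as the cutoff is removed, exactly as described above: nobody measures it directly, so nothing forbids that.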
It took a decade, until the late 1940s or so, before people realized that the (seemingly nonsensical because divergent) loop corrections are actually fully physical and contain high-accuracy information about the phenomena in electrodynamics (or another field theory). They learned how to extract this information by the procedure from the previous paragraph: by the methods of renormalization (i.e. allowing the fundamental constants to diverge in such a way that the physical answers are finite and a few measurable quantities are adjusted to the right measured values).
It took physicists 25 more years to understand the philosophical framework explaining why this seemingly bizarre procedure works. The framework is called the Renormalization Group and Ken Wilson pioneered it in the early 1970s. The Renormalization Group explains that the long-distance behavior (up to some chosen accuracy) of many physical systems is "universal" - in analogy with the critical behavior in statistical physics - and only depends on a couple of constants. That's why there must exist a toolkit to make predictions without inserting any new constants (and details about the "regulating physics"). For example, you only add the regulator "Lambda" - the maximum allowed energy - and do the calculations as you would normally do, with the appropriate choice of divergent fundamental parameters in the Lagrangian.
The previous paragraphs were mainly talking about perturbatively renormalizable but incomplete theories such as Quantum Electrodynamics - the most accurately experimentally verified theory of physics (for good reasons, this particular kind of leadership of QED will probably continue for a few centuries or more). From a theoretical viewpoint, QED is neither complete nor hopeless. It is somewhere in between.
However, there exist quantum field theories that are
- better behaved
- worse behaved
The worse behaved theories (than QED) are the non-renormalizable ones. For non-renormalizable theories, you obtain infinitely many "kinds" of different divergent integrals in the Feynman diagrams. All of them affect your predictions, even at the accuracy that you determined at the beginning. The fundamental constants in the Lagrangian must still have "divergent parts" but you need infinitely many divergent parts for these constants, in order to cancel the infinitely many types of divergences from the loop diagrams. And each infinite constant in the Lagrangian also has an unknown finite "remainder".
The parameter space of the types of "critical behavior" is therefore infinite-dimensional. Fermi's four-fermion interaction (responsible for beta decay i.e. weak interactions) and pure general relativity (responsible for gravity) are canonical examples.
Such theories need genuine modifications of the short-distance laws of physics to get rid of the infinities - infinities that are now really serious. In the case of the four-fermion interaction, you need to add W,Z bosons and transform your theory into a gauge theory with a spontaneous symmetry breaking: it's the only solution that makes any sense. In the case of gravity, you need to extend the theory to the full string theory: it's the only solution that makes any sense.
But I have to emphasize that the true problem wasn't the scary "infinite result". Physicists are not hysterically afraid of the figure "8" rotated by 90 degrees. ;-) What they're actually afraid of is a lack of predictive power. It is always possible to "cut" the theory so that the integrals will end up looking finite etc. But there are many ways to do so: in fact, there are infinitely many continuous parameters that parameterize the space of possible ways to get rid of the infinities. All of these parameters matter for the very questions you wanted to ask - for example, what is the cross section of the four-fermion interaction at the 250 GeV energies. These infinitely many unknown numbers are associated with the "counterterms", new terms in the Lagrangian with divergent coefficients that we mentioned above.
You would first need to determine the infinitely many parameters of your theory to be able to predict anything. And you would need a lot of time (probably an infinite time) to extract the infinitely many parameters from your measurements. ;-) Such a theory sucks. A theory shouldn't be required to be completely unique but in principle, after a finite number of steps or measurements, you should be able to supplement your theory with all the necessary information for the theory to actually predict. Theories with infinitely many unknown parameters are too bad. If you ask questions where only a finite number of such parameters are really important (given your pre-determined accuracy), it's kind of OK. But for questions where all of them matter - e.g. the questions about scattering at the energy scale where the divergences become really strong - the theory is clearly unusable.
You need genuinely new phenomena - and particles such as W,Z bosons or excited string modes - to get rid of the infinitely hard ambiguities. And they fix it. The better theories with the W,Z bosons or the strings are manifestly superior in comparison with their approximate four-fermion or general-relativistic counterparts. And they seem to be unique (up to dual descriptions of the same physics). We are used to describing general relativity as one of the most beautiful theories people have ever found, but from this technical viewpoint of ambiguities, it is completely analogous to Fermi's four-fermion theory. And string theory is analogous to gauge theory except that it is much more unique than gauge theories.
All the singularities and divergences discussed so far were related to phenomena at short distances (or high energies). But in quantum field theory, you also encounter other divergences that seem to be connected with very long distances and low energies, for example with the low (vanishing, in fact) mass of the photon.
While the problems at short distances told you that you should try to renormalize your theory or modify its short-distance rules - replace it with a more complete theory - the long-distance (infrared) divergences tell you something completely different. They tell you that you should keep your theory but replace your mouth, brain, or spokeswoman - because you have actually asked a meaningless question (or at least, you were sloppy).
For example, if you ask about the probability that an accelerating charged particle emits exactly one photon with a finite energy and nothing else, the theory will lead you to an answer with infrared divergences. Why? Because the accelerating particle actually always emits many very low-frequency photons (the literal answer to your original question - with one photon only - is actually zero, and the divergences are artifacts of a perturbative expansion that is trying to express this zero).
And you should have asked what's the probability that the accelerating particle does what it does - but you should also allow the particle to emit an arbitrary number of virtually undetectable additional "soft" photons whose energies are below a certain conventional energy "e". Here, "e" must be nonzero for the question to be meaningful and to have a finite answer. But you can choose "e" very small.
If you formulate the question in this way, realizing that the very soft photons are generically produced (instead of pretending that you think that it's impossible), the infrared divergences cancel. The "dogmatic" form of your question where the number of photons in the final state was assumed to be exactly one is just physically incorrect. There are other examples of wrong types of questions that lead you to infrared divergences.
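A toy model of this bookkeeping (the Poisson distribution with a logarithmically growing mean number of soft photons is a standard caricature of the soft-photon physics; the coupling and the scales below are made up): the exclusive question "exactly one photon" gets an answer that goes to zero as the detector threshold is lowered, while the inclusive question keeps a finite, threshold-insensitive answer.

```python
import math

ALPHA = 0.05   # toy coupling strength
E_TOT = 1.0    # energy scale of the process

def mean_soft_photons(e_min):
    """Toy model: the expected number of soft photons resolvable above the
    threshold e_min grows like alpha*ln(E/e_min), diverging as e_min -> 0."""
    return ALPHA * math.log(E_TOT / e_min)

def prob_exactly_n(n, e_min):
    """Poisson probability of emitting exactly n resolvable soft photons
    (computed via logs to avoid overflow for large means)."""
    lam = mean_soft_photons(e_min)
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def prob_inclusive(e_min):
    """Probability of the process with ANY number of soft photons:
    summing over n gives a finite answer (one, in this toy model)."""
    return sum(prob_exactly_n(n, e_min) for n in range(200))

# "Exactly one photon, nothing else" goes to zero as the threshold drops;
# the inclusive question has a finite, well-defined answer.
for e_min in (1e-3, 1e-30, 1e-300):
    print(e_min, prob_exactly_n(1, e_min), prob_inclusive(e_min))
```

The "dogmatic" exclusive probability is killed by the exp(-lambda) factor - the particle simply never emits exactly one photon and nothing else - while the inclusive sum stays put.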
So now we know that infrared divergences don't prove any defect of your theory but a defect of your mouth because you have asked a sloppy question. So only the short-distance divergences may be viewed as problems of your theory.
String theory is based on new, more powerful principles that actually remove all short-distance divergences. For example, in perturbative string theory, which already covers a huge portion of the questions you want string theory to answer, it can be shown that any potential short-distance divergence is actually a manifestation of an infrared divergence. (For example, one-loop closed-string diagrams can be mapped to a torus, and a thick torus can be rotated by 90 degrees and presented as a very thin one.) And there are many cases in which you can prove that such a potential infrared divergence is absent, anyway.
String theory is finite. It is not only free of divergences but it is also free of related ambiguities. As we have emphasized, the latter criterion is actually more important. Quantum Electrodynamics has one adjustable continuous parameter related to the interaction - the fine structure constant. The Standard Model has dozens of them. Non-renormalizable field theories have infinitely many of them. They're "completely" ambiguous.
String theory has none. It is a completely rigid, unique, robust structure - this fact is the true physical face of the finiteness. It has "parameters" in various environments but these "parameters" can be proved to be dynamical degrees of freedom. Their values can change in space. In realistic environments/vacua of string theory, it can be shown that all these dynamical degrees of freedom are massive, so they sit at some preferred value that minimizes the energy.
There is also a large, countable number of minima of the potential energy across the configuration space of string theory: the landscape. But because the number of the vacua is countable, you don't have to measure any continuous number arbitrarily accurately to learn what the theory exactly is. Certainly, you don't have to measure an infinite number of quantities accurately in order to supplement your theory with the required parameters, as in non-renormalizable field theories, to allow it to predict things. You only need to get a piece of qualitative, discrete information: which of the vacua was chosen in this Universe. A finite number of bits of information.
The fact that there exist countably many solutions to some equations shouldn't be surprising. Even a harmonic oscillator has infinitely many eigenstates. The function pi.sin(x)+sin(pi.x) - I really wrote random symbols - has infinitely many minima. It shouldn't be shocking that the correct theory of everything is at least as complicated as a harmonic oscillator or two sine functions.
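If you want to check the claim about that particular random function, a few lines suffice (a brute-force grid scan; the resolution and window sizes are arbitrary): the number of local minima keeps growing as you widen the window, yet they form a countable set - just like the vacua of the landscape.

```python
import math

def f(x):
    """The 'random symbols' function from the text: pi*sin(x) + sin(pi*x)."""
    return math.pi * math.sin(x) + math.sin(math.pi * x)

def local_minima(lo, hi, n=200000):
    """Scan [lo, hi] on a fine grid and collect interior local minima."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    return [xs[i] for i in range(1, n)
            if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]]

# The wider the window, the more minima you find: there are infinitely many,
# but they are countable - discrete data, not a continuous parameter.
for width in (10, 100, 1000):
    print(width, len(local_minima(-width, width)))
```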
The only class of singularities that hasn't been discussed so far are the gravitational singularities. String theory tells us completely coherent stories about what happens with many types of such singularities - orbifolds, conifolds etc. - i.e. what new physical and understandable phenomena (such as topology change and branes) replace the mysterious source of infinite confusion and ignorance that the singularity used to be in general relativity. The confusion and ignorance about what e.g. the naked singularities could emit was the real problem: not the character "infinity".
There are also other singularities, such as the Big Bang singularity itself, whose detailed physics is not understood too well. These singularities are typically time-dependent (localized in time); consequently, they break supersymmetry. In the story above, there have been many conceptual revolutions that had to be made in order to remove the fog from various types of divergences and infinities. The infinities and divergences emerged as disgusting enemies but they were transformed into friends and powerful tools to find better theories and better answers to physical questions.
The case of the initial singularity (and analogous singularities) is arguably the only one that remains to be fully solved. And because this last mystery of singularities seems to be connected with the beginning of the world, it is reasonable to expect that the answer will teach us something about the selection of the right vacuum in the landscape and it will replace the provisional and probably wrong theories about the vacuum selection problem such as the anthropic principle.
And that's the memo.