Saturday, March 09, 2019

Why inflation is a better explanation of the flatness problem than fine-tuned initial conditions

Sabine Hossenfelder tries to revive the inflation wars. She mentions two recent papers but they really have nothing to do with her main topic – and that topic is her complete misunderstanding of inflation and of why people consider it at all. Well, if the observation that she completely misunderstands inflationary cosmology isn't general enough for you, let me tell you something. It boils down to something even deeper: she just completely misunderstands what a "scientific explanation" of anything is and why people look for it at all.

Her final exchange with Ed Measure summarizes the situation well:
CapitalistImperialistPig: When Kepler came up with his laws of planetary motion, physicists of the day had the choice of either saying "hmmm, it seems like God likes ellipses" or trying to find some deeper law that would account for them. Newton found universal gravitation and it was a great scientific advance.

Similarly, I think, cosmologists today have the choice of either saying "God likes flatness, uniform temperature, tiny initial density fluctuations, etc." or trying to find a deeper law that explains them. Inflation is such an idea.

Of course nobody has figured out exactly how to link it to deeper physics, but anybody with a better idea should chime in.

SH: CIP, I am afraid you entirely missed the point. I am telling you that it's not clear what you even mean by inflation "explains" it. Why do you think postulating a phase of exponential expansion is a better explanation than just postulating an exponentially small initial value?
Aside from all the wrong claims about elementary issues by the likes of her, I am greatly annoyed by the omnipresent arrogance in between the lines. She asks him "why do you think" and thus implicitly claims that the correct statement she can't understand is just "someone's opinion that is surely equal to her opinion".

But it's not just someone's opinion. It's a quantitatively demonstrable fact. One doesn't have to "think" in the sense of producing subjective or emotional guesses. Instead, one can think without quotation marks and just prove it by a simple argument. And her "opinion" is just objectively stupid. It is not something that has any importance in science whatsoever. Just like the opinion of the other 7 billion currently living people who completely misunderstand modern physics, her view is scientifically worthless.

First, note that the exact same statement of her type, "new scientific explanations are worthless", may have been made about the ellipses and Newton's laws, Pig's wisely chosen analogy:
CIP, I am afraid you entirely missed the point. I am telling you that it's not clear what you even mean by Newton's laws "explain" the orbits. Why do you think postulating a differential equation that has ellipses as solutions is a better explanation than just postulating an elliptic shape of the orbits?
But the denial of the power of both the "elliptic" and "inflationary" explanations is just a symptom of the speaker's complete scientific illiteracy.

Now let's answer the question posed by the "elliptic bee". Why are Newton's laws a better explanation than just postulating elliptical orbits (or epicycle-based orbits)? Because that hypothesis is more likely: it vastly reduces the number of arbitrary independent assumptions about Nature that we have to make.

For example, look at the 3D shape of the periodic orbits. Assuming that the planets don't go backwards, the trajectory may be written as a function in polar coordinates:\[

r = r (\varphi) = \sum_{n=0}^{\infty} d_n \cos n\varphi = \sum_{n=0}^{\infty} c_n (\cos\varphi)^n

\] I expanded the function into a Fourier series. But I switched from the basis \(\cos n\varphi\) to the basis \(\cos^n \varphi\) which will be easier for our purposes. Note that e.g. \(\cos 2\varphi = 2\cos^2 \varphi - 1\) and any \(\cos n\varphi\) may analogously be rewritten as a polynomial in \(\cos\varphi\), too.
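As a quick sanity check (my illustration, not from the post), the rewriting of \(\cos n\varphi\) as a polynomial in \(\cos\varphi\) is exactly the statement that the Chebyshev polynomials obey \(T_n(\cos\varphi)=\cos n\varphi\), and a few lines of Python verify it numerically:

```python
import math

phi = 0.9           # arbitrary test angle
x = math.cos(phi)

# Chebyshev recurrence: T_0 = 1, T_1 = x, T_n = 2x*T_{n-1} - T_{n-2};
# these polynomials in x = cos(phi) reproduce cos(n*phi) exactly.
t_prev, t_curr = 1.0, x
for n in range(2, 12):
    t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    assert math.isclose(t_curr, math.cos(n * phi))
```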

Even assuming that the function is even, there are infinitely many independent real parameters \(d_n\) – the Fourier coefficients that replace the epicycle parameters in our more modern approach to periodic functions – that determine the shape. However, the ellipse that we observe (and that has the Sun in one of the two focal points) has a more special form:\[

r = \frac{p}{1+ e \cos\varphi}

\] which only depends on two real parameters, \(p\) which is dimensionful and \(e\) which is dimensionless. You may easily calculate the coefficients \(c_n\) in our (reshuffled) Fourier series by using geometric series:\[

c_n = p (-e)^n

\] Why is Nature choosing this 2-dimensional submanifold spanned by \(p,e\) in the infinite-dimensional parameter space spanned by \(c_n\)? A meta-question: Why do we ask the question "why" at all? We ask it because a priori, all configurations of \(c_n\) are possible. So the probability that the coefficients \(c_n\) obey all the identities such as\[

c_m c_n = p c_{m+n}

\] is "infinitely small". It's really zero. And be sure that for the Newton ellipses, all these identities are obeyed: both sides are equal to \(p^2 (-e)^{m+n}\). Because any smooth distribution on the infinite-dimensional parameter space assigns a vanishing probability to obeying infinitely many conditions such as those above (even a single such condition is satisfied with probability zero), the hypothesis that
some random choice of \(c_n\) is picked from the set of possibilities by Mother Nature
is falsified. It's refuted, disproved, ruled out, dead. The probability that the identities are satisfied is zero – or "infinitely small" – which means that the probability of the hypothesis that the coefficients \(c_n\) are some random real numbers is zero, too. We have to look for a better hypothesis because this one has been falsified.
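The little calculation behind the identities can be sketched in a few lines of code. The sketch below is my illustration – the values of \(p\) and \(e\) are arbitrary sample choices, not fit to any planet – and it verifies both the geometric-series formula for \(c_n\) and the identities \(c_m c_n = p\,c_{m+n}\):

```python
import math

p, e = 1.5, 0.4   # arbitrary sample ellipse parameters (illustrative, not data)

def c(n):
    """Coefficients of the expansion of p/(1 + e*cos(phi)) in powers of cos(phi)."""
    return p * (-e) ** n

# All the identities c_m * c_n = p * c_{m+n} hold:
for m in range(8):
    for n in range(8):
        assert math.isclose(c(m) * c(n), p * c(m + n))

# And the truncated power series really converges to the ellipse
# (the geometric series works because |e*cos(phi)| < 1):
phi = 0.7
exact = p / (1 + e * math.cos(phi))
series = sum(c(n) * math.cos(phi) ** n for n in range(200))
assert math.isclose(series, exact, rel_tol=1e-12)
```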

So we have to admit that the choice of \(c_n\) isn't really "random" from the space of possibilities. The patterns are real (the observed coefficients demonstrably obey them, as a little calculation shows) and they need an explanation. There are some more specific rules and we need to know what they are. These "more specific rules" represent a different hypothesis that supersedes the refuted hypothesis that some "messy random mechanism picks \(c_n\)". Needless to say, Newton found out that the elliptic form of the orbits for the two-body problem follows from his equations\[

\vec F_i = M_i\vec a_i = -\nabla_i U

\] where \(\vec a_i\) is the second time derivative of the position \(\vec x_i\), \(i\) labels the celestial body, and \(U\) is the gravitational potential energy\[

U = -\sum_{i\lt j} \frac{GM_i M_j}{r_{ij}}.

\] That's it. Solve Newton's equations and you automatically get the elliptical trajectories including \(c_n = p (-e)^n\). All the identities that are obeyed by the experimental data may be derived from the theory.
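Here is a numerical sketch of that claim (my illustration, in units where \(GM=1\), with an arbitrary initial velocity): integrate Newton's equations for a single planet with a leapfrog scheme and check that the trajectory stays on the ellipse \(r = p/(1+e\cos\varphi)\) whose \(p\) and \(e\) are fixed by the conserved energy and angular momentum.

```python
import math

GM = 1.0                       # G times the Sun's mass in chosen units
x, y = 1.0, 0.0                # start at perihelion on the x-axis
vx, vy = 0.0, 1.2              # tangential speed > circular speed: eccentric bound orbit
dt, steps = 1e-4, 100_000

# The conserved energy and angular momentum (per unit mass) fix the conic:
E = 0.5 * (vx**2 + vy**2) - GM / math.hypot(x, y)
L = x * vy - y * vx
p = L**2 / GM                              # semi-latus rectum
e = math.sqrt(1 + 2 * E * L**2 / GM**2)    # eccentricity (e < 1: an ellipse)

def acc(x, y):
    """Newtonian acceleration -GM * r_vec / r^3."""
    r3 = (x * x + y * y) ** 1.5
    return -GM * x / r3, -GM * y / r3

# Leapfrog (kick-drift-kick) integration of Newton's equations.
ax, ay = acc(x, y)
for step in range(steps):
    vx += 0.5 * dt * ax; vy += 0.5 * dt * ay
    x += dt * vx; y += dt * vy
    ax, ay = acc(x, y)
    vx += 0.5 * dt * ax; vy += 0.5 * dt * ay
    if step % 5000 == 0:  # the integrated point always lies on the predicted ellipse
        r, phi = math.hypot(x, y), math.atan2(y, x)
        assert math.isclose(r, p / (1 + e * math.cos(phi)), rel_tol=1e-5)

print(f"p = {p:.4f}, e = {e:.4f}")  # p = 1.4400, e = 0.4400
```

The conserved quantities predict the ellipse before the integration even starts – that is precisely the sense in which the differential equation explains the shape.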

Newton has brought a paradigm shift. Before Newton, the trajectories were measured and the parameters \(c_{i;n}\) were thought to be infinitely many independent continuous facts about Nature (and I generously suppressed the information about the shape of the trajectory in the spacetime i.e. the variable speed). After Newton, only the masses \(M_i\) of the planets, moons, and the Sun and their positions \(\vec x_i\) and velocities \(\vec v_i\) at a given time were independent. Everything else followed from his equations.

This is progress because the theory according to Newton is more likely. There are still some choices that Nature had to make but these choices only include the "discrete" choice of the differential equation; and a vastly smaller, finite, number of parameters that label the initial conditions (masses, positions, velocities of the planets). The probability that the pre-Newton "random smooth trajectories" obeyed all the observed identities was zero. After Newton, it was 100% assuming that a small number of parameters is correctly matched.

All observed patterns that were repeatedly obeyed in many situations have been explained. To make things better, Newton's equations have also unified the motion of the celestial objects – such as planets – and the motion of the terrestrial objects – such as an apple that fell on Newton's head.

The situation is analogous when it comes to the observed curvature of the Universe and the inflationary cosmology and I discussed the details e.g. in the 2014 blog post Alan Guth and inflation. Again, we may measure some function of time which quantifies the curvature in the Universe, namely \(\Omega\). We already know equations, Einstein's equations, that imply that \(|\Omega-1|\) increases with time according to a particular functional dependence. The interpretation is that the Universe is getting less flat as time goes on – flatness is an unstable point of the evolution.

Note that gravity's behavior is opposite to that of diffusion. If you consider a bowl of soup with the non-uniform temperature, the non-uniformities will decrease with time as the soup drifts towards a uniform temperature – towards the thermal equilibrium. However, gravity likes to "clump things" which is why the inhomogeneities of the Universe like to increase with time.

Today, \(|\Omega-1|\) is (already) tiny, much smaller than one, but we may extrapolate it to the past and find out that \(|\Omega-1|\) was even smaller. It was insanely tiny, like \(10^{-40}\) when the Universe was a fraction of a second old. But the number still had to be nonzero. If the Universe were exactly uniform, the non-uniformities (like galaxies) couldn't have arisen, either.
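To see where such insanely tiny numbers come from, recall that in the big bang cosmology \(|\Omega-1| = |k|/(aH)^2 \propto 1/\dot a^2\), which grows as \(t^{2/3}\) in the matter era (\(a\propto t^{2/3}\)) and as \(t\) in the radiation era (\(a\propto t^{1/2}\)). A hedged back-of-envelope extrapolation – all numbers below are rough orders of magnitude of my choosing, not precision cosmology – reproduces the advertised order of magnitude:

```python
# Back-of-envelope extrapolation of |Omega - 1| toward the big bang.
# |Omega - 1| ~ 1/(da/dt)^2 scales as t^(2/3) in the matter era and as t
# in the radiation era. Rough illustrative numbers only.

OMEGA_NOW = 1e-2   # today's rough observational bound on |Omega - 1|
T_NOW = 4.3e17     # age of the Universe in seconds
T_EQ = 1.6e12      # matter-radiation equality (~50,000 years) in seconds
T_EARLY = 1e-24    # a "fraction of a second" after the big bang (illustrative)

omega_eq = OMEGA_NOW * (T_EQ / T_NOW) ** (2 / 3)  # matter-era scaling
omega_early = omega_eq * (T_EARLY / T_EQ)         # radiation-era scaling

print(f"|Omega - 1| at t = {T_EARLY:.0e} s was of order {omega_early:.0e}")
```

The result is of order \(10^{-42}\) for this (hypothetical) early moment – the earlier you extrapolate, the more extreme the required flatness.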

Because the big bang cosmology works up to some split second after the big bang (well, that effective theory works fine after that moment and still works for the Universe today and will probably work for tens of billions of years to come), we know that the non-uniformities really were this tiny, and we must ask "why", for reasons that are the same as above. We basically observe that the parameter \(|\Omega(x,y,z)-1|\) which measures the non-uniformities was equal to \(10^{-40}\) at some moment in the past, regardless of \(x,y,z\).

There was a special number, \(10^{-40}\), at some moment. The obvious questions are why it was that particular number and why the number is so tiny.

But there are also infinitely many identities, namely \(\Omega(x,y,z)=\Omega(x',y',z')\). The equally serious question is why this measure of curvature was equal at all regions of the visible Universe. These identities are analogous to the identities for \(c_m c_n\) above.

Just like we said "some random mechanism in Nature assigned the values to the coefficients \(c_n\)", we may say "some random mechanism in Nature assigned the values to \(\Omega(x,y,z)\) in various regions of the visible Universe". And we run to the very analogous problem again: the "random assignment" theory simply makes \(\Omega(x,y,z)=\Omega(x',y',z')\) infinitely unlikely, just like before. The "random theory" is excluded.

In fact, it's worse in the cosmological case. In the case of planetary orbits, the planets had a long time to adjust their coefficients \(c_n\) and make sure that the identities are obeyed. In the case of cosmology, the regions of the Universe hadn't had enough time to secure \(\Omega(x,y,z)=\Omega(x',y',z')\). Not even the propagation of the information by the speed of light was fast enough to "synchronize" \(\Omega\) at two distant regions of the Universe. The Universe just wasn't old enough – didn't have enough time (if measured in some way that is relevant for causality) – to make the curvature agree.

On top of that, the "universal" value of \(|\Omega-1|\) at that moment was smaller than \(10^{-39}\). It's a dimensionless number. The probability that a number \(x\) randomly chosen from the interval \((0,a)\) where \(a=1\) or \(a=2\) obeys \(x\lt 10^{-39}\) is around \(10^{-39}\), basically zero. It doesn't really matter whether \(a=1\) or \(a=2\) or whether you choose the distribution to be uniform after some rescaling. The precise likelihood that \(x\lt 10^{-39}\) is obeyed will depend on these choices but the fact that it's some tiny number practically indistinguishable from zero will not depend on them!

So any theory that is at least "somewhat" close to the assumption that "something random determined the curvature in the early Universe" will only predict or produce the observed facts, a tiny \(|\Omega-1|\) at that moment, and a uniform one, with a tiny probability that is zero for practical purposes. Such theories predict that what we observe shouldn't happen! It's really Hossenfelder's statement that "there is nothing interesting to be seen about the initial conditions" that is disproved by the observed patterns! In the previous sentence, the adjective "interesting" may sound vague and subjective but it's not really vague because it translates to "unlikely" and every genuine scientist actually maintains and must maintain a mental image that allows him to say what is likely and what is unlikely (interesting) in his field.

Inflationary cosmology fixes these problems in a way that is indeed completely analogous to Newton's fix of the orbits exhibiting the unexplained elliptic "coincidences". It inserts a phase into the evolution of the Universe in which \(|\Omega-1|\) was decreasing with time – because the Universe was being inflated like a balloon, and when you inflate a balloon, it becomes smoother and closer to a perfect sphere. The sign driving the time derivative of \(|\Omega-1|\) is reversed which is very good because this sign flip turns the "very unnatural" conditions at that early moment into "basically unavoidable" consequences of the evolution – whose initial conditions may be "basically anything". And that's why, at the end of that stage, the Universe ended up with \(|\Omega-1|\) that was tiny and uniform across the visible Universe, much like the balloon. Also, thanks to the inflation, the Universe had enough time to synchronize the values of \(\Omega\) in various regions. The big bang didn't have enough "time" in the causal sense. In inflation, however, the "causal" age is the number of e-foldings (times the de Sitter radius, if you wish), and because the number of e-foldings is over 50, the regions of the Universe had had enough time to send 50 messages back and forth.
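Quantitatively, during a near-de Sitter inflationary phase \(H\) is roughly constant while \(a\) grows as \(e^{Ht}\), so \(|\Omega-1| = |k|/(aH)^2\) shrinks by \(e^{-2N}\) after \(N\) e-foldings. A two-line sketch (assuming \(N=60\), consistent with "over 50" above):

```python
import math

N = 60               # assumed number of e-foldings (the text only says over 50)
omega_before = 1.0   # an O(1), "basically anything" pre-inflation curvature

# During near-de Sitter inflation H is roughly constant while a grows like
# e^(H*t), so |Omega - 1| = |k|/(a*H)^2 shrinks by a factor e^(-2N).
omega_after = omega_before * math.exp(-2 * N)
print(f"|Omega - 1| after {N} e-foldings: {omega_after:.0e}")
```

An order-one initial curvature is driven far below anything observations require, which is why the outcome is insensitive to the pre-inflationary initial conditions.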

Hossenfelder's "there is no point in asking why" is exactly equivalent to the claim that there is nothing to explain if you see a large inflated rubber balloon with helium inside. It's very precisely round and almost flat, why not? Well, "why not" is answered by showing that "these adjectives are insanely unlikely" if your explanation involves some random processes with no special structures, "nothing interesting to be seen here". Such random theories simply predict – with certainty that is almost 100% – that the rubber balloon can't be this uniform and close to perfect. So these theories are dead.

Instead, to explain the round inflated balloon, you need to accept a theory that has some extra pressure inside the balloon and that tries to minimize the potential energy from the stretched rubber plus the energy from the compression of the air. The minimization of the energy implies that the pressure inside the balloon will be uniform and somewhat higher than outside the balloon; and this uniform extra pressure will be proportional to the extrinsic curvature of the surface of the balloon (to balance the forces) which will therefore be uniform, too: a spherical balloon follows.
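The balloon argument can be made quantitative. Assuming an idealized spherical membrane with a constant effective surface tension \(\gamma\) and a fixed excess pressure (my simplification of the rubber-plus-gas energetics described above, with hypothetical parameter values), the stationarity of the energy with respect to the radius reproduces the Young-Laplace force balance \(\Delta p = 2\gamma/R\): a uniform excess pressure proportional to the uniform curvature \(1/R\).

```python
import math

# Idealized spherical balloon: constant surface tension gamma, fixed excess
# pressure p_excess (hypothetical values in arbitrary units).
gamma, p_excess = 0.5, 2.0

def energy(R):
    # stretching energy of the membrane minus the work done by the excess
    # pressure while inflating the volume
    return gamma * 4 * math.pi * R**2 - p_excess * (4 / 3) * math.pi * R**3

# Young-Laplace: the energy is stationary (force balance) at R = 2*gamma/p.
R_star = 2 * gamma / p_excess
h = 1e-6
dE = (energy(R_star + h) - energy(R_star - h)) / (2 * h)  # numerical dE/dR
print(f"dE/dR at R = 2*gamma/p_excess: {dE:.2e}")  # ~ 0
```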

An introduction to balloons. The difficulty was optimized for Ms Hossenfelder.

We do observe the cosmic rubber balloon and one needs an explanation for the same reasons why we need explanations for the identities of the Fourier coefficients \(c_n\) describing the orbit, or why we need an explanation why a balloon with helium is so close to a sphere. The reason is that "the explanation without the clever ideas" makes predictions, with the certainty near 100%, that are refuted by the observed data. The "non-clever explanation with some random mess" simply predicts, in all these cases, that the special patterns will almost certainly not emerge. But they do emerge – we need a different theory that is not dead yet.

One could complain – and Hossenfelder apparently does – and say that she doesn't need to assume any "random" distribution at all. Instead, she assumes "no distribution" for the parameters of the theory (such as the Standard Model) or for the initial conditions (of the big bang cosmology). Without any assumptions about the distributions, she makes no predictions and she has no theory that is being "ruled out". Great. But if one doesn't make any assumptions – not even probabilistic ones – about the origin of the observed parameters or the apparent initial conditions of the Universe, it simply means that she is not doing research at all. At most, she is a user of theories built by others. She embraces them without questioning just like believers take the interpretation of the Bible from their priest – it also works fine for them. If she assumes no distribution of any parameters etc., then she is not thinking about these matters (the validity and origin of currently used theories) at all and she shouldn't pretend that she has something to say about these matters. Actual researchers are doing more than just "blindly trust and use" existing theories.

So it is Sabine Hossenfelder, and not Ed Measure, who is missing something important. She completely misunderstands the purpose and logic of explanations of any observed patterns and facts in natural science! She is an arrogant dumb layman who believes in scientifically worthless misconceptions and it's terrible that the media are trying to obfuscate this key fact about her. Now she has written a whole book that is all about her pride about her rudimentary misunderstanding of all explanations in science so she is probably not too motivated to admit that she has been wrong about almost everything throughout her life.

Let me remind you that Newton's theory has united the motion of the celestial and terrestrial objects – the motion of planets and apples is governed by the same equation. A similar unification exists in the case of the inflationary cosmology, too. The inflationary cosmology needs a scalar field – one that sits at some value with \(V(\phi_{\rm inflation})\gt 0\) during the inflation but then rolls down to \(V\to 10^{-122}\) in Planck units to make sure that the inflation ends (and we know that the inflation in the normal sense is no longer taking place now).

This scalar field is the same kind of object as the Higgs field that is needed to make particles massive in the electroweak theory. The general form of the equations for these scalar or Klein-Gordon fields is the same. In fact, there exists a version of inflation in which the Higgs field and the inflaton are the exact same thing – but I don't want to claim that these are the best models of inflation because they have some disadvantages. But even if you allow several scalar fields, inflation just reuses some ideas that already exist in the electroweak theory. In this sense, the field equations for scalar fields such as the inflaton and the Higgs boson give us a unified explanation why the Universe is so large/flat/uniform and why the particles of the Standard Model are massive!
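To illustrate the shared Klein-Gordon structure, here is a minimal slow-roll toy model (my illustration, assuming \(V(\phi)=\frac12 m^2\phi^2\) with arbitrary parameter values in reduced Planck units – not a claim about the actual inflaton potential):

```python
import math

# Toy chaotic-inflation model: V(phi) = 0.5*m^2*phi^2 in reduced Planck units.
# m, the initial phi, and dt are arbitrary illustrative choices.
m, dt = 1e-6, 1e3
phi, dphi = 16.0, 0.0
N = 0.0  # accumulated number of e-foldings

while phi > 1.0:  # follow the field until it rolls near the bottom
    V = 0.5 * m**2 * phi**2
    H = math.sqrt((0.5 * dphi**2 + V) / 3)   # Friedmann equation
    ddphi = -3 * H * dphi - m**2 * phi       # Klein-Gordon equation, Hubble friction
    dphi += ddphi * dt
    phi += dphi * dt
    N += H * dt                              # dN = H dt

# Slow-roll analytics predict N ~ (phi_initial^2 - phi_end^2)/4, i.e. ~ 64 here
print(f"e-foldings accumulated: N = {N:.1f}")
```

The printed number of e-foldings agrees with the slow-roll estimate \(N\approx(\phi_i^2-\phi_e^2)/4\), comfortably above the 50 or so needed for the flatness and horizon arguments above.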

And, as Pig mentioned in another comment, inflation doesn't just explain why some numbers are small or large. It explains not only the uniformity of the temperature in the Universe but also the small but finite deviations from that uniformity – the whole experimentally measured WMAP curve seems to be predicted by the theory. The "number of numbers/predictions" hiding in that curve is arguably greater than the number of epicycle parameters of the orbits that Newton's theory managed to predict.

The Prague Astronomical Clock (Orloj) was first installed in 1410, over half a century before Copernicus' birth in 1473. You can see why the superposition of circular motions had to be attractive or "natural" for the folks around 1400: they could actually build machines that emulated all that motion. But this ease of theirs didn't mean that their models were the final word.

This must be evaluated as progress, a more acceptable explanatory framework, and Sabine Hossenfelder doesn't "disagree" because she has some alternative ideas that make sense but because she is just incapable of understanding anything in modern cosmology (and particle physics since the 1970s) that makes sense. Her alternative theory – if one formulates it at least a little bit coherently – is ruled out by the observed data. She is incapable of deriving this conclusion but that doesn't make the conclusion any less valid. That's why people had to look for a more viable theory explaining how the Universe could have ended up in the extremely flat form a split second after the big bang.

Bonus: Motle und Bailey

On Saturday, she also wrote that particle physicists are guilty of the Motle-und-Bailey rhetorical trick. It's a bait-and-switch trick in argumentation named after a medieval castle design. You want to defend a low-lying courtyard, the Bailey, but it's hard. When the Bailey is attacked, you escape to the high castle on the adjacent hill, the Motle, which is easily defensible. As you know very well, the Motle is universally considered the pure truth and nothing but the truth. In the comfort provided by the Motle, you claim that you defended the Bailey because they're basically the same thing (equivocation).

She accuses the pair Nima-and-Lisa of collectively doing Motle-und-Bailey (because they disagree on whether SUSY is a motivation to build the next collider); and Brian Foster, an antagonist of hers on the radio, of doing it individually.

Well, the construction of a particle collider is a complex enterprise which has many applications and motivations (and different people have different motivations and priorities – there is just nothing wrong about it) and in this sense, it is really at least as complicated as the Motle-und-Bailey fortresses. Indeed, the "case for the FCC or ILC" has many "legs", some of these legs are more certain, others are less certain, another group is almost completely uncertain or speculative. For this reason, I would claim that at least in this case, Motle-und-Bailey isn't a fallacy at all and her talk is just another part of the somewhat creative but otherwise vacuous rhetorical exercise designed to sling mud on particle physics. In effect, her blog post says that a new collider is only meaningful if all physicists produce the same and simple explanation why it should be built – but that assumption of hers is clearly incorrect.

Some good signs at the end:
The real tragedy is that there is absolutely no learning curve in this exchange. Doesn’t matter how often I point out that particle physicists’ arguments don’t hold water, they’ll still repeat them.
Maybe we see a promising beginning of her own learning curve here and after 15 more years, she will figure out that all scientifically literate people consider her an incoherent or demagogic hostile babbler and ignore her, indeed.
