Sunday, June 30, 2019

Higgs mass from entropy maximization?

As you know, the Higgs boson is the most recently discovered fundamental particle (next Thursday, it will be 7 years since the discovery) and its mass seems to be \(m_H=125.14\pm 0.24\GeV\) or so. In various models, supersymmetric or scale-invariant or otherwise, there exist partial hints of why the mass could be what it is and what this magnitude qualitatively means.

Reader T.G. had to be blacklisted because he was too vigorous and repetitive in defending the highly provocative 2014 Brazilian paper

Maximum Entropy Principle and the Higgs Boson Mass (by Alves+Dias+DaSilva, about 12 citations now)
which claims to calculate the almost identical value \(125.04\pm 0.25\GeV\) using a new assumption, entropy maximization. What are they doing?



Look at this chart, which they omitted for unknown reasons, so this blog post is more comprehensible than the paper.

The horizontal axis is the Higgs boson mass \(m_H\) and the vertical axis shows the branching ratios of Higgs decays – the probability that the Higgs decays to some final products or others. You may see that near the observed value \(m_H \sim 125\GeV\), there are relatively many decays that are relatively close to each other. None of them is dominant and beating all others etc.



OK, the Brazilian folks simply postulated an obvious idea that you could have – what if Nature tries to maximize the diversity? OK, so they took the branching ratios as a function of the variable Higgs mass from some software on the market and maximized the Shannon entropy\[
S = -\sum_{i=1}^m b_i(m_H)\ln b_i(m_H)
\] where \(b_i\) are the branching ratios i.e. the probabilities of qualitatively different decays.
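To make their recipe concrete, here is a minimal sketch in Python of the procedure as I understand it. The branching ratios listed near \(125\GeV\) are only rough, illustrative SM numbers, and `branching_ratios_of` is a hypothetical callable that would wrap whatever decay calculator (something like HDECAY) one actually uses; nothing below comes from the paper's own code.

    import numpy as np

    def shannon_entropy(branching_ratios):
        """S = -sum_i b_i ln b_i for a collection of branching ratios."""
        b = np.asarray(branching_ratios, dtype=float)
        b = b[b > 0]          # drop vanishing channels (0 ln 0 -> 0)
        b = b / b.sum()       # re-normalize so the probabilities sum to 1
        return float(-np.sum(b * np.log(b)))

    # Rough, illustrative SM branching ratios near mH = 125 GeV:
    br_125 = {
        "bb": 0.58, "WW*": 0.215, "gg": 0.082, "tautau": 0.063,
        "cc": 0.029, "ZZ*": 0.026, "gamma gamma": 0.0023,
        "Z gamma": 0.0015, "mumu": 0.0002,
    }
    print(shannon_entropy(list(br_125.values())))   # roughly 1.25 nats

    # The paper's procedure then amounts to recomputing the b_i(mH) over a
    # range of hypothetical Higgs masses and picking the mass maximizing S:
    def entropy_maximizing_mass(branching_ratios_of, masses):
        entropies = [shannon_entropy(branching_ratios_of(m)) for m in masses]
        return masses[int(np.argmax(entropies))]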



What are the decays in the Standard Model? They are
  • \(\gamma\gamma\), \(W^\pm W^{\mp *}\), \(ZZ^*\), \(\gamma Z\)
  • \(e^+e^-\), \(\mu^+\mu^-\), \(\tau^+\tau^-\)
  • \(6\times q\bar q\)
  • \(gg\)
The asterisk indicates that one of the particles is virtual – which is the case if the real particles in the final state would be too heavy.

OK, it's the three "particle-antiparticle" pairs of electroweak gauge bosons plus the fourth, similar but asymmetric, decay \(Z\gamma\); then there is the gluon pair, six quark pairs, and three charged lepton pairs. Now, the top quark is too heavy, heavier than the Higgs boson, so in the range of possible Higgs masses below the top quark mass, both tops would have to be virtual and the top-antitop Higgs decay is very unlikely.

That means that instead of fourteen, they just consider \(m=13\) decays, calculate the entropy as a function of the Higgs mass, and claim that \(125\GeV\) is the value that maximizes the entropy – or the diversity, if you wish. Cute. The arguments in favor of this discovery are obvious:
  • the numerical agreement – their \(125.04\pm 0.25\GeV\) versus the measured \(125.14\pm 0.24\GeV\) – is striking
  • the maximum entropy principle is a respected rule in probability theory and statistical physics
The latter tells you that if you know a probability distribution imperfectly, you should choose the distribution that maximizes the Shannon entropy \(-\sum_i p_i \ln p_i\). It's surely a special distribution in any restricted subclass; there is something canonical about it. However, the precise explanation of the sense in which the entropy-maximizing distribution is "the best" is subtle and people may easily overstate its importance or unavoidability. In particular, I would say that if you obtain a probability distribution by proper Bayesian inference, you shouldn't replace it with a different one just because this different one maximizes the Shannon entropy. Instead, the prescription is only valid if you know the probability distribution "incompletely". But an incomplete distribution with "holes" etc. is something that you can't really get from measurements of the system if those are complete enough.
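To see in the simplest setting what this canonical property means – a textbook derivation, not anything from the Brazilian paper – maximize \(S=-\sum_i p_i\ln p_i\) over \(m\) outcomes subject only to normalization, using a Lagrange multiplier \(\alpha\):\[
\frac{\partial}{\partial p_i}\left[-\sum_j p_j\ln p_j-\alpha\left(\sum_j p_j-1\right)\right]=-\ln p_i-1-\alpha=0\quad\Rightarrow\quad p_i=\frac 1m,\qquad S_{\rm max}=\ln m.
\] If you also impose a constraint on a mean value, \(\sum_i p_i E_i=\bar E\), with a second multiplier \(\beta\), the same variation gives \(p_i\propto e^{-\beta E_i}\), the Boltzmann-like distribution – the standard Jaynesian route from "maximum ignorance" to statistical mechanics.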

Nevertheless, the maximum entropy distribution principle does recommend that you choose the "maximally ignorant", egalitarian-divided probabilities for the degrees of freedom that are unknown and whose uncertainty is unknown. OK, they maximize this function of the Higgs mass and claim that \(125\GeV\) is the sweetest spot.

You may get overly excited by the positive arguments, neglect the doubts, and suppress your skepticism. I think that in that case, you must be considered a numerologist. Like a broken clock, a numerologist may be right twice a day, of course. But the reasons to dismiss the result are really more powerful and they fall into three categories:
  • errors and unnatural choices in the calculation even if you accept the fundamental premises
  • the apparent inability to calculate anything beyond the single number, the Higgs mass, using this apparently ambitious principle
  • the acausal character of the implicitly suggested "mechanism" which indicates that it should be impossible for such a maximization rule to operate in Nature
Concerning the first class of complaints, I think it must be wrong that they consider just the 13-14 channels and the corresponding 13-14 terms in their entropy. Why? Because locally, the colors of the quarks and gluons should be considered distinguishable.

They were not calculating any "well-established kind of entropy in the context" or the "real entropy of any particular physical system". They just took the Shannon entropy formula and substituted some numbers that look "marginally sensible" to be substituted. Because there's no meaningful underlying theory, I can't prove what is "right" and "wrong". Their formula is really their axiom, so it's "right" in their axiomatic system.

But I find it extremely unnatural that there is no coefficient of \(3\) in front of their terms for the quark channels; and no corresponding factor of \(8\) for the gluon channel. For example, the gluon branching ratio should really be divided into 8 equal pieces \(b_{gg}/8\) whose logarithms additively differ by \(-\ln 8\). These extra \(\ln 8\) shifts in the gluon terms (and \(\ln 3\) shifts in the quark terms) would modify the function \(S(m_H)\) and the maximization procedure.
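As a toy illustration of this complaint – a sketch with made-up, truncated channel numbers, not a recalculation of their result – splitting each quark channel into 3 equal color pieces and the gluon channel into 8 shifts the entropy by roughly \(\sum_i b_i\ln N_i\), which depends on the Higgs mass through the \(b_i\) and therefore moves the maximum:

    import numpy as np

    def shannon_entropy(b):
        b = np.asarray(b, dtype=float)
        b = b[b > 0]
        b = b / b.sum()
        return float(-np.sum(b * np.log(b)))

    def entropy_with_color(channels):
        """channels: dict name -> (branching ratio, color multiplicity N);
        each channel is split into N equal pieces b/N before computing -sum p ln p."""
        pieces = []
        for b, n_color in channels.values():
            pieces.extend([b / n_color] * n_color)
        return shannon_entropy(pieces)

    # Truncated toy subset of channels (illustrative numbers only):
    toy = {"bb": (0.58, 3), "WW*": (0.215, 1), "gg": (0.082, 8), "tautau": (0.063, 1)}
    s_plain = shannon_entropy([b for b, _ in toy.values()])
    s_color = entropy_with_color(toy)
    # The difference is approximately sum_i b_i ln(N_i) (after re-normalization);
    # since the b_i depend on mH, the shift deforms the whole curve S(mH).
    print(s_plain, s_color, s_color - s_plain)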

Concerning the second class of the complaints, their entropy maximization principle seems really cool. It is numerologically claimed to work for the Higgs boson. But if such a principle worked, wouldn't it be strange that it only works for the Higgs boson mass? The Higgs boson mass is just one parameter of the Standard Model. Shouldn't it work for the quark and lepton masses – or their Yukawa couplings – as well? Or the masses of the electroweak gauge bosons and/or the gauge couplings? Even if the principle only determined one parameter, shouldn't it be a more generic function of the Higgs mass and other parameters rather than the Higgs mass itself? Why the precise Higgs mass, a particular coordinate on the parameter space? And shouldn't such a principle ultimately determine even the constants that don't seem to be associated with decays such as the cosmological constant?

You know, the claim – pretty much a claim that they try to hide – that such a procedure only determines the Higgs mass seems like a classic sign of a numerological fallacy. Numerologists love to take some number, strip it completely of its context, and produce some "calculation" of this number. They ignore the fact that if some deep principle determines this number, the same principle should really determine many other numbers. The numerological derivations of a number usually have nothing to do with the "context", with what the mathematical constant is actually supposed to represent. By definition, numerologists are too focused on patterns in numbers and largely ignore what the numbers are supposed to mean.

They don't seem to discuss this problem at all, which indicates either that they're deliberately obfuscating problems, which is dishonest, or that they don't understand why this is a problem for almost all similar numerological determinations of constants. In both cases, it's just bad. Aside from their overlooking of the color degeneracy factors, this is another reason to conclude that they're simply not careful physicists. And this conclusion makes it likely that they have also made some other errors, perhaps purely numerical ones, but (because my belief in the paper is close to zero) I totally lack the motivation to find out whether such mistakes also plague the paper.

Concerning the third complaint, well, such a maximization of entropy should be impossible for causal reasons. The higher diversity of the Higgs decays doesn't seem to be useful for explaining anything; there is no known, carefully verified rational reason why it should be true. Unlike the extreme anthropic principle, which favors "universes with many intelligent observers" because the intelligent observers are not only "like us" but useful for doing any science, and in this sense a desirable component of the universe, the diversity of the Higgs decays doesn't seem to be good for anything. So the justification is even more absent than in the case of the "strong anthropic principle" – and that is already pretty bad.

The reason why this diversity should be a "cause" of the selection of the Higgs mass is lacking. Even more seriously, one may apparently prove that such a determination should be impossible. Why? Because the selection of the Higgs mass – and of other parameters resulting from some vacuum selection – took place when the Universe was extremely young, dense, and hot. Perhaps Planckian. And maybe even more extreme than that. At that time, Higgs bosons didn't have low energies and didn't have the freedom to decay to something at low energies "almost in the vacuum". Everything had huge energy and was constantly interacting with other particles – whose density was huge.

So the "13-14 low-energy decay channels of a Higgs boson" weren't even an important part of the physics that governed the very early Universe when the vacuum selection choices were made! So how could the Universe make a choice that would maximize some entropy calculated from some low-energy phenomenological functions – which only seemed empirically relevant much later (but still a fraction of a second after the Big Bang)? It just doesn't make any sense. Such a mechanism could only work if the causal ordering were suppressed (which would almost unavoidably imply a conflict with the usual, causal laws of Nature that determine the evolution differently) and the universe were really planning the future in a teleological way. But why should exactly this kind of diversity be God's plan?

Also, many superficial people who just defend some "entropy maximization" typically fail to understand that the right reasons and mechanisms for the entropy maximization in physics are known. They boil down to the second law of thermodynamics. The entropy goes up because the probability of a transition between the "initial ensemble of states" and the "final ensemble of states" is averaged over the initial states but summed over the final ones. That's why the probability of the inverse process is effectively suppressed by the factor of \(N_i/N_f\), which is why the evolution favors higher-entropy states. This is the cleanest justification of why the entropy doesn't want to go down.
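To spell out that standard counting argument – again a generic textbook derivation, not something specific to this paper – suppose the microstate-level transition probabilities are roughly symmetric, \(P(i\to f)\approx P(f\to i)\). The macroscopic probability averages over the \(N_I\) initial microstates but sums over the \(N_F\) final ones,\[
P(I\to F) \approx \frac{1}{N_I}\sum_{i\in I}\sum_{f\in F} P(i\to f),\qquad \frac{P(F\to I)}{P(I\to F)} \approx \frac{N_I}{N_F} = e^{S_I-S_F},
\] so the transition towards the higher-entropy macrostate is exponentially favored over its reverse – which is both the statement of the second law and a quantitative estimate of how strongly it holds.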

The second law of thermodynamics is a qualitative law but we actually know quantitative proofs of this law, and these derivations – similarly the H-theorem (the objection that it depends on "shaky" assumptions such as the ergodic principle is bogus – all these assumptions are surely valid in practice) – tell us not only that the entropy goes up but also how much and why. If you postulate another "high entropy wanted" law for Nature, it may look like a "moral ally" of the second law of thermodynamics. But because the details of your law – how high the entropy wants to be and why – will be different, your new law will actually contradict the well-established detailed derivations behind the second law!

So the paper is hopeless.

Nevertheless, over the years and even recently, I've spent dozens of hours on "spiritually similar" attempted derivations. In particular, those derivations were a part of my Hartle-Hawking research. The Hartle-Hawking state is the preferred wave function of the universe – especially applicable as the initial state of the universe – which is a solution of the Wheeler-DeWitt equation and may be obtained as a path integral in a spacetime region that isn't bounded by two boundaries (carrying the initial and final state), as appropriate for the calculation of the evolution or the S-matrix, but by just one boundary (a three-sphere surrounding the Big Bang point).

The Hartle-Hawking state is clearly a possible paradigm to explain the parameters of Nature that won't go away unless another paradigm, like the multiverse, is really established – or some remarkable, rigorously provable bug is found in the Hartle-Hawking principle of the most general type. Or someone makes the Hartle-Hawking paradigm rigorous and quantitative and checks that it makes wrong predictions. But the Hartle-Hawking paradigm hasn't been terribly successful and naive, minisuperspace calculations of the Hartle-Hawking state dominate the literature.

Well, I always wanted to apply it as a rule to determine the right vacuum of string/M-theory. All the numerous details about the compactified dimensions could arise from the paradigm – while the non-stringy Hartle-Hawking literature is obviously too obsessed with the four large spacetime dimensions. If that's true, the Hartle-Hawking wave function could be peaked near the vacua with certain qualitative properties. It could be peaked around vacua with a low cosmological constant or high Planck-electroweak hierarchies or high hierarchies in general or low Hodge numbers of the Calabi-Yau manifolds, among other "preferred traits".

If the Hartle-Hawking paradigm is correct at all, and if string/M-theory is correct, which are two independent assumptions, then it seems extremely likely that the Hartle-Hawking state would prefer some qualitative traits of the string vacua. And they could be directly relevant for the explanation of some observed traits in Nature – such as the observation of some particular hierarchies or deserts.

The Hartle-Hawking state would still allow many different vacua because it gives rise to a smooth probability distribution. But it could be peaked and the peak could be rather narrow near some point. The maximization needed to find such a point could be mathematically analogous to the Brazilian paper that I have discussed above.

But the details matter. The Devil is in the details. And the details – which are not really small details, if you look at it rationally – imply that this Brazilian paper is hopelessly wrong and irrational. If someone is on a mission to promote it on the Internet and insult everybody who has good reasons not to take this Brazilian paper seriously at all, it's a problem and a ban becomes the optimal solution.

