Sunday, June 15, 2008

Predictions vs understanding

In this essay, I would like to discuss some differences between "predictions" and "understanding" as two slightly distinct outcomes of theoretical research.

Some people like to think that "predictions" are the only purpose of scientific theories. Such a belief largely contradicts the main motivation behind theorists' work. Theorists don't want to ask the question "what will happen" all the time. They like to ask the question "why", too.

The question "why" is actually very important. Even if our predictions of particular phenomena don't change at all, it is still possible that our theoretical framework evolves. And it may sometimes evolve dramatically. The explanations get unified and unnecessary excess baggage is jettisoned, as Murray Gell-Mann says in the commercial for a famous ex-company above. ;-)

Such improvements in our answers to the "why" questions often help us to answer the "how" and "what will happen" questions more accurately later. Theorists often ask themselves:

Do things make sense?

Even if they have some formulae that satisfactorily predict the empirically observed numbers, these formulae may make a lot of sense or much less sense. Our explanations may look natural or they may look contrived. If an explanation looks contrived, it doesn't mean that a theory and its predictions are wrong.

But it doesn't mean that theorists will ignore the contrived features of their theory, either.

When an explanation looks contrived, a theorist is not fully satisfied even if the predictions seem to be accurate. The status of such an explanation is somewhere in between an acceptable theory and an unacceptable theory. Such a fuzzy status of an explanation may be transformed into a well-defined problem if an actual sharp contradiction is found. But a semi-satisfactory theory may also become fully satisfactory if all the reasons why the theory looks contrived are proven to be artifacts of our wrong assumptions and misleading intuition.

Black box answering all questions

In order to understand the difference between our capability to predict and our ability to understand, consider the following situation. You are given a black box - or a computer program, for that matter. You enter the information about the particles you want to scatter and the black box spits out (the statistical distributions of) the quantities measured by the detectors.

And the black box will always be right. Using the box, you may predict the results of all scattering experiments.

In this situation, should you be satisfied with our explanation of particle scattering? I think that the answer is obviously No. The black box may know the answers but "we" don't know them. We can't identify ourselves with the black box because we don't know how the black box works. With the black box, we may predict results of experiments.

But the mystery really hasn't disappeared at all. In fact, it is plausible that the black box answers the questions simply because it contains a particle collider inside and performs all the experiments. If that is so, the predictions of the black box are not real predictions of experiments: they are the experiments themselves.

But even if we admit that the black box works differently, the mysterious question "why do the particles do what they are observed to do?" is replaced by the equally mysterious question "why does the black box say what it does?". We haven't made much progress in our understanding of particle scattering unless we know something about the internal workings of the black box.

Can we trust computers?

In reality, we are frequently using gadgets that are somewhat similar to the black box above. We use computers to do complicated calculations for us. For example, lattice QCD people use supercomputers to calculate the properties of hadrons. If their calculations match the experimental data, should we be satisfied?

This question is more subtle and there is a reason to be satisfied. Why? Simply because we know how the lattice QCD programs work, at least roughly. We know the QCD Lagrangian and the computational procedures that the programs employ.

On the other hand, we may still feel that if a computer does a very complicated sequence of calculations in order to produce a qualitative answer, we still don't understand why the answer is what it is. We may often look for an easier explanation that can be formulated without the brute force of the computers.

It is important to realize that we are never guaranteed that such a better explanation exists. If a correct formula that appears in our explanation or the qualitative patterns of an explanation look truly regular, accurate, or if they seem to agree with some simple mathematical functions, it is sensible to expect that a better explanation than a huge sequence of numerical calculations should exist. While it is reasonable to expect it, it may still happen that subsequent research proves that such a "simpler" explanation cannot exist.

How unified do our theories have to be?

We normally want our explanations to be as universal as possible. In other words, we want our descriptions of different pieces of reality to be as unified as possible. If the correct predictions of a class of phenomena may be extracted from a compact formula or a concise set of rules, we always tend to think that the explanation is better than a very contrived explanation that predicts the phenomena with a similar accuracy.

Everyone knows why Newton's theory is a better explanation of the motion of planets than epicycles with properly engineered 1st, 2nd, and 3rd order corrections. The simplicity of Newton's laws is not just a matter of aesthetics: the simplicity also encodes their universality. When you use them properly, they predict not only the motion of planets but also the motion of apples and moons, among other things.

We are looking for compact, unified descriptions of reality because these descriptions are more likely to be correct even if their predictions are only as accurate as those of some contrived, fragmented explanations with many undetermined parameters or many independent assumptions. Why? Simply because the actual reason why the contrived explanations work could be that they have been "adjusted" to agree with reality. Contrived explanations are elements of large families of explanations and the probability that at least one element agrees with reality "by chance" (even though it is fundamentally incorrect) increases with the size of the family.
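The dependence on the size of the family can be illustrated by a minimal sketch. It assumes, purely for illustration, that each fundamentally incorrect explanation matches the data independently with the same small probability; the function name and the value 0.01 are hypothetical choices, not taken from any real analysis.

```python
import math


def chance_of_accidental_match(p_single: float, family_size: int) -> float:
    """Probability that at least one of `family_size` wrong explanations
    agrees with the data by chance, assuming each one does so independently
    with probability `p_single` (an illustrative assumption)."""
    # 1 - (1 - p)^N, written with log1p/expm1 so it stays accurate for tiny p
    return -math.expm1(family_size * math.log1p(-p_single))


# The chance of a coincidental fit grows with the size of the family.
for family_size in (1, 10, 100, 1000):
    print(family_size, chance_of_accidental_match(0.01, family_size))
```

Under these assumptions, a single rigid explanation that fits the data is unlikely to do so by accident, while a family with many adjustable members almost certainly contains an accidental fit.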

When compact explanations, simple formulae, and theories based on a small number of assumptions and parameters agree with observations, we have a good reason to think that the agreement is not just a coincidence.

These comments are usually understood well in the context of climate science. If a climate model - and the process to compare it with the measurements - depends on many choices, parameters, conventions, and methods, the agreement is less spectacular and less surprising. You may always suspect that the choices, parameters, conventions, and methods have been deliberately picked to improve the agreement, so the agreement may be a matter of "coincidence".

If the agreement is less accurate and describes less detailed features of the data, it tells us less about the validity of the theory, too.

However, the question of "adjusted theories" is almost completely misunderstood by the public in the context of theories of fundamental physics. If a theory has a landscape of solutions - e.g. the landscape of string theory - it certainly doesn't mean that anything goes. The number 10^{500} of the elements of the landscape may look large to laymen but it is an extremely tiny number for a set of candidate theories, one of which should agree with all observations ever made.

If you imagine that mankind has only performed 1,000,000 a priori "independent" experiments (and it is surely an underestimate!) and there were roughly 10 possible outcomes of each experiment (another underestimate), the number of possible outcomes of the set of experiments is 10^{1,000,000}. The probability that you will correctly describe all of them with one element of a 10^{500}-strong landscape "by chance" is completely negligible, especially if the 10^{500} elements of the landscape share virtually all "qualitative" features.
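The arithmetic behind this estimate can be sketched as follows. The sketch assumes that, for a wrong theory, every outcome of every experiment is equally likely, and it uses a simple union bound over the candidates (at most N times 10^{-n}); the function name is a hypothetical helper. Everything is done in base-10 logarithms because the numbers themselves are far too small for floating point.

```python
import math


def log10_chance_bound(log10_candidates: float,
                       n_experiments: int,
                       outcomes_per_experiment: int) -> float:
    """log10 of the union bound N * k**(-n): an upper limit on the chance
    that at least one of N candidates matches all n experiments, if each
    experiment has k equally likely outcomes (illustrative assumption)."""
    return log10_candidates - n_experiments * math.log10(outcomes_per_experiment)


# Numbers from the text: 10^500 candidates, 1,000,000 experiments, 10 outcomes each.
log10_bound = log10_chance_bound(500, 1_000_000, 10)
print(f"log10 of the chance of an accidental fit is at most {log10_bound:.0f}")
```

The factor 10^{500} shifts the exponent by only 500 out of a million, which is why the size of the landscape barely matters in this comparison.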

Again, your explanation would be much more convincing once you knew how to pick the right element from the landscape. But a scientist can't build on wishful thinking. Nature has the universal right to say No. The number of solutions that are acceptable at the present level of our knowledge simply is 10^{500} or higher. And we must be comparing the explanations we have.

String theory and effective quantum field theory are the two frameworks that agree with all particle physics experiments that have been made so far. While string theory only has a "finite" or at most "countable" number of possibilities, there is a "continuously infinite" number of quantum field theories because their coupling constants and other parameters must be understood as parametrizations of "different theories".

When you do the calculation properly, string theory is far more predictive and constrained than quantum field theory, even with the 10^{500} elements. This is a conclusion that most laymen are completely incapable of reaching because they are simply overwhelmed by numbers of order 10^{500} and they can't distinguish them from infinity. But in certain contexts (for example if you want to answer all questions about Nature), these are damn tiny numbers.

Note that the entropy of the Universe exceeds 10^{100}, a googol, which means that the minimum dimension of the Hilbert space needed to describe the whole Universe is roughly 10^{10^{100}}, a googolplex. This is far higher than 10^{500}. And the number of possible configurations of experimental results in all experiments, if you count all microscopic parts of the experiments as separate entities, is closer to the googolplex, too. It is extremely unlikely (the probability is close to the inverse googolplex) that the results will agree with an element of the landscape whose size, 10^{500}, is closer to a modest googol than to a googolplex.
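A rough numerical check of this comparison, under the assumption that the entropy S is measured in units of Boltzmann's constant so that the minimum number of states is e^S. The value S = 10^{100} is the round figure from the text, not a measured quantity, and the function name is hypothetical.

```python
import math


def log10_hilbert_dimension(entropy: float) -> float:
    """log10 of e**S, the minimum number of states compatible with an
    entropy S given in units of Boltzmann's constant (assumed convention)."""
    return entropy * math.log10(math.e)


log10_dim = log10_hilbert_dimension(1e100)  # exponent of the Hilbert space dimension
log10_landscape = 500.0                     # exponent of the landscape size

print(f"log10(dimension) is about {log10_dim:.3e}")
print(f"log10(landscape) is {log10_landscape:.0f}")
```

The exponent of the dimension is itself of order 10^{99}, so e^{10^{100}} and 10^{10^{100}} are the same number for the purposes of this argument, and both dwarf the exponent 500 of the landscape.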

Moreover, even if you disagreed that string theory is much more constraining than effective quantum field theories and you thought that they have an equal number of possibilities, you should note that this is a "tie" between the two frameworks in this particular discipline. With such a "tie", you have no rational reason to think that effective quantum field theory is a superior framework just because it is older than string theory. With your assumption, they are "equally predictive" and you should give them equal chances.

Moreover, we can easily see that effective field theory becomes completely unpredictive in the context of Planckian physics and quantum gravity while string theory becomes extremely well-defined and predictive in that regime. That's of course the main reason why string theory is superior as a description of the physical reality. But as I have emphasized above, even if the "predictive power" were your main criterion, string theory would be at the top. In the extremely traditionalist approach, it would be there together with effective quantum field theory.

Finally, I want to mention that we have talked about 1,000,000 "a priori independent" experiments. Needless to say, these experiments are not "a posteriori independent". All known experiments are manifestations of the same Standard Model or the same general relativity, after all. But the fact that the experiments have become "dependent" or related is nothing else than the existence of a theory - or a unifying theory - itself. The less independent various phenomena in Nature look, i.e. the more interlinked (and unified) they are, the more we have understood about their origin.

Can theories always be visualized?

Let me close this topic and mention another feature that people like to expect from attractive theories. They can "imagine" or even "draw" what they mean. The internal mechanisms can be "visualized" and "seen". If it is possible to imagine - or to create a visually impressive TV program about - a theory, the theory feels "superior".

In my opinion, this sentiment is a sentiment of the laypersons. In reality, there is no reason for a theory to be ready for an easy visualization. An impressive visualization may help a theory to be accepted by the people but the existence of such a visualization doesn't imply that the theory is more correct or more fundamental. Many correct theories that describe the reality at the fundamental level are notoriously hard to imagine or visualize simply because our cognitive abilities - and the movie industry - have been trained to imagine and visualize very different types of phenomena from those that are essential at the fundamental level.

So I think that the expectation that all things can be "seen" is unjustified. But even if you believed that the expectation is sensible, you should never treat it as a dogma. When you refine your ideas, the conjecture that a process can be visualized in a certain way is just another conjecture about the Universe. And conjectures in science can be incorrect. And they can often be proven incorrect.

The hidden variables in quantum mechanics are one of the best examples. A classical picture of the internal structure of quantum mechanical objects has been a dream of many people. The fathers of quantum mechanics realized that there was no reason why such a mechanistic description should exist. And in fact, after a few decades, we became able to prove that such a classical description is impossible as long as you accept that the Universe respects causality and locality at least at some basic level.

It has been proven that a particular project (of hidden variables) to rewrite the laws of physics in a "visualized" fashion was doomed from the start. It doesn't matter that the project looked attractive to you. Nature always has the right to veto your theoretical projects. ;-)

Should we be satisfied with our understanding?

This brings me to another, related question: should we ever become satisfied with our present description of reality? Imagine that the description works but it looks a bit contrived and the reasons behind its detailed properties don't seem to be manifest.

Again, I would like to emphasize that the answer to this question can't be universal. If a feature looks contrived, it may be a sign of an inconsistency or incompleteness of your present theory. But such an awkward feeling may also be your personal psychological problem caused by your incorrect assumptions. When scientists study their theories and their relations to the relevant phenomena and other phenomena more carefully, they may usually decide which answer is correct.

In some cases, the contrived theories are shown to be incomplete and a new description that is more universal or that makes more sense is found later. In other cases, the feelings of "dissatisfaction" are proven to be misguided psychological feelings based on certain assumptions and these assumptions may be shown to be incorrect. Both possibilities are conceivable and a rational person can never be certain about the answer from the very beginning - only a dogmatic person can.

Can everything be predicted?

A similar question is whether we should expect that our theories will answer all "why" questions or whether some of their counterintuitive features will remain confusing or unexplained forever. Once again, we can't ever be completely sure.

A priori, we want to explain as much as possible. We want to ask new "why" questions whenever we can. On the other hand, there may also exist - and there often do exist - real limitations caused by the laws of physics. These limitations may imply that a better explanation can't exist. They may imply that all of our ideas about what a "better" or "more fundamental" explanation should look like may be proven to be incorrect. Whether or not they look a priori intriguing is secondary.

None of these questions has a universal answer. We must always be ready to invent rational arguments and to listen to the rational arguments of others. In some cases, it is very important and fruitful to search for better explanations, while in other cases such a search can be shown to be a misguided philosophical exercise driven by irrational assumptions about reality.

Are "useful" predictions necessary?

Finally, let me emphasize a point that has been written many times on this blog. A real theorist is not driven by practical applications of his work. He is driven by his desire to understand how the real world works. He wants to know about every awkward aspect of our newest theories and he wants to know which of them can be improved by finding a new theory - or by interpreting it in a new way - and which of them can't.
