## Wednesday, October 05, 2005

### Repeatability of models

Because William Connolley uses old-fashioned censorship techniques imported from the Soviet Union to suppress everything that is inconvenient for his goals - for example, he erases every comment that tries to put the questions about repeatability on firm ground - I must reply to his rather weird ideas (dominated by childish personal attacks against Steve Milloy) about the repeatability of climate models here on my blog.

In science, it is essential that discoveries be repeatable. It means, for example, that if you think your new drug improves cancer patients' chances of survival because you apparently observe that the survival rate increased a bit in your group, your statement is not yet established science. If your statement about the positive effect of your drug is scientifically true, anyone else must be able to reach the same conclusion.

If you make a calculation whose outcome is the muon/electron mass ratio of around 206.8, it only becomes a part of science once others can repeat the same calculation.
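The spirit of such a check can be sketched in a few lines: the ratio quoted above can be independently recomputed from the measured particle masses (the CODATA-style values below are assumed here purely for illustration):

```python
# Independently reproducing a published number: the muon/electron mass
# ratio recomputed from the measured masses (assumed CODATA-style values).
m_mu = 105.6583745   # muon mass in MeV/c^2 (assumed value)
m_e = 0.5109989461   # electron mass in MeV/c^2 (assumed value)

ratio = m_mu / m_e
print(round(ratio, 1))  # ≈ 206.8, the figure quoted in the text
```

Anyone with the same inputs and the same recipe must land on the same number; that is the whole point of repeatability.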

If other scientists or new teams repeat your experiments or calculations and get different results, such as no correlation, your theory is falsified. The potential for a theory or a claim to be falsified is what distinguishes science from other, less rational human activities. And rest assured that it is not just Karl Popper who considered this idea obvious.

Theories must be falsifiable to be scientific. For example, if you design a very special theory that is only capable of statistically predicting the outcome of an event that occurs only once, it is not a scientific theory. If you predict that there will be a big crunch, that is fine, except that others will think you are wrong. But if you only predict that the probability of a big crunch is 30%, it is not science, because the big crunch will occur at most once and the numerical value 30% has no scientifically definable meaning. It can't be quantitatively measured; it can't be tested.
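A toy illustration of the point, with made-up numbers: a claimed probability only acquires an operational meaning when the event can repeat, because only then does an observed frequency exist that can be compared against the claim:

```python
import random

# Toy sketch (all numbers assumed): a claimed probability of 30% can be
# checked against an observed frequency only over many repeated trials.
# For a one-off event there is no frequency to compare against.
rng = random.Random(0)
claimed_p = 0.30
trials = 10_000
hits = sum(rng.random() < claimed_p for _ in range(trials))
frequency = hits / trials
print(frequency)  # close to 0.30 over many trials; undefined for n = 1
```

With `trials = 1` the "frequency" is either 0 or 1, and no comparison with 30% is possible, which is exactly the situation of a single big crunch.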

Of course, once we have a complete theory of everything, we could become so self-confident that we would calculate even these probabilities although they can't be measured - without any need for further input from observations. But believe me, climate science is very far from this perfect knowledge. We can only become self-confident about our knowledge of a certain class of phenomena after we successfully verify a large enough number of quantitative predictions of our theories - a number that must definitely be much larger than the number of input parameters inserted into our theories. This is probably the case for the Standard Model :-) but it is definitely not the case for climate science, which is why we must be very strict about the repeatability and falsifiability of its predictions.

In a similar way, if your theory designed to predict the climate is so special that you can only determine the statistical distribution of the average global temperature in the period 2015-2020 (you can't do it for short-term temperatures because there are too many fluctuations in this regime) and your recipe breaks down for other periods such as 2040-2045, then it's again not science. Only if you make predictions that are more or less certain, not probabilistic, can they make sense even though your event will only be tested once.

Computers often help us to solve problems; an ugly feature of this approach is that we often don't understand ourselves what's really going on. But if you use computer programs that heavily depend on a random number generator, in such a way that their outcome may be random, there are new constraints that a scientist must obey. If the results of a random computer simulation are random, we can only deduce scientific predictions from the model if we run it sufficiently many times that we can reliably reconstruct the probability distributions - and especially the error margins - for the actual quantities that are well-defined and that we want to predict.
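As a minimal sketch of this procedure (the `noisy_model` below is a made-up stand-in, not any real climate code): run the stochastic model many times, then report the ensemble mean together with its error margin, rather than a single realization:

```python
import random
import statistics

# A toy stand-in for a stochastic simulation: each run returns one noisy
# estimate of the quantity to be predicted. Purely illustrative; this is
# not any actual climate model.
def noisy_model(seed):
    rng = random.Random(seed)
    true_value = 14.0                      # hypothetical target quantity
    return true_value + rng.gauss(0, 0.5)  # run-to-run scatter

# A single run gives one number with no error margin attached to it.
single_run = noisy_model(seed=1)

# Many runs reconstruct the distribution, so an uncertainty can be quoted.
ensemble = [noisy_model(seed=s) for s in range(1000)]
mean = statistics.mean(ensemble)
stderr = statistics.stdev(ensemble) / len(ensemble) ** 0.5
print(f"single run: {single_run:.2f} (error margin unknown)")
print(f"ensemble:   {mean:.2f} +/- {stderr:.2f}")
```

Only the second line of output is a well-defined prediction: a quantity together with its error margin, reproducible by anyone who reruns the ensemble.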

If someone runs a computer program only once, or if he or she picks one of the different results (or spaghetti graphs) that appear in one of many runs of a random program, then he or she is playing stupid computer games rather than doing science. If such an enterprise is to be called science, it must talk about more or less well-defined quantities (the statement "the graph looks like a spaghetti graph that resembles a hockey stick" is not well-defined), and these quantities must be calculable, including the different contributions to the error margins.

Until you have full control over the robustness (and the error margins and the dependence on various assumptions and initial conditions) of your calculations and simulations, you can't claim that you have found scientific evidence for any conclusions.

I am pretty sure that it is not only Steve Milloy who would agree with these elementary comments. This fact makes it even more worrisome that William Connolley, who is paid as a climate modeller, does not seem to agree with these very basic scientific principles.

William, it is really not enough in science to generate a random spaghetti graph using a random, uncontrollable computer model based on random initial conditions - based on something that you are rightfully afraid to show to anyone else - which is apparently what you're doing most of your time.

Steve Milloy's requirements - he demands a publishable program with publishable initial data, including the tracing of their origin, and initialization files - are obviously legitimate; your personal attacks against him are unjustifiable; and until you are able to do science that respects the principles we consider fundamental, you will remain - sorry to say - a crackpot in our eyes. There are very good reasons to think that the calculations behind many of the statements are simply flawed: for example, the hockey stick graph was also argued to be reliable science before it was shown to be a gigantic pile of rubbish (and not just because of one error - it was because of a long sequence of independent and serious errors). This is why it is essential that the programs and their input be available to the scientific or general public for verification - and why your bloody efforts to avoid it only emphasize how inappropriate your approach is.