Monday, January 09, 2006

Connolley in Monte Carlo

Have you seen this cute video? (Via G.D.)

In the previous posting, we tried a couple of thought experiments. One of them was to imagine that the "global temperature" behaves like Brownian motion - an approximation roughly as good as the model of independent "random noise" around a "long-term average", because the long-term persistence of the real climate exceeds that of independent noise but falls short of that of Brownian motion.

By the term "Brownian motion", I always meant a random walk that generates a binomial distribution. Sorry for my sloppy terminology.

In our simple toy model, the annual temperature either rises by dT relative to the previous year, with probability 50 percent, or falls by dT with the same probability. What is the probability that all 7 years 1999-2005 will see a strictly lower temperature than 1998?

William Connolley uses the same technique he uses for the "real" climate - namely Monte Carlo simulations. He wrote the following comment:

  • ... My calculation (by monte-carlo; I guess I should be able to do it exactly but I've forgotten how to if I ever knew) is that the chances are about 1/4 for equal up-down increments ...

OK, what is the correct result? Let us call the temperature in 1998 "zero" and choose units in which "dT=1". In 1999, "T=+1" with probability 50% and "T=-1" with probability 50%. Only the case "T=-1" leaves a chance for 7 cooler years, so 50% of the histories survive 1999. From now on, we only follow the surviving case "T=-1" because in all other cases, our goal has already been violated.

In 2000, the temperature will be either "T=0" or "T=-2". Each has conditional probability 50%, which means 25% of all histories. We will quote the (smaller) fractions of all histories from now on. My original problem was defined so that "T=0" (merely matching the record) already violates the conditions. So only the 25% of cases with "T=-2" in 2000 survive.

Now you can already see that the final result will be below 1/4.

In 2001, it will be either "T=-1" with probability 1/8, or "T=-3" with probability 1/8, too. Both of them work.

In 2002, it will be "T=0" with probability 1/16 - which means a violation - or "T=-2" with probability 1/8 (two paths of 1/16 each), or "T=-4" with probability 1/16. Clearly, only the latter two cases, with a total probability of 3/16, survive.

In 2003, we will have "T=-1" with probability 1/16, "T=-3" with probability 3/32, or "T=-5" with probability 1/32. The total probability remains 3/16.

In 2004, the temperature will be "T=0" with probability 1/32, which means a failure, or "T=-2" or less with overall probability

  • (3/16-1/32) = 5/32

If the temperature in 2004 is "T=-2" or less, then it will be below zero in 2005, too, and we're all set. Once again, the total result: assuming the uniform-step Brownian motion (random walk) model, the probability that all 7 years following any specific year (the one we called 1998) are strictly cooler equals 5/32.
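
If you want to double-check this bookkeeping, the year-by-year propagation above takes a few lines of code. Here is a minimal Python sketch (exact fractions, no simulation; the variable names are mine):

    from fractions import Fraction

    # Propagate the probability mass year by year, as in the text above:
    # after each step, keep only temperatures strictly below the 1998 level.
    dist = {0: Fraction(1)}              # 1998: the temperature is defined as zero
    for year in range(1999, 2006):       # the seven years 1999-2005
        new = {}
        for t, p in dist.items():
            for step in (+1, -1):        # +dT or -dT, each with probability 1/2
                u = t + step
                if u < 0:                # "strictly cooler" survives; T >= 0 is discarded
                    new[u] = new.get(u, Fraction(0)) + p / 2
        dist = new

    print(sum(dist.values()))            # prints 5/32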

What about the Monte Carlo models? Our senior computer modeller told us that the right answer is

  • 1/4 = 8/32

This is 60 percent above the correct result 5/32 (of course, the correct answer is what is taken to be 100%). Imagine that. A simple mathematical task involving one integer variable and seven operations "+1" or "-1" - a task that most of you could have solved analytically in the kindergarten. Even if neither you nor your nurse had known "probabilities", you could have listed those 128 equally likely histories (sequences of small integers) and counted how many of them satisfy the criterion. (A reader in the fast comments obtained this idea independently. A programmer could also use a computer to calculate the precise result by listing all 128 histories.) William Connolley had to use a computer with a Monte Carlo program (for William: programme), and he overshot the correct result by 60 percent anyway.
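
In fact, that kindergarten enumeration itself takes only a few lines. A brute-force Python sketch - just an illustration, not anyone's actual program - that lists all 128 histories and counts the surviving ones:

    from itertools import product
    from fractions import Fraction

    # List all 2**7 = 128 equally likely step sequences and count those in
    # which every year 1999-2005 stays strictly below the 1998 level.
    wins = 0
    for steps in product((+1, -1), repeat=7):
        t, ok = 0, True
        for s in steps:
            t += s
            if t >= 0:        # matching or beating 1998 violates the condition
                ok = False
                break
        wins += ok

    print(wins, "histories of 128 =", Fraction(wins, 128))   # 20 histories of 128 = 5/32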

What probability would you guess after half a minute of thought? 1/6? You would be 9 times closer to the truth than William Connolley with his "scientific approach" involving computers. Or 1/8, obtained by arguing that the warmest year among eight in a more or less random sequence is the first one with probability 1/8? You would still be 3 times closer to the truth than William.

Now imagine that you replace this funny model - where we add ±1 seven times in a row - by a semi-realistic model of the climate that has billions of variables, thousands of physical effects (many of them hypothetical and many more probably missing), and hundreds of mutual relations and feedbacks. These mechanisms involve many non-trivial cancellations that make the individual terms more important than in our case (so that a 60% error in anything is a disaster). You also improve the time resolution by three orders of magnitude and extend the predictions from 7 years to 50 years. Finally, you give the new problem to William Connolley or his friends. What can you expect from their results if they are not even able to calculate 5/32 correctly? What you get is complete chaos, of course. Worthless numbers. Junk. Global warming. Methodological rubbish, to use the words of Hans von Storch.

I suspect that they run their unrealistic computer games - which overshoot the global temperature anyway because they assume, among other things, that ice melts 1,000 times faster than it does - approximately three times, completely without any understanding, intuition, or clue about what's going on in the "black box". And if the third result is sufficiently politically correct and predicts enough global warming to satisfy their brothers and sisters in the "scientific consensus", they promote the result into a scientific conclusion, and their friends at the scientific journals happily publish this new kind of "science". This is what our society pays billions of dollars for.

Biased Brownian motion

We also mentioned the asymmetric case that William Connolley has, surprisingly, calculated pretty well - he obtained 1/10 while the correct result is 13/128. Imagine that the temperature increases by +1.5 with probability 50% and decreases by -1.0 with probability 50% (in centikelvins - but the choice of unit does not matter, of course). What is the probability that all 7 years after 1998 will be cooler? We already know the method, so let's list the percentages:

  • 1999: T=-1 (50%), otherwise T non-negative
  • 2000: T=-2 (25%), otherwise T non-negative
  • 2001: T=-3 (12.5%), T=-0.5 (12.5%), otherwise T non-negative
  • 2002: T=-4 (1/16), T=-1.5 (1/8), otherwise T non-negative
  • 2003: T=-5 (1/32), T=-2.5 (3/32), otherwise T non-negative
  • 2004: T=-6 (1/64), T=-3.5 (1/16), T=-1 (3/64), otherwise T non-negative
  • 2005: T=-7 (1/128), T=-4.5 (5/128), T=-2 (7/128), otherwise T non-negative

The total of the surviving probabilities is 13/128, which is indeed close to 1/10. Once again: assuming Brownian motion with up/down steps in the ratio 1.5 to 1, there is only a 10% probability that after a year like 1998, all seven following years will be cooler.
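
The same brute-force check works for the biased walk. A hedged Python sketch (I count in half-units, so the steps +1.5 and -1.0 become the integers +3 and -2 and all arithmetic stays exact):

    from itertools import product
    from fractions import Fraction

    # Biased walk: +1.5 or -1.0 per year, each with probability 1/2.
    # In half-units these are +3 and -2, which avoids floating-point issues.
    wins = 0
    for steps in product((+3, -2), repeat=7):
        t, ok = 0, True
        for s in steps:
            t += s
            if t >= 0:        # every year must stay strictly below the 1998 level
                ok = False
                break
        wins += ok

    print(Fraction(wins, 2**7))   # prints 13/128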

You should not be shocked that the number 10% is so small. The same calculation applies to other years, not just 1998, and most of them do what 1998 did not: at least one of the following 7 years is warmer. Everything morally agrees with the rules of chance, of course. In the short term, you can't really see any problems with the Brownian motion model - if anything, it is probably better than the "independent years" model.

William obtained a pretty good numerical result for the second task - and you may ask whether that is good news or bad news. It could be good news because he can calculate at least something. Let me offer an alternative explanation. It is bad news because it suggests that he may have had a correct program but does not know how to use it. More precisely, he does not understand that he must repeat the simulation a sufficient number of times for the average of his results to be close to the exact answer. And he should actually figure out the variance of the results generated by his Monte Carlo method, as long as we want to call this repeatable science.

William's large numerical error is kind of baffling because one second of computer time should be enough to find the correct result to several digits of accuracy. Do the math. Repeat the simple history "N" times (each history takes less than a microsecond!), and the relative error should go down like "1/sqrt(N)". You should get 3 valid digits within 1 second (a million microseconds).
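
To see this scaling concretely, here is a minimal Monte Carlo sketch - of course not William's program, just an illustration of how quickly the estimate should approach 5/32:

    import random

    def estimate(trials, rng):
        # Monte Carlo estimate of P(all 7 years strictly cooler) for the +-1 walk.
        hits = 0
        for _ in range(trials):
            t = 0
            for _ in range(7):
                t += rng.choice((+1, -1))
                if t >= 0:
                    break
            else:               # no break: all seven years stayed strictly below zero
                hits += 1
        return hits / trials

    rng = random.Random(0)
    for n in (1_000, 10_000, 1_000_000):
        print(n, estimate(n, rng))   # approaches 5/32 = 0.15625; error ~ 1/sqrt(n)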

This seems to confirm the hypothesis that no one has yet told them that they should estimate the error margins of their climate predictions. It really seems that they imagine that "Monte Carlo" programs should behave just like the casinos in Monte Carlo. Try your luck five times, and when you're lucky and win $100,000 or +8 degrees Celsius of warming per century, go home, turn your computer off, and celebrate! (And publish it, too.)

The CPU time could have been too expensive for William to reduce the error below 60%. But one can then ask: how many times did they actually run the "real" climate models that have a million times as many variables and a time resolution finer by thousands of times? Once? Or twice, choosing the "better" answer? And this single run not only predicts the future but also simultaneously validates the thousands of assumptions and parameters in the model, doesn't it? Because the calculation leads to the Big Crunch singularity of global warming and moreover also shows warming in the present era, the hundreds of assumptions must be correct, right?

Meanwhile, in India, as of today, the recent extreme record cold has killed 200 people. If you help pay for Kyoto+ protocols designed to cool the planet, you may help to double such numbers. You will also help to further improve the snow record in Japan.


snail feedback (13) :


reader Belette said...

Hi Lubos. All good knockabout stuff I guess. My calc is correct if you allow later years to be equal to the first; you insist on strictly less. This is based on 10,000 sims, though it's pretty well the same at 1,000.

Now then... you assert that (for the biased BM) "You will find out that the probability that you won't defeat the warmest year for 7 years is tiny". Now we all agree it's 13/128: this isn't tiny. It's not even past a 95% confidence limit.

Which is just as well: because if it were, there would be something interesting to explain, if your calculations made physical sense. It would have to mean that GW wasn't occurring. And if that were true, it would be hard to understand why you aren't prepared to bet on that conclusion...

And... from what you write, it's pretty clear you've got no idea how GCMs work. You can try http://mustelid.blogspot.com/2005/11/how-coupled-ao-gcms-work.html if you like.


reader Lumo said...

Dear William,

you seem to miss what the words "probability" and "statistics" mean and how these quantities are calculated. After all, you wrote it yourself, so we agree about that.

It is nonsensical to eliminate years with the 95% confidence limit when you're doing climate reconstructions or predictions - although your words suggest that this is exactly what you would be doing.

You can be sure that roughly 5% of the years will be outside the 95% confidence limit. For example, that means about 5 years since 1900 and 50 years since the year 1000. The whole period 1956-2005 could be outside the 95% band among the years 1006-2005. Nothing can be derived from such shaky statistics. It is simply impossible to postulate that all years must satisfy a condition that only 95% of the years are expected to satisfy.

Your explanation of your wrong result for this simple mathematical problem is not gonna convince anyone. You're just trying to "raise" the correct results a posteriori to better match your wrong calculations. Even if you imagine that you overlooked the word "strictly" - which was written everywhere we defined the problem, and overlooking it would already be pretty bad - you will still get results different from the correct ones.

For the symmetric "model", the correct result would be 31/128, which is pretty close to 1/4 albeit not exact. But for the asymmetric model, you get 15/128 - your result is about 15 percent below it. Note that you keep repeating the result 13/128, which is the answer to the "strictly lower" asymmetric problem - which proves that you are unable to calculate these things, despite having a program, even at this very moment.

I find it rather incredible.

If someone - officially an "outsider" - showed me that I was being a complete idiot about the basics of my own field, as just happened to you, I would at least learn how to calculate these things better than the other party. You don't have this driving force.

Whatever you say about the problem you solved, you can never make it so that your errors were below 15%. At any rate, your results are as bad as the random guesses of a person with at least a little intuition. The only reason you use a computer is that you are unable to think, even about very simple things - and the computer at least lifts you to the level of an average human. But you don't actually understand what's going on.

You have given us a flavor of how the GCMs work - but especially of how the people who run them work.

Probabilities like 10% are indeed unimpressive, because of the definition of the problem - the original question used the year 1998 as a parameter. If we looked at other years, we would really see that roughly 80-90% of them satisfy what 1998 does not: at least one year among the 7 that follow is warmer.

But if the year 1998 were "canonical" and "exceptional", or if the cooler years were going to continue, then it would be a hint that a theory about a trend is not terribly well supported by the data.

Comments that I am not allowed to count the temperature of 1998 because it was a "special year" just mean that the people who say this don't respect their own definitions. Of course, it is always possible to twist the global annual temperature by various taxes, social welfare programs, etc. so that the resulting corrected temperatures follow any pattern you want. But that is not science. If we talk about global temperatures as defined today, El Nino is just one of the things that they legitimately capture, and if we want to predict them, we must also carefully take El Ninos into account. Pretending that the climate never has an El Nino is scientifically unjustifiable.

Best
Lubos


reader Lumo said...

A few more numbers showing that your justification of your huge error is absurd. What I want to say is that you could not have made a correct calculation.

If you did 10 thousand simulations of the asymmetric problem with the assumption that the temperatures are allowed to match 1998 (or be lower), you would have obtained 15/128 with a relative error of order "the inverse square root of 10,000".

That would be a 1 percent error. You can never justify the roughly 15 percent error that you actually obtained. The summary is that you don't know how to think, so you try to use computers. But you don't know how to use computers either. And even if you learned how to use them, you don't know how to interpret the results they give you.

Just a little bit of inspection from outside shows that your field - and the thinking of you and your colleagues - is completely disordered.

Best
Lubos


reader MacroMouse said...

Hi Lumo,

not that it makes a big difference, but my calculation of the "symmetric model with zeros accepted (x<=0)" shows a result of 35/128, not 31/128...

Doesn't the misunderstanding between you and Belette stem from the fact that you redefined the original "Brownian motion" problem as a "discrete binomial exercise"??? Normal Brownian motion would have some continuum of possible changes (e.g. normally distributed), not just ±dT with probability 50% each... A solution to the real Brownian motion problem would not have to deal with the "acceptable and not acceptable zeros", etc.

BTW I don't currently have the tools available to do the 7-dimensional integral which is needed to really solve the Brownian motion problem in this specific case (can you do it analytically??), but a simple Excel simulation shows results around 21%, i.e. closer to 1/4 than to 5/32...
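
A Python version of that experiment might look as follows (a sketch; the zero-drift, unit-variance normal increments are an assumption, and, if I recall the classical Sparre Andersen result correctly, the exact answer for any symmetric continuous step distribution is C(14,7)/4^7 ≈ 0.2095):

    import random

    # Gaussian variant: seven zero-mean normal increments per history;
    # count the histories whose partial sums all stay strictly negative.
    rng = random.Random(0)
    trials, hits = 200_000, 0
    for _ in range(trials):
        t = 0.0
        for _ in range(7):
            t += rng.gauss(0.0, 1.0)
            if t >= 0.0:
                break
        else:
            hits += 1

    print(hits / trials)   # around 0.21, matching the Excel estimate above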


reader Lumo said...

Dear Macromouse,

concerning the correct problem, I now agree with you that the correct result is 35/128, which means a 9% error for William - also pretty bad.

The counting of the surviving histories out of 128:

1999: -1 (64)
2000: -2 (32), 0 (32)
2001: -3 (16), -1 (32)
2002: -4 (8), -2 (24), 0 (16)
2003: -5 (4), -3 (16), -1 (20)
2004: -6 (2), -4 (10), -2 (18), 0 (10)
2005: -7 (1), -5 (6), -3 (14), -1 (14)

Concerning the Gaussian, I just can't understand sloppy thinking like yours. I've defined the problem absolutely rigorously, told you what the increments are, what their probabilities are, and what inequalities must be satisfied. The increments have always been +1 or -1, and William was attempting to solve the same problem.

It is legitimate to call this model Brownian motion because its long-term statistical behavior is identical to that of any Brownian motion you have heard of. The short-term behavior is never Gaussian. For example, the velocities in the actual Brownian motion explained by Einstein are piecewise constant in between the collisions. It is only in the long-time-scale approximation that all distributions become Gaussian, by the central limit theorem.

If you want to solve a completely different problem where the distribution is Gaussian, you can solve it, too. But it has absolutely nothing to do with this particular problem we were solving, and there is no way to compare the results of two different problems.

In the asymmetric case, it is not even well-defined what it means to compare the two problems, because you would have to find a relation between the ratio of my increments and the "bias" of your Gaussian in units of the standard deviation.

By the way, aren't you the same person to whom I sent the things above by e-mail?

Best
Lubos


reader Lumo said...

To avoid misunderstanding, let me mention that the "correct problem" in the previous text means "the problem with fixed increments but with the strict bound replaced by the non-strict bound".

To summarize the results:

Strict bounds (as originally defined): 5/32 (symmetric), 13/128 (asymmetric)

William's errors: 60 percent and about 2 percent

Non-strict bounds: 35/128 (symmetric), 15/128 (asymmetric)

William's errors: about 9 percent and 15 percent
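
All four fractions can be reproduced by one exhaustive loop. A Python sketch (the integer half-unit steps +3/-2 for the asymmetric case are just my bookkeeping convention):

    from itertools import product
    from fractions import Fraction

    def prob(up, down, strict):
        # Exact probability that a 7-step walk never climbs above zero
        # (and never touches zero either, if strict=True).
        wins = 0
        for steps in product((up, down), repeat=7):
            t, ok = 0, True
            for s in steps:
                t += s
                if t > 0 or (strict and t == 0):
                    ok = False
                    break
            wins += ok
        return Fraction(wins, 2**7)

    print(prob(+1, -1, True))    # 5/32   (symmetric, strict)
    print(prob(+3, -2, True))    # 13/128 (asymmetric, strict)
    print(prob(+1, -1, False))   # 35/128 (symmetric, non-strict)
    print(prob(+3, -2, False))   # 15/128 (asymmetric, non-strict)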


reader MacroMouse said...

Lumo,

not that it is the most important thing on Earth, but your original problem was:
"From this Brownian viewpoint, it is shocking that the Earth was not able to improve the record temperature for 7 years. Try to do the calculation."

You continued with a simplified model:
"Assume that every year, the temperature either goes up by +dT, or goes down by -dT."

In other words, the original problem was Brownian motion; the simplified model, the one that was actually examined, was a discrete binomial distribution...

I agree that we now all agree on the correct solution to the now well-defined "problem with fixed increments", but the original question - what is the probability if the climate behaves like Brownian motion - is still not answered.

"By the way, aren't you the same person whom I sent the things above in the e-mail?"
Probably not...


reader Lumo said...

Dear Macromouse,

I also hope it's not such a big deal - but there was absolutely no ambiguity about the distribution of the steps in my toy model. Each step was either +1 or -1.

You're not right that this toy model should not be called a model of Brownian motion - although the words "random walk" would have been more accurate. It is as good a Brownian motion as the water molecules hitting the dust particles - where the changes of the velocities are also discrete at short time scales, and the Brownian motion statistics only emerges at long time scales.

I chose the discrete microscopic model of the Brownian motion (or random walk) because the qualitative conclusions are identical while it is much easier to calculate than the exact Gaussian integrals, which would require including moments of the error function.

Your own inability to calculate the result for the Gaussian case demonstrates pretty well that I made the right choice when I defined the Brownian motion model using the binomial microscopic rules. In physics, it is often important to find the right model and/or approximation in which progress can be made and at least approximate understanding can be reached.

But what I find strange is that we had absolutely clear rules about which models we were solving, yet you decided to spread something that - I apologize - should be called fog, and to pretend that one should actually solve a different model to answer my question. I just don't understand why you find such an approach so easy.

It's just like in Nature. The rules of the game are completely well-defined. Our models should match the actual reality with everything it has. If you solve a different model - for whatever reason, because it is mathematically easier or, on the contrary, mathematically challenging - and get very different results, it's your problem. Nature does not care, and She has her own rules.

In the toy model, we had clear rules, too.

This approach of spreading fog is exactly what William used under the previous article, too. He had to know that he did not have the correct results, yet he decided to generate some fog about the question whether I had calculated the problem myself. Of course I had. This is what these charlatans are doing all the time. They just produce a guess whose purpose is to shed negative light on some of their opponents, or to question their opponents' ideas or calculations.

They don't know whether it is the correct result, but they hope that it will do the job of discrediting their opponents and that no one will be able to find out that they are liars. They're fraudsters like Hwang - just the relevant questions are different and less well-defined.

Unfortunately for Mr. Connolley, in this case it was a matter of 3 minutes to show that he was not telling the truth.

Best
Lubos


reader MacroMouse said...

This will be my last post on this topic because it seems that we are starting to move in circles.

Lumo said:
"But what I find strange is that we have absolutely clear rules what models we are solving, yet you decide to spread something that I apologize should be called fog and pretend that one should actually solve a different model to answer my question."

Just to remember, your post begins with: "In the previous posting, we tried a couple of thought experiments. One of them was to imagine that the "global temperature" behaves like Brownian motion"

One would think that after two centuries the Brownian motion problem would be well defined - let's take e.g. the Wikipedia definition:
"Mathematically, Brownian motion is a Wiener process in which the conditional probability distribution of the particle's position at time t + dt, given that its position at time t is p, is a normal distribution with a mean of p + μ dt and a variance of σ2 dt; the parameter μ is the drift velocity, and the parameter σ2 is the power of the noise."

We (i.e. me, you and Wikipedia) will agree that the discrete "random walk" relates to "Brownian motion"; e.g. Wikipedia says:
"Brownian motion is the scaling limit of random walk in dimension 1. This means that if you take a random walk with very small steps you get an approximation to Brownian motion."

So hopefully we will agree that we were able to study "a discrete approximation to Brownian motion"...

Have fun (= I don't want to argue, it is not worth it...).


reader John G. Bell said...

Luboš, looking back at a coin-toss trial and finding, say, 5 heads in a row doesn't tell us that the odds weren't 50-50. You can't cherry-pick data like that. Now, if we decide to look at the next 8 years and they all get colder, we would have something to talk about.

William Connolley looks foolish mostly in not pointing this out.


reader Lumo said...

Yes, yes, I should have used the words "random walk" instead of "Brownian motion". Still, I believe that the problem was well-defined. I don't want to argue either. ;-)


reader Lumo said...

I completely agree that it is cherry-picking to start with 1998 in playing this game, and the main reason why William does not repeat it is that I wrote it myself.

Comparing our current temperatures to the little ice age - and other things - is the same amount of cherry-picking. Best wishes, Lubos


reader Quantoken said...
This comment has been removed by a blog administrator.