## Thursday, January 15, 2009

### Record-breaking years in autocorrelated series

As Rafa has pointed out, E. Zorita, T. Stocker, and H. von Storch have a paper in Geophysical Research Letters, "How unusual is the recent series of warm years?", in which they claim that even if we consider temperature to be an autocorrelated function of time with sensible parameters, there is only a 0.1% probability that the 13 hottest years in the list of 127 years (since 1880) appear in the most recent 17 years, much like they do in reality according to HadCRUT3/GISS stations.

If we add a non-autocorrelated noise, typical for local temperature data, the temperature readings become more random and a similar clustering of records becomes even less likely because the autocorrelation that keeps the probability of clustered records from becoming insanely low is suppressed. This matches the reality, too, because local weather records usually don't have that many record-breakers in the recent two decades.

What percentage of civilized planets shoot An Inconvenient Truth?

But after detailed simulations, I am confident that the main statement of their paper about the probability in the global context - 0.1% (that would strongly indicate that the recent warm years are unlikely to be due to chance) - is completely wrong.

The correct figure for the global case is between 5% and 10% (depending on the damping of the long-term memory; we will argue at the end that the 10% figure is realistic), if you allow record cold years as well as record hot years, which you should because both possibilities could feed alarmism. If you ask about strict record hot years only, pretending that the alarmists wouldn't exist if we were breaking record cold years :-), you should divide my probability by two.

The last alarmist planet I generated: temperature anomaly in °C in the last 127 years. About 10% of randomly generated realistic temperature data look like this and satisfy the 127/13/17 record-breaking condition - by chance.

At any rate, the probability is rather high and it is completely sensible to think that the clustering of the hottest years into the recent decades occurred by chance. In roughly one decade per century, we get the opportunity to see this "miracle" (13 hottest years occurring in the last 17 years).

The odds are pretty much unchanged if you add two years (2007, 2008) to their paper, i.e. if you go from {127,13,17} to {129,15,19}.

Here is how I calculated it

I've generated thousands of "planets". By a planet, I mean its temperature series, namely 127 random numbers describing the global mean temperature anomaly (deviation from a hypothetical long-term average) between 1880 and 2006. (It's simple to change these numbers in the notebook.) So what does one temperature series look like?

The numbers in it are random but they are autocorrelated because the temperature has some inertia. A year is unlikely to have a radically different temperature than the previous one because the temperature is continuous, after all. I am absolutely convinced that the correct model is a random walk or a Brownian motion, if you wish. But it must be damped. Why? What do I mean?

Damped random walk

I mean that I determine the temperature anomaly in the year Y from the temperature anomaly in the year Y-1 by adding a random number between -0.5 and +0.5.

The overall scaling of this interval doesn't matter for the record-breaking questions because it only determines the units of the temperature anomaly. So let us keep it at -0.5...+0.5 at all times, except for the very final paragraphs of this text, assuming that the reader has no difficulty rescaling this whole discussion into different units of temperature. The precise distribution from which we choose the step wouldn't matter much either.

Now, you should immediately protest. Such a model of the climate would resemble the Brownian motion. After a very long time "t", the typical temperature anomaly would scale like "sqrt(t)". It would drift away. That's unrealistic because "sqrt(4 billion years)" would still be many thousands of degrees. ;-) The number "sqrt(t)" is very small compared to "t" but it still diverges if "t" goes to infinity.

There must clearly exist and there do exist mechanisms that prevent the temperature anomaly from going infinitely far.

So what I do add is the damping. To determine the temperature anomaly in the year Y, we add our random number between -0.5 and +0.5 to the reading from the year Y-1. But we follow this step by another step, namely by multiplying the temperature anomaly for the year Y by a factor "damping" that is chosen to be 0.99 in the recommended calculation in the notebook (and as 0.9995 later, as explained at the end). This "damping" reflects the climatic mechanisms that try to heat the planet up if it gets too cold (e.g. by absorbing more of the Sun's heat than the Earth emits thermally) and vice versa.
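The two-step rule above (add a random jump, then multiply by "damping") can be sketched in a few lines of Python. This is only an illustration, not the Mathematica notebook itself: the function name `generate_planet` and its arguments are made up here, and the series is assumed to start at zero anomaly in the first year.

```python
import random

def generate_planet(n_years=127, damping=0.99, seed=None):
    """Return n_years temperature anomalies from a damped random walk.

    Hypothetical helper: the first year is assumed to sit at the
    long-term average (anomaly 0).
    """
    rng = random.Random(seed)
    anomaly = 0.0
    series = [anomaly]
    for _ in range(n_years - 1):
        step = rng.uniform(-0.5, 0.5)         # random annual jump
        anomaly = damping * (anomaly + step)  # add the jump, then damp
        series.append(anomaly)
    return series

planet = generate_planet(seed=1)
print(len(planet), min(planet), max(planet))
```

Setting `damping=1.0` in this sketch gives back the undamped random walk that drifts away like "sqrt(t)".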

It turns out that even though this "damping" parameter is very close to 1, this step completely stabilizes the long-term behavior of the temperature and keeps the deviations finite. (The Brownian divergences would return for "damping=1", of course.) By Monte Carlo methods, I have calculated that the long-term standard deviation of the temperature anomaly actually goes like
SD = 0.2 / sqrt(1-damping)
Again, this is for the annual step being a random number between -0.5 and +0.5. It shouldn't be hard for you to rescale the random step and the standard deviation by the same factor. I have done extensive and accurate enough statistics to argue that for a "damping" parameter very close to one (i.e. a damping that operates very slowly), the formula above actually becomes exact, despite the surprising factor of 1/5 in the numerator.

(Pavel Krapivsky convincingly argues that the correct numerator is actually not 1/5=sqrt(1/25) but rather sqrt(1/24).)

I could probably prove the formula analytically but Monte Carlo should be enough for our "applied" purposes.
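For what it's worth, a short stationarity argument seems to reproduce Krapivsky's numerator. It assumes that the annual steps are independent, that the jump is added before the multiplication by "damping" (written as d below), and that the variance has already settled to its long-term value:

```latex
\begin{align}
T_Y &= d\,\bigl(T_{Y-1} + \epsilon_Y\bigr), \qquad
  \epsilon_Y \sim U\!\left(-\tfrac12, \tfrac12\right), \quad
  \mathrm{Var}(\epsilon_Y) = \tfrac{1}{12} \\
\sigma^2 &= d^2\left(\sigma^2 + \tfrac{1}{12}\right)
  \;\Longrightarrow\;
  \sigma^2 = \frac{d^2}{12\,(1-d)(1+d)} \\
\sigma &\approx \frac{\sqrt{1/24}}{\sqrt{1-d}}
  \approx \frac{0.204}{\sqrt{1-d}} \qquad (d \to 1)
\end{align}
```

For d = 0.99, the exact expression gives about 2.03 and the approximation about 2.04, both close to the Monte Carlo figure of 0.2/sqrt(0.01) = 2.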

So if the annual jump we started with is between -0.5 and +0.5 °C which is reasonable and if we choose "damping=0.99", the very long-term standard deviation of the temperature anomaly will be simply 0.2 °C / sqrt(0.01) = 2 °C. These figures sound completely sensible. You could actually increase "damping" a little bit, and we will do so at the end. I also claim that the predominantly Brownian motion of the climate is consistent with all tests of autocorrelation that have been done.

With the numbers {127, 13, 17, 0.99}, I get about 7% of the "planets" having the record-breakers in the most recent years. And the percentage converges to 10% or so if we choose a higher value of "damping". Autocorrelation (or "inertia") is enough to make the clustering of record-breakers a mundane event.
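A rough Python sketch of this counting experiment follows. It is a reconstruction under assumptions, not the notebook's code: I read the {127, 13, 17} condition as "the 13 most extreme years all lie within the last 17 years", with cold extremes counted as well when `both_signs` is true, and the function names are invented for illustration. The fraction it prints depends on the seed and on the number of planets, so no particular value is promised here.

```python
import random

def has_clustered_records(series, n_records=13, window=17, both_signs=True):
    """True if the n_records hottest (or, optionally, coldest) years
    all fall within the last `window` years of the series."""
    n = len(series)
    order = sorted(range(n), key=lambda i: series[i])  # indices, coldest first
    first_recent = n - window
    hot = all(i >= first_recent for i in order[-n_records:])
    cold = all(i >= first_recent for i in order[:n_records])
    return hot or (both_signs and cold)

def fraction_of_alarmist_planets(n_planets=2000, n_years=127, damping=0.99,
                                 n_records=13, window=17, seed=0):
    """Monte Carlo estimate of how often the clustering occurs by chance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_planets):
        anomaly, series = 0.0, [0.0]           # assumed start at zero anomaly
        for _ in range(n_years - 1):
            anomaly = damping * (anomaly + rng.uniform(-0.5, 0.5))
            series.append(anomaly)
        if has_clustered_records(series, n_records, window):
            hits += 1
    return hits / n_planets

print(fraction_of_alarmist_planets())
```

Passing `n_years=129, n_records=15, window=19` tries the extended {129,15,19} case, and `both_signs=False` in `has_clustered_records` restricts the test to record hot years only.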

And that's the memo. Thanks to Rafa for pointing out the article by Zorita et al. to me. By the way, the damped random walk is known as an AR(1) process and it is "wide sense" (weakly) stationary.

Bonus: realistic numbers

Finally, I decided to create a completely realistic picture of the planet, with the right parameters and normal distributions:
Mathematica notebook (PDF review)
The notebook also shows the picture of the "last alarmist planet" that we used above.

In this improved version of the notebook, the annual jump is taken to obey a normal distribution. The standard deviation of this normal distribution, sdJump=0.1082 °C, was extracted from the annual GISS data. For a normal distribution, the long-term standard deviation of the temperature anomaly itself is close to sqrt(1/2)*sdJump/sqrt(1-damping), and if we want this ratio to be close to a realistic deviation of 3.4 °C (which would simply attribute a major part of the glaciation cycles to the Brownian noise) and if we realize sqrt(1/2)=0.7 or so, we have to choose damping=0.9995. With a "damping" factor this close to one, the temperature dynamics in 127 years is essentially indistinguishable from an exact Brownian motion.
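The arithmetic behind damping=0.9995 is easy to check. The snippet below only evaluates the approximate formula quoted above and inverts it; the helper names are mine, not the notebook's.

```python
import math

SD_JUMP = 0.1082   # °C, annual jump quoted in the text
DAMPING = 0.9995

def long_term_sd(sd_jump, damping):
    """Approximate long-term standard deviation for damping close to one."""
    return math.sqrt(0.5) * sd_jump / math.sqrt(1.0 - damping)

def damping_for_target_sd(sd_jump, target_sd):
    """Invert the formula above: which damping gives the target deviation?"""
    return 1.0 - 0.5 * (sd_jump / target_sd) ** 2

print(round(long_term_sd(SD_JUMP, DAMPING), 2))        # 3.42
print(round(damping_for_target_sd(SD_JUMP, 3.4), 4))   # 0.9995
```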

Correspondingly, the probability to see 13 record-breakers (either hot or cool) in the most recent 17 years out of 127 years exceeds 10%, about 100 times higher than the probability 0.1% claimed by von Storch et al.! And it is still above 5% if you don't consider 13 record cold years equally alarming. Either your humble correspondent or von Storch et al. have seriously screwed up the calculation, haven't they? ;-)

Meanwhile, the statistics of recent record-breaking years doesn't provide us with any statistically significant evidence for any trend that would deviate from mundane randomness.

A detail: initial conditions in 1880

In the older versions of the notebook, I set the temperature anomaly at the beginning, in 1880, to zero (which is the long-term average). If you set it to a random value in the long-term distribution, by
a[[1]] = RandomReal[ NormalDistribution[0, Sqrt[1/2] * sdOfRandomAnnualStep / Sqrt[1 - damping] ]];
before the "For i=2" cycle (also, an "a[[t]]<0" condition had to be changed to "a[[t]]<a[[1]]" later), the probability around 10% is almost unchanged because the dynamics in 127 years is "almost always" just a Brownian motion, anyway, and the damping (and initial conditions) don't matter.
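In Python, the same change of initial conditions might look as follows. Again, this is a sketch with invented names rather than a translation of the notebook; it uses the normally distributed jump with the 0.1082 °C standard deviation quoted earlier and draws the 1880 value from the approximate long-term distribution.

```python
import math
import random
import statistics

def generate_planet_stationary_start(n_years=127, sd_jump=0.1082,
                                     damping=0.9995, rng=None):
    """Damped random walk whose first year is drawn from the
    (approximate) long-term distribution instead of being set to zero."""
    rng = rng or random.Random()
    sd_long_term = math.sqrt(0.5) * sd_jump / math.sqrt(1.0 - damping)
    anomaly = rng.gauss(0.0, sd_long_term)    # random anomaly in 1880
    series = [anomaly]
    for _ in range(n_years - 1):
        anomaly = damping * (anomaly + rng.gauss(0.0, sd_jump))
        series.append(anomaly)
    return series

rng = random.Random(42)
starts = [generate_planet_stationary_start(rng=rng)[0] for _ in range(2000)]
print(statistics.pstdev(starts))  # spread of the 1880 values across planets
```

Any record-breaking test applied to such a series has to compare years with each other (or with the first year) rather than with zero, which is the point of the "a[[t]]<a[[1]]" change mentioned above.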

For "damping" further from one, the initial conditions would matter because it would sometimes be easier/harder to surpass the warm/cool years at the beginning of the period. But the difference would only occur at the second order, and I am not sure about the sign.

Game: Brownian temperature changes at different timescales

I want to assure you that the random walk model is extremely powerful as a rule to understand the expected random changes of temperature over various time scales. As we have explained, the GISS data show that the annual (1-year) change of the temperature is typically 0.108 °C. Let's call it 0.1 °C. Because the distances in a random walk are proportional to the square root of time only, we obtain the following table of expected temperature changes (standard deviations of the differences of temperatures between moments separated by the time indicated):
• 1 year: 0.1 °C
• 100 years: 1 °C
• 400 years: 2 °C
• 900 years: 3 °C
• 1 600 years: 4 °C
• 2 500 years: 5 °C
• 10 000 years: 10 °C
In the last line or slightly earlier, the "damping" starts to be very important so the actual changes are somewhat smaller than indicated. But the glaciation cycles can be almost entirely due to randomness hiding in the random walk.
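The table is just 0.1 °C times the square root of the number of years, which the following sketch reproduces. The second column is my own addition and rests on an assumption: it treats the damped walk as a stationary AR(1) process, whose correlation between years separated by N is damping^N, and uses damping=0.9995 to show where the damping starts to bite.

```python
import math

SD_ONE_YEAR = 0.1  # °C, rounded annual jump from the text

def random_walk_sd(years, sd_one_year=SD_ONE_YEAR):
    """Undamped random walk: the typical change grows like sqrt(years)."""
    return sd_one_year * math.sqrt(years)

def damped_walk_sd(years, sd_one_year=SD_ONE_YEAR, damping=0.9995):
    """Stationary damped walk (assumption: lag-N correlation is damping**N)."""
    variance = damping**2 * sd_one_year**2 / (1.0 - damping**2)
    return math.sqrt(2.0 * variance * (1.0 - damping**years))

for years in (1, 100, 400, 900, 1600, 2500, 10000):
    print(years, round(random_walk_sd(years), 2), round(damped_walk_sd(years), 2))
```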

When I drew the actual "typical jump" over N years as seen in the GISS data, the jump actually didn't increase at all between 1 and 4 years. That may be an indication of a non-Brownian, quasi-periodic, ENSO-like variability that contributes to these short-term jumps. However, above 4 years, the sqrt(time_separation) rule for the temperature jump was working pretty well. Nevertheless, this "stagnation" between 1 and 4 years indicates that you should divide all the temperature jumps above 1 year in the table above by a factor of two, starting with 0.5 °C per century and ending with 5 °C for 10,000 years of separation. It makes perfect sense, doesn't it?

I believe that all these observations are counter-intuitive for most people because everyone tends to imagine one kind of "randomness" only - where the sign of the temperature anomaly should pretty much alternate. However, it is really the temperature change, and not the temperature itself, that behaves randomly, as a white noise, and that implies completely different results. As we have seen, the temperature anomaly may remain positive for many thousands of years.

Finally, I want to say that this article dealt with the "weakly stationary" climate where the "expected" temperature doesn't change. There are all kinds of effects that invalidate this assumption - solar, cosmic, man-made, switching modes of the oceans, etc. But the point of this exercise was to see that there can be a lot of intense, trend-like temperature evolution for long periods of time (such as decades and centuries) without introducing very unlikely effects and without threatening the long-term stability of the Earth's climate, plus or minus 5-10 °C, that is important for the survival of life.

The "time constant" indicated by Stephen Schwartz is only 5 years, hinting at "damping=0.8" or so. However, this might apply to oceans only. The land can have much more freedom to heat up or cool down, corresponding to its much lower effective heat capacity (of the layers that get mixed efficiently). Land and oceans interact, after all, so this combination would probably be a tougher system to analyze.
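One way to translate between a time constant and the "damping" factor, assuming the time constant means the e-folding time of an exponential relaxation sampled once a year (my reading, not necessarily Schwartz's definition), is damping = exp(-1/tau):

```python
import math

def damping_from_time_constant(tau_years):
    """Annual damping factor for an e-folding time of tau_years (assumption)."""
    return math.exp(-1.0 / tau_years)

def time_constant_from_damping(damping):
    """Inverse relation: e-folding time in years for a given annual damping."""
    return -1.0 / math.log(damping)

print(round(damping_from_time_constant(5), 2))   # 0.82
print(time_constant_from_damping(0.99))
print(time_constant_from_damping(0.9995))
```

Under this reading, damping=0.99 and damping=0.9995 correspond to time constants of roughly a century and roughly two millennia, respectively.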