MSNBC and others run a story about the asymmetry between the number of daily hot records and the number of daily cold records in the U.S.
Note that in the 1960s and 1970s, the cold records beat the warm records approximately by a 1.3 to 1 ratio.
The latest ratio of warm vs cold records is 2 to 1.
If there were no trend whatsoever, you would statistically expect a 1 to 1 ratio, of course.
But what the media don't seem to convey is how sensitively the ratio depends on the number of years on record and on the ratio between the trend and the annual random fluctuations. In the simple model below, these are the only two parameters that determine the ratios.
Some simple simulations
Let's assume that there are 100 years on record. Also, model the temperature on a given day as an underlying trend plus a random number without autocorrelation (white noise), which is appropriate for local daily temperatures. We want to know the probability that Friday, November 13th, beats the previous ninety-nine November 13th readings in Cambridge, MA and becomes the warmest November 13th on record, and the corresponding probability that it becomes the coldest one.
If there is no trend (and the noise is white), both probabilities are evidently equal to 1% because each year among the 100 years has the same chance to be the record holder (both for warm and cold records). From now on, let's assume that the underlying trend is 0.6 °C per century. And we will list the ratios for different standard deviations of the white noise, a random number that is being added to the daily local temperatures in each year.
If the standard deviation is
* 0.0 °C (no noise), you get 100% of warm records, 0% of cool records, and an infinite ratio
* 1.0 °C (small noise), you get 2% of warm records, 0.5% of cool records, 4:1 ratio
* 2.0 °C, you get roughly a 1.4% chance of a warm record, a 0.7% chance of a cold record, and a 2:1 ratio
* 3.0 °C, you get 1.3% warm, 0.85% cool, 1.6:1 ratio (or so)
* many °C, you get 1% warm, 1% cool, 1:1 ratio
The figures are approximate. I could probably calculate them analytically, without my Monte Carlo program, but let's not waste too much time now.
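For readers without Mathematica, here is a minimal Monte Carlo sketch of the model described above, written in Python. It is not the original program: the function name `record_probabilities`, its parameters, the number of trials, and the seed are all assumptions made for illustration. It draws many independent 100-year histories (linear trend plus white Gaussian noise) and counts how often the final year is the warmest or the coldest of the series.

```python
import random


def record_probabilities(n_years=100, trend_per_century=0.6, sigma=2.0,
                         trials=20000, seed=0):
    """Estimate P(final year is warmest) and P(final year is coldest).

    Assumed model: temperature = linear trend + white Gaussian noise.
    All names and defaults here are illustrative, not from the original.
    """
    if n_years < 2:
        raise ValueError("need at least two years to talk about a record")
    rng = random.Random(seed)          # seeded for reproducible estimates
    slope = trend_per_century / 100.0  # trend in °C per year
    warm = cold = 0
    for _ in range(trials):
        temps = [slope * year + rng.gauss(0.0, sigma)
                 for year in range(n_years)]
        last, earlier = temps[-1], temps[:-1]
        if last > max(earlier):
            warm += 1                  # final year sets a warm record
        elif last < min(earlier):
            cold += 1                  # final year sets a cold record
    return warm / trials, cold / trials


if __name__ == "__main__":
    warm, cold = record_probabilities(sigma=2.0)
    print(f"sigma=2.0: warm {warm:.2%}, cold {cold:.2%}")
    if cold > 0:
        print(f"warm-to-cold ratio about {warm / cold:.1f}:1")
```

Because record probabilities of order 1% are estimated from a finite number of trials, the output fluctuates by a few tenths of a percentage point and need not match the rounded figures in the list above; raising `trials` tightens the estimate at the cost of run time.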
You can see that a standard deviation of 2.0 °C (with the 0.6 °C underlying centennial trend) gives a ratio similar to that of the 1990s in the U.S. The warm-to-cool record ratio dramatically decreases from infinity:1 towards 1:1 as the noise-to-trend ratio increases.
If you replace a warming trend by a cooling trend, the roles of the warm and cool records get interchanged, of course.
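The analytic calculation alluded to above can be sketched as follows. The notation is an assumption introduced here, not taken from the original notebook: $b$ is the trend per year, $\sigma$ the standard deviation of the white noise, $N$ the number of years on record, and $\varphi$, $\Phi$ the standard normal density and cumulative distribution function. The final year is a record exactly when its noise term beats the noise of every earlier year after accounting for the trend offset $kb$ accumulated over $k$ years, so

$$
P_{\rm warm} = \int_{-\infty}^{\infty} \varphi(x) \prod_{k=1}^{N-1} \Phi\!\left(x + \frac{kb}{\sigma}\right) dx,
\qquad
P_{\rm cold} = \int_{-\infty}^{\infty} \varphi(x) \prod_{k=1}^{N-1} \Phi\!\left(-x - \frac{kb}{\sigma}\right) dx .
$$

Replacing $b$ by $-b$ and substituting $x \to -x$ turns one integral into the other, which is the interchange of warm and cool records. For $b = 0$ both integrals reduce to $1/N$, and they depend on the trend and the noise only through the combination $b/\sigma$, together with $N$.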
Extending the chronicles
What happens if you change the number of years? Let's go from 100-year records to 200-year records. Clearly, it becomes tougher to beat the records. For no warming trend, you will get a 0.5% chance of a warm record and a 0.5% chance of a cold record. Let's keep the 0.6 °C warming trend per century, i.e. a 1.2 °C warming trend per two centuries. What happens to the table above?
If the standard deviation of the daily noise is
* 0.0 °C (no noise), you get 100% of warm records (unchanged), 0% of cool records, and an infinite ratio: unchanged
* 1.0 °C (small noise), you get 1.6% of warm records (slightly decreased, by less than a factor of two), 0.05% of cool records (a huge drop, by much more than a factor of two), 30:1 ratio
* 2.0 °C, you get roughly a 1.1% chance of a warm record, a 0.2% chance of a cold record, and a 5:1 ratio
* 3.0 °C, you get 0.9% warm, 0.3% cool, 3:1 ratio (or so)
* many °C, you get 0.5% warm (dropped to one-half), 0.5% cool (dropped to one-half), 1:1 ratio (fixed)
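The comparison between 100 and 200 years can also be checked without random sampling. The sketch below is an assumption-laden illustration rather than the original method: it evaluates the probability that the final year is a record as a one-dimensional integral over that year's noise value, with the trapezoidal rule. The function name `record_probability`, the grid limits `x_max`, and the step count are hypothetical choices, and the noise must have `sigma > 0`.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def record_probability(n_years, trend_per_century, sigma, warm=True,
                       x_max=8.0, steps=1600):
    """P(final year is the warm, or cold, record) for trend + white noise.

    Integrates pdf(x) * prod_k cdf(+-(x + k * shift)) over the noise x of
    the final year, where k counts years back and shift is the yearly
    trend in units of sigma. Requires sigma > 0.
    """
    shift = trend_per_century / 100.0 / sigma
    sign = 1.0 if warm else -1.0
    h = 2.0 * x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = -x_max + i * h
        value = norm_pdf(x)
        for k in range(1, n_years):
            value *= norm_cdf(sign * (x + k * shift))
            if value == 0.0:
                break                  # product has underflowed to zero
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * value
    return total * h


if __name__ == "__main__":
    for n_years in (100, 200):
        w = record_probability(n_years, 0.6, 2.0, warm=True)
        c = record_probability(n_years, 0.6, 2.0, warm=False)
        print(f"{n_years} years: warm {w:.3%}, cold {c:.3%}")
```

With no trend the integral equals 1/N, which gives a quick sanity check on the quadrature. The printed values are what this idealized model yields and need not coincide with the rounded Monte Carlo figures in the lists above.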
So what's going on if the number of years is being increased? Well, if there is a warming trend, it is becoming increasingly likely that the warm record occurs in a "recent year" while the cool record occurs in a "distant past".
Concerning the particular percentages, the probability of a warm record is not changing much. If the "integrated trend over noise" ratio is high enough, the probability of a warm record converges to a constant that is independent of the number of years on the record: that's what you see at the top of the table (small noise).
On the other hand, the probability of a cool record decreases brutally if the number of years on the record is being increased. Well, the temperatures were really lower in the 1880s and it becomes increasingly unlikely for the "current" white noise to beat the coldest readings of the 19th century.
For the cool records that become very rare, their probability starts to depend primarily on the "integrated trend over noise" ratio, regardless of the number of years on the record. And that probability reflects the tail of a normal distribution: it is suppressed in the Gaussian way once the "integrated trend over noise" ratio becomes substantially greater than one.
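One way to see the Gaussian suppression is a crude upper bound, sketched here in notation that is not in the original ($b \ge 0$ is the trend per year, $\sigma$ the noise standard deviation, $N$ the number of years, $T_j$ the reading in year $j$, and $\Phi$ the standard normal cumulative distribution function). A cold record in the final year requires, in particular, that the final year be colder than the first one. The difference $T_N - T_1$ is normally distributed with mean $b(N-1)$ and variance $2\sigma^2$, so

$$
P_{\rm cold} \;\le\; P(T_N < T_1) \;=\; \Phi\!\left(-\frac{b(N-1)}{\sqrt{2}\,\sigma}\right) \;\le\; \frac{1}{2}\exp\!\left(-\frac{b^2 (N-1)^2}{4\sigma^2}\right).
$$

This is only a bound, and a loose one when the integrated trend $b(N-1)$ is small compared with $\sigma$, but it shows that the exponent is quadratic in the "integrated trend over noise" ratio.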
So getting a "50 to 1" ratio of the warm records and cool records would be nothing extraordinary. It would simply mean that for many years, the Earth stays detectably warmer than it used to be in the 1880s etc. The large deviation of "50 to 1" from "1 to 1" simply expresses the certainty that the current climate is warmer than in the 1880s.
But the number "50" isn't proportional to any "severity of the impact of warming". On the contrary, the "1/50" ratio is naturally dropping, faster than exponentially (in the Gaussian way) as a function of the integrated warming trend. So it simply can't be shocking that it is large.
Because the warm-to-cool record ratio only sits around 2:1, you might say that even the very statement that the Earth is warmer than it was 100 years ago is supported just by "pretty weak statistics". The "accumulated warming" since the beginning of the records is just becoming comparable to something like one quarter of the "daily temperature fluctuations" (see the more precise data above).
So the underlying trend can already be statistically isolated from the noise, by taking a large number of days and stations into account, but the warming doesn't yet influence a single weather station on a single day because the accumulated warming is still substantially smaller than the daily variations.
I would like to emphasize that one needs to do the maths well and stay rational about all these things. Various statistical quantities that depend on the warming are highly nonlinear functions of the warming trend and we shouldn't blindly identify them. For a fixed standard deviation of the noise, the probability of a new cold record is an extremely quickly decreasing function of the "accumulated warming". This fact follows from mathematics - from basic properties of the normal distribution - and no rational person could see any "warning sign" in this mathematical fact, as MSNBC irrationally screams.
This statistical treatment of warm and cold records can be a pretty convenient way to express and parameterize the statistical significance of the underlying warming/cooling trend relative to the noise. But one warning deserves a special sentence: none of these data tell us anything about the causes of the underlying trends.
See an extremely simple Mathematica notebook for simulations that generated the numbers above: NB file, PDF preview.