Increased Variability?

In the last post I mentioned that a change in either the mean value or the variance of a probability distribution will greatly affect the probability of extreme events. I also mentioned that a change in variance has a more profound impact on the likelihood of extremes than a change in mean value.
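
To make that concrete, here is a minimal sketch (an illustration only, not anything taken from the last post). It assumes a normal distribution and calls anything more than 3 baseline standard deviations above the baseline mean "extreme". The 0.5 shift in the mean and the 50% increase in standard deviation are arbitrary choices, and `upper_tail` is a hypothetical helper name.

```python
import math

def upper_tail(threshold, mean=0.0, sd=1.0):
    """P(X > threshold) for X ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# "Extreme" = more than 3 baseline standard deviations above the baseline mean.
threshold = 3.0

p_base = upper_tail(threshold)             # baseline: mean 0, sd 1
p_shift = upper_tail(threshold, mean=0.5)  # assumed shift: mean moves up by 0.5 sd
p_wide = upper_tail(threshold, sd=1.5)     # assumed widening: sd grows by 50%

print(f"baseline:   {p_base:.5f}")
print(f"mean shift: {p_shift:.5f}")
print(f"wider:      {p_wide:.5f}")
```

Under these assumptions the tail probability goes from about 0.13% at baseline to about 0.62% with the mean shift and about 2.3% with the wider distribution. How the two effects compare depends, of course, on the sizes chosen for the shift and the widening.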

One of the things which led me to believe that we have witnessed a change in variance is a paper by Hansen et al. which discusses many aspects of temperature change, including variability in temperature. The basic kind of graph (of which there are many versions) from which they conclude increased variability is this one (left graph of figure 9):

The caption describes the graph thus:

Frequency of occurrence (y-axis) of local temperature anomalies divided by local standard deviation (x-axis) obtained by binning all local results for 11-year periods into 0.05 frequency intervals. Area under each curve is unity. Standard deviations are for the indicated base periods.

Note that the most recent 11-year period shows much greater spread than the earliest ones. The “conclusions” section specifically mentions (my emphasis):

Seasonal-mean temperature anomalies have changed dramatically in the past three decades, especially in the summer. The shift of the probability distribution (Fig. 9, left) is more than one standard deviation. In addition, the probability distribution broadens, the warming shift being greater at the high temperature tail of the distribution than at the low temperature tail.

I used the temperature data from climate divisions of the U.S. mainland (USA48) to estimate the variability in temperature by a similar (but not the same) method and got a different result, which caused me to wonder why. I might have figured it out, but I could be wrong.

There are three panels in figure 9; the other two show their result when using a baseline period other than the 1951-1980 standard. Here’s the third one:

Note that the most recent 11-year period no longer shows dramatically greater spread than the earliest 11-year periods. They ascribe the different result to this:

The effect of alternative base periods on the frequency of temperature anomalies is shown in Fig. 9. Use of a recent base period alters the appearance of the probability distribution function for temperature anomalies, because the frequency of occurrence is expressed in units of the standard deviation. Because climate variability increased in recent decades, and thus the standard deviation increased, if we use the most recent decades as base period we “divide out” the increased variability. Thus the distribution function using 1981-2010 as the base period (right graph in Fig. 9) does not expose the change toward increased climate variability.

I believe they’re mistaken.

Using a different scale factor (standard deviation) would indeed reduce the variability, but it should do so for all time periods equally. Hence the relative variability would be preserved — both early and late decades would show less variability, but late would still show more than early.
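
The arithmetic behind that claim can be checked with a small sketch. It assumes a single series divided by one common scale factor, and uses synthetic numbers (standard deviations of 1.0 and 1.3, scale factors of 1.0 and 1.2) that are purely illustrative; `spread_ratio` is a hypothetical helper name.

```python
import random
import statistics

random.seed(42)
# Synthetic data: an "early" period and a more variable "late" period.
early = [random.gauss(0.0, 1.0) for _ in range(1000)]
late = [random.gauss(0.0, 1.3) for _ in range(1000)]

def spread_ratio(early, late, scale):
    """Ratio of late to early standard deviation after dividing both by `scale`."""
    sd_early = statistics.pstdev([x / scale for x in early])
    sd_late = statistics.pstdev([x / scale for x in late])
    return sd_late / sd_early

# Two different (assumed) baseline standard deviations used as scale factors.
ratio_old_base = spread_ratio(early, late, scale=1.0)
ratio_new_base = spread_ratio(early, late, scale=1.2)

print(ratio_old_base, ratio_new_base)
```

Both ratios agree to within floating-point error: dividing by a larger standard deviation shrinks both periods by the same factor, so late still shows more spread than early.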

Why then the difference? I think it’s because the distributions are estimated from temperature records for many small regions covering a large area (in this case, northern hemisphere land), and different regions within that area have warmed differently. This can cause the combined distribution to widen even if no single region has experienced increased variability.

An example may illustrate. Suppose our area is composed of only two regions. During the baseline period their distributions will necessarily have the same mean value, since we’re using anomalies relative to that baseline. Suppose also that anomalies in both regions follow the normal distribution, and both have the same standard deviation. Then during the baseline period the distribution of the combined area will be that of either region.

Now suppose that one region warms by 1 standard deviation while the other does not (or more generally, that they simply warm by different amounts), but that there is no change in variability in either region. Due to the change in mean, their distributions are no longer the same — let me graph them like this:

Now let’s compare the distributions for the combined area before (in black) and after (in red) the change:

The combined-area distribution has widened considerably. But this isn’t because either region has increased temperature variability. It’s because for the baseline period both regions have the same mean value so their distributions coincide, but for the later time period they have different means from having warmed overall by different amounts. Combining the two identical distributions with different means yields a distribution with increased dispersion.
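
The two-region example can be put in numbers with the law of total variance: the variance of the combined area is the average within-region variance plus the variance of the regional means. The sketch below assumes equally weighted regions with unit standard deviation and a 1 standard deviation warming in one of them, as in the example; `pooled_sd` is a hypothetical helper name.

```python
import math

def pooled_sd(means, sds):
    """Standard deviation of an equal-weight mixture of regional distributions."""
    n = len(means)
    grand_mean = sum(means) / n
    within = sum(s * s for s in sds) / n                      # average local variance
    between = sum((m - grand_mean) ** 2 for m in means) / n   # variance of regional means
    return math.sqrt(within + between)

# Baseline period: both regions have mean 0 and standard deviation 1.
sd_before = pooled_sd([0.0, 0.0], [1.0, 1.0])
# Later period: one region has warmed by 1 standard deviation, local spread unchanged.
sd_after = pooled_sd([0.0, 1.0], [1.0, 1.0])

print(sd_before, sd_after)
```

The combined standard deviation rises from 1 to the square root of 1.25 (about 1.12) even though neither region's own standard deviation has changed.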

I think a better way to gauge the evolution of temperature variability (apart from average temperature, which we know has changed) would be to take each small-region record, and instead of just computing anomaly, actually de-trend the anomalies. I did this, using a modified lowess smooth for the de-trending so I could remove nonlinear trends. This truly isolates the variations from the trend. I then scaled the de-trended anomalies by the standard deviation for each month, since winter months show much higher standard deviation than summer months. This defines de-trended standardized anomaly, which I then studied to look for changes in variability over time.
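
For readers who want to try something similar, here is a rough sketch of the two steps. It is not the code used for this post: the smoother is a plain tricube-weighted local linear fit standing in for the modified lowess (whose details aren't given here), the input series is synthetic, and the names `lowess_trend` and `standardize_by_month` and the `frac=0.3` window are assumptions.

```python
import math
import random
import statistics

def lowess_trend(t, y, frac=0.3):
    """Plain lowess-style smooth: tricube-weighted local linear fit at each time.
    A stand-in only; no robustness iterations or other modifications."""
    n = len(t)
    k = min(n, max(3, math.ceil(frac * n)))  # number of neighbours per local fit
    trend = []
    for t0 in t:
        dists = sorted(abs(ti - t0) for ti in t)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        w = [max(0.0, 1.0 - (abs(ti - t0) / h) ** 3) ** 3 for ti in t]
        sw = sum(w)
        mt = sum(wi * ti for wi, ti in zip(w, t)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (ti - mt) ** 2 for wi, ti in zip(w, t))
        sxy = sum(wi * (ti - mt) * (yi - my) for wi, ti, yi in zip(w, t, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        trend.append(my + slope * (t0 - mt))
    return trend

def standardize_by_month(values, months):
    """Divide each value by the standard deviation of its calendar month."""
    sd = {m: statistics.pstdev([v for v, mm in zip(values, months) if mm == m])
          for m in set(months)}
    return [v / sd[m] for v, m in zip(values, months)]

# Synthetic monthly anomalies for one region: linear warming plus noise that is
# twice as large in three "winter" months (all values are made up).
random.seed(0)
n_years = 20
t = [yr + mo / 12.0 for yr in range(n_years) for mo in range(12)]
months = [mo for yr in range(n_years) for mo in range(12)]
y = [0.05 * ti + random.gauss(0.0, 2.0 if mo in (11, 0, 1) else 1.0)
     for ti, mo in zip(t, months)]

trend = lowess_trend(t, y)
resid = [yi - tr for yi, tr in zip(y, trend)]   # de-trended anomalies
z = standardize_by_month(resid, months)         # de-trended standardized anomalies
```

By construction, each calendar month of `z` has unit standard deviation over the whole record, so what remains to study is how the spread changes from one period to another.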

Subjecting the de-trended standardized anomalies for all 344 climate divisions in USA48 to the same calculation described in Hansen et al. gives this:

There’s no visible sign of any change in the amount of temperature variability from one 11-year period to any other. Also, there’s no issue about baseline period, since de-trending the divisional anomalies before standardizing and combining removes the influence of the choice of baseline period (but in case you’re interested, the baseline period I used was the entire time span 1895-2012.5).
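
The binning step from the Hansen et al. caption might be sketched as follows. This reads "0.05 frequency intervals" as bins 0.05 standard deviations wide, which is my interpretation rather than something the caption spells out; the input values are synthetic and `density_histogram` is a hypothetical helper name.

```python
import math
import random
from collections import Counter

def density_histogram(values, bin_width=0.05):
    """Return {left bin edge: density}, normalized so the total area is 1."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return {k * bin_width: c / (n * bin_width) for k, c in sorted(counts.items())}

# Synthetic standardized anomalies pooled over all regions for one 11-year period.
random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(5000)]

hist = density_histogram(sample)
area = sum(density * 0.05 for density in hist.values())
print(round(area, 6))
```

Running this once per 11-year period and overlaying the resulting curves gives the kind of comparison shown in the graphs.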

That doesn’t mean there’s no change at all in the variability of temperature for this area (USA48). In fact I think there are much better ways to search for variability change in these data than just estimating the probability functions for visual comparison. Also, the graphs from Hansen et al. are for individual seasons whereas my graph is for all months of the year (seasonal patterns are a subject I have yet to investigate). But this much seems clear to me: any variability change which may be present is far smaller than indicated by the analysis of Hansen et al.

As I said earlier, it’s possible I might be mistaken. I could have misinterpreted the procedure they followed. There could be some flaw in my own method of using de-trended standardized anomalies, of which I’m not aware. I haven’t yet studied the data by individual seasons. I sure think there’s a much better way to scan the data for signs of variability change, and to test whether such change is significant. But when it comes to increased variability in month-to-month average temperature for the USA48 area — I’m skeptical.


22 responses to “Increased Variability?”

  1. That’s because you’re a true skeptic, not a fake one, Tamino. Great job.

  2. Have you done a similar analysis for the global data ?

  3. Off-topic. Could you please comment on this recent paper? I hope I have not missed it, if you have already done so. Thank you! I’ve seen it on a couple of “skeptic” blogs, and was confused.

    [Response: This was discussed extensively at RealClimate (also in the comments). I don’t think I could add anything substantive to that discussion.]

  4. I’ve always been a little worried by all the focus on whether or not the distribution is wider for the same sorts of reasons–it’s very dependent on your choice of baseline. More importantly, you get more frequent “extreme” events simply by shifting the mean of the distribution without having to rely on proving a broader distribution.

    I eyeballed some simulated August temperature distribution to get a reasonable gamma distribution for daily max temperature. If you simply shift the mean by 1C (leaving other distribution parameters the same) you see a 140% increase in the probability of a daily high of at least the historical mean +5C (it’s even more with a normal distribution but those aren’t very reasonable).

    Also, I imagine that temperature distributions would widen to some extent because of climatic shifts, but I’d (very naively) guess that this would be because rainy/overcast days would give a longer low-temperature tail rather than a high-temperature tail. The little anecdotal evidence that I have seems to suggest that high temperature weather, at least in the summer when we’re worried about extreme high-temperature events, runs into some degree of negative feedback when hot air creates big storms.

  5. Your “last post” link links to this post, not to ‘craps’.

    [Response: Thanks, fixed.]

  6. K.a.r.S.t.e.N

    When I first saw Fig.4 in this draft paper, I simply couldn’t believe it. In other words, I was very skeptical about it too, but didn’t pay further attention as it hasn’t been through the peer review yet. If your seemingly plausible objections turn out to be correct (which – given your first two exemplary plots – seems quite likely to me), their main message becomes somewhat questionable (to put it mildly). Thanks Tamino for digging deeper!
    In any case, Jim Hansen has a short comment about the principal figure and the publication process online. I now wonder whether this issue is going to be picked up in the PNAS peer review process …

  7. I get how the different rates of temperature increase can cause the seeming change in variability, and have just completed the same in Excel (two full alphabets of “regions” over 100 years, at different rates – I used cumulative counts for stdevs away from the mean of a given 20 year period). What I don’t get is how Hansen et al’s figures were so different as they moved the baseline. Even with different warming rates, why should the final decade all of a sudden have “decreased variability?” Or, why are they no longer in order?

  8. Tamino, thank you for revisiting this topic. It is something I’ve wondered about more than a little since you argued that you could not detect increased variance in the US temperature record.

  9. Hansen, Sato & Ruedy are very explicit why they chose summer temperatures – it’s when most biological activity occurs, and there is less year-to-year variability in temperature. This is greatest in the cooler months in each hemisphere as heat is exchanged between the equatorial and mid latitude regions – changes in wind direction can force either warm (tropical) or cold (polar) air masses into mid-latitude regions, leading to large annual fluctuations over these cooler months. Increased solar radiation in each hemisphere during summer greatly reduces this latitudinal gradient. Moreover, examining summer temperatures, not winter ones, would be more useful in understanding if/why there has been such a large increase in record-breaking heat in the last few decades.

    This post would be vastly more interesting if you confine your analysis to the summer months – as did Hansen, Sato & Ruedy. Does an increase in summer temperature variability occur?

    • Michael Sweet

      Hansen claims that the standard deviation is less during the summer than during the rest of the year. Wouldn’t you expect that averaging the entire year, as you have done, might remove the signal that Hansen detected in the summer-only data? It seems to me that you are comparing apples and oranges, especially since Hansen used the entire northern hemisphere data and you used only the US.

      I am interested in your response. Your data analysis and discussion always are interesting to read and I learn a lot from your posts.

      [Response: Note that I re-scaled the data by the standard deviation for its individual month, thereby countering the effect of different variance during different months. Also, I didn’t average the entire year, I simply used all 12 months of each year. Still, year-round (rather than seasonal) data may indeed mask a change which is specific to one season (summer). Perhaps it’s comparing “granny smith” to “red delicious.”

      The fact remains that when different regions warm differently, the variance of the data will be the sum of the local variances and the inter-regional variance. I think the issue deserves further investigation.]

  10. Dick Veldkamp

    Thank you for a very interesting post!

    I have two questions:
    1. I may be mistaken, but it seems to me you are not doing exactly the same as James Hansen et al. You seem to be looking at the average 1-month temperature for many locations, while James Hansen is using the summer 3-month average? Would that change results?
    2. Did you get a reaction from James Hansen about this?

  11. In Hansen et al, fig 4 middle column uses “detrended sigma” — sigma after detrending the data. So they’re at least trying to avoid the problem you’ve identified. The three columns of fig 4 are largely just rescaled, as you mentioned we should expect.

    Your curves seem so perfectly identical to each other I fear you’ve accidentally defined all variability away.

  12. Halldór Björnsson

    If the enhanced spread in figure 9 was due to an overall increase in variability at most locations, then maps of the standard deviation for different periods should reveal that. Figure 2 in the paper shows such maps (though not for the same periods as figure 9). It would be nice to have more details, but I don’t see an obvious increase.

  13. If increased variability arises from combining distributions that have warmed differently, can we not argue the reverse and state that increased variability should be seen for different cooling trends? Using the most recent baseline and reversing time (effectively causing cooling trends) should by your argument have resulted in a broad distribution for the 1950 decade. But it doesn’t. Any idea why this is the case?

    I think your explanation is symmetric w.r.t. time, so it does not explain Hansen’s fig 9 right. I don’t quite follow Hansen’s explanation, but I think it is not symmetric w.r.t. time and hence could explain fig 9 right.

    • MDenison: I think you would be correct if the baseline period were a period at the end of the series when temperatures were stable – the inverse of the first graph. However, the baseline period for the odd graph is 1981-2010 – spanning the most rapid geographical changes – see

      Against that background, I would expect the curve for the middle of the baseline period – 1991-2000 – to show least variation. Instead it shows most. That’s very odd.

      However that period includes Pinatubo and an extreme El Nino. That may explain the higher variability. In which case baselining on 1999-2011 (I know, only 12 years) and comparing 2001-2010 to the 60’s or 70’s may provide weak evidence in support.

      I’ve got all the code to try this out, but no time at the moment. Frustrated.

  14. I agree with Rob Painting’s comment and would be interested to see your analysis:

    “This post would be vastly more interesting if you confine your analysis to the summer months – as did Hansen, Sato & Ruedy. Does an increase in summer temperature variability occur?”

  15. muoncounter

    Schar et al (2004) found that increased variability was necessary to explain anomalous summer heat – in their study, the 2003 European heat wave. Further, they noted that increased temperature variability was a predictable outcome of increased GHG concentration.
    We find that an event like that of summer 2003 is statistically extremely unlikely, even when the observed warming is taken into account. We propose that a regime with an increased variability of temperatures (in addition to increases in mean temperature) may be able to account for summer 2003. To test this proposal, we simulate possible future European climate with a regional climate model in a scenario with increased atmospheric greenhouse-gas concentrations, and find that temperature variability increases by up to 100%, with maximum changes in central and eastern Europe.
    But winters may catch up soon enough.

  16. Rattus Norvegicus

    Thought you might be interested in the Hansen paper that the Bunny highlighted over the weekend:

    1205276109.full.pdf

    It relates directly to the subject of this post and the next. Take a look at figure 4.