Extreme Heat

Climate Central has an interesting post about the extreme heat wave in Moscow this last July. They point out that if we assume the data are normally distributed, then the July 2010 average temperature anomaly value was more than 4 standard deviations above the July mean (and they have a lovely graph to emphasize it):

What’s the chance of such a deviation from the norm? For a normally distributed variable, they say, “This probability turns out to be on the order of a one and a half chance in 100,000 for the July anomaly.”
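For a sense of scale, here's a minimal sketch of the normal-distribution arithmetic behind that kind of statement. It assumes nothing beyond a standard normal variable; the function name is made up for illustration, and it isn't Climate Central's actual calculation. Exactly 4 standard deviations gives an upper-tail probability of roughly 3 in 100,000, so the quoted 1.5 in 100,000 corresponds to a value a little beyond 4 standard deviations.

```python
import math
from statistics import NormalDist


def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Upper-tail probability of an excursion of exactly 4 standard deviations.
p_four_sigma = normal_upper_tail(4.0)

# The z-score whose upper-tail probability equals the quoted 1.5 in 100,000.
z_quoted = NormalDist().inv_cdf(1.0 - 1.5e-5)

print(f"P(Z > 4) = {p_four_sigma:.2e}")
print(f"z for a 1.5e-5 upper tail = {z_quoted:.2f}")
```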

They only used data since 1950, but data are available for more than a century. Here’s the monthly anomaly data for Moscow (anomaly relative to the entire time span), for all months (not just July), together with 5-year averages (in red):

The extremely hot July is the spike furthest to the right. But it doesn’t seem to be exceptional at all. The appearance of not-that-extreme is because it happened in July, and the natural variation in July is less than the average throughout the year. This is clearly visible in a plot of temperature anomaly as a function of month:

It’s also visible in a boxplot by month, which shows that last July’s anomaly (the highest “dot” on the graph) was an outlier, but wouldn’t have been in a winter month:

The winter months show considerably more natural variation than the summer months, so an extreme value in July is more deviant from “normal” than the same extreme value in February. So, let’s look at last July’s Moscow temperature compared to previous Julys:
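To make the "same anomaly, different deviance" point concrete, here's a small sketch that scales an anomaly by the typical variation of its own month. The numbers are invented for illustration (they are not the Moscow record), and `scaled_anomaly` is a hypothetical helper name.

```python
from statistics import mean, stdev


def scaled_anomaly(value, history):
    """Anomaly of `value` relative to `history`, in units of that history's standard deviation."""
    return (value - mean(history)) / stdev(history)


# Hypothetical anomalies (deg C), invented for illustration only.
quiet_month = [-1.5, -0.5, 0.0, 0.5, 1.5]      # small year-to-year spread (summer-like)
variable_month = [-6.0, -2.0, 0.0, 2.0, 6.0]   # large year-to-year spread (winter-like)

same_excursion = 5.0
in_quiet_month = scaled_anomaly(same_excursion, quiet_month)
in_variable_month = scaled_anomaly(same_excursion, variable_month)

# The same +5 deg C excursion is far more unusual in the quiet month.
print(in_quiet_month, in_variable_month)
```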

Clearly last July’s Moscow heat (on the far right, circled in red) was exceptional. But was it a one-and-a-half-in-100000-years event?

Do Moscow July temperatures follow the normal distribution? If so, then a “quantile-quantile plot” (QQ-plot) should roughly follow a straight line — but it doesn’t:

There’s an upward bend on the right, indicating that extreme high temperatures are more common than with a normal distribution. Also, the Shapiro-Wilk test for normality resoundingly rejects the normal distribution. Even if we omit this last July, the bend in the QQ-plot is still there and the Shapiro-Wilk test still rejects the normal distribution.
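For readers who want to try this on their own data, here's a minimal sketch of how the points of a normal QQ-plot can be computed with the Python standard library. The sample is synthetic and deliberately right-skewed (it is not the Moscow data), and the plotting positions (i - 0.5)/n are one common convention, chosen here as an assumption. If SciPy is installed, `scipy.stats.shapiro` runs the Shapiro-Wilk test; it isn't used here so the sketch has no dependencies.

```python
import math
from statistics import NormalDist, mean, stdev


def normal_qq_points(data):
    """Return (theoretical, observed) pairs for a normal Q-Q plot.

    Observed values are standardized with the sample mean and standard
    deviation; theoretical quantiles use plotting positions (i - 0.5) / n.
    """
    n = len(data)
    m, s = mean(data), stdev(data)
    observed = sorted((x - m) / s for x in data)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Synthetic right-skewed sample (NOT the Moscow data): evenly spaced
# quantiles of a unit exponential distribution.
n = 120
sample = [-math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]
points = normal_qq_points(sample)

# For a right-skewed sample the largest observed value sits above the
# straight line y = x: the upward bend on the right of the plot.
top_theoretical, top_observed = points[-1]
print(top_theoretical, top_observed)
```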

Also, higher moments of the distribution indicate positive excess kurtosis — which can (but doesn’t necessarily) indicate a heavy-tailed distribution with more chance of extreme values — as well as positive skewness — which indicates that the right-hand tail (extreme highs rather than extreme lows) is more prominent than the left-hand tail. Both of which increase the chances of such an extremely hot July.
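Here's a minimal sketch of those two moment-based measures, using the simple (biased) moment estimators; other estimators apply small-sample corrections, so this is one choice among several. The example values are invented purely to show the sign conventions.

```python
from statistics import mean


def central_moment(data, k):
    """k-th central moment of the sample (divides by n, no bias correction)."""
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)


def skewness(data):
    """Positive when the right-hand tail is more prominent than the left."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Kurtosis minus 3, so a normal distribution scores zero."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0


# Invented examples: a symmetric sample, and one with a single hot outlier.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
one_hot_outlier = [-1.0, -0.5, 0.0, 0.5, 1.0, 6.0]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(one_hot_outlier), excess_kurtosis(one_hot_outlier))
```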

So the normal-distribution assumption is right out.

We’re really interested in the distribution of extreme values for July temperature. We can apply extreme value theory to approximate that distribution. Fundamental theorems in extreme value theory reveal that in well-behaved cases (which we expect) the distribution of extreme values will follow one of a small number of possible distributions. This is the extreme-value analogue of the central limit theorem.

In fact there are three possible limiting distributions for extreme values. Which one applies to a given situation depends on how the cumulative distribution function (cdf) approaches 1 as data values go to infinity. The cdf is the probability that a value will be less than or equal to a given value x:

F(x) = Prob(X \le x).

As x increases to infinity, the cdf must approach its limiting value of 1 because the probability of a data value being less than or equal to infinity is 1 (i.e., certainty).

A convenient way to study the asymptotic behavior of the cdf is by examining what’s called the survival function, which is just 1 minus the cdf:

S(x) = 1-F(x).

The distribution function F(x) asymptotically approaches one as the data value increases, so the survival function asymptotically approaches zero. It turns out that under most circumstances there are a limited number of ways to do that.
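In practice the survival function has to be estimated from data. A minimal sketch of the empirical version follows; the anomaly values are invented for illustration and `empirical_survival` is a hypothetical helper name.

```python
def empirical_survival(data, x):
    """Fraction of observations strictly greater than x: an estimate of S(x) = 1 - F(x)."""
    return sum(1 for v in data if v > x) / len(data)


# Hypothetical anomalies (deg C), invented for illustration only.
anomalies = [-2.1, -1.0, -0.4, 0.0, 0.3, 0.9, 1.6, 2.2, 3.5, 5.0]

# The estimate falls toward zero as the threshold rises.
for threshold in (0.0, 2.0, 4.0, 6.0):
    print(threshold, empirical_survival(anomalies, threshold))
```

How quickly those values drop toward zero at high thresholds is what distinguishes one kind of tail from another.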

I fit an extreme-value distribution to these data, and being conservative I find that the recent July heat wave is hardly a one-and-a-half-in-100000-years event; it’s really only a 1-in-260-years event. So in terms of its extreme deviation from “normal” July temperature, it’s not the extraordinary event some have suggested.

One of the most interesting facts is that, if not for global warming, this would have been an extraordinary July. That’s because global warming has increased the mean July temperature in Moscow, so a given deviation above the mean corresponds to a hotter temperature. Without global warming, this once-in-a-century-or-two event would have been closer to a once-in-a-millennium event.

Estimating extreme value distributions with only a little more than a century of data is imprecise; with little data, we have even less extreme data on which to base a model. In spite of this limitation, we can get a decent rough idea of the likelihood of extreme events. And the bottom line is that every degree Celsius increase in mean July temperature in Moscow roughly doubles the chances of any given extreme heat wave. In fact Moscow temperature has increased as much as 3 deg.C since the early 20th century, and according to the extreme-value approximation model I computed, this makes a given extreme 8 times more likely than before.
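The "8 times" figure is just the doubling rule applied three times. Here's a minimal sketch of that arithmetic, plus one inference that is an assumption rather than a figure from the fitted model: if the upper tail is exponential, so that the survival function falls off like exp(-x / scale), then "doubling per degree" corresponds to a scale of 1/ln(2), about 1.4 deg.C.

```python
import math


def likelihood_factor(warming_deg_c):
    """Multiplier on the chance of a fixed extreme if every degree of warming doubles it."""
    return 2.0 ** warming_deg_c


factor_for_3_deg = likelihood_factor(3.0)  # 2 * 2 * 2
print(factor_for_3_deg)

# ASSUMPTION (not from the post's fitted model): an exponential upper tail,
# S(x) ~ exp(-x / scale). Shifting the mean up by d then multiplies the
# exceedance probability by exp(d / scale), so doubling per degree implies:
implied_scale = 1.0 / math.log(2.0)
print(implied_scale, math.exp(1.0 / implied_scale))
```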

Without global warming, Moscow’s July 2010 would have been one for the history books. As global warming drives average temperatures even higher, present citizens of Moscow are likely to see multiple such events in a single lifetime. Which is scary.


34 responses to “Extreme Heat”

  1. Tamino, you should have the numbers handy, so

    Assuming a linear temperature increase for Moscow from 1975, what will be the likelihood (in the form of “1 in X” where X is 260 for 2010) of the July temp being repeated in 2020, 2050 and 2100?


    [Response: Rough estimates using the linear rate for Moscow July since 1975: in 2020 it’s 1 in 190, in 2050 it’s 1 in 75, in 2100 it’s 1 in 15 (likely enough that the extreme-value limit may no longer apply). If the warming is nonlinear, in particular if warming accelerates … it’s scary.]

  2. I enjoyed this but I am disappointed that you stopped “showing your work” when you got to the GEV analysis, which for me would be the most interesting part.

    Would you be able to make a version of your “normal Q-Q plot” based on the Gumbel reduced variate instead of the normal variate?

    [Response: I started intending to “show all work,” but the length of the post became as extreme as a Moscow July — and when I showed it to my wife she rolled her eyes and told me I’d lost her …

    For the Gumbel case, the extreme-value distribution is Gumbel but the high-value limit of the survival function is exponential. So if you make an exponential Q-Q plot the right-hand tail will be approximately linear. In fact the exponential Q-Q plot is an excellent exploratory tool for extreme-value behavior.]
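As an illustration of the exponential Q-Q plot described in the response above, here's a minimal standard-library sketch. The threshold, the plotting positions (i - 0.5)/n, and the data are all assumptions made for the example; the data are synthetic, built so the excesses over the threshold are exactly exponential with a scale of 2.

```python
import math


def exponential_qq_points(data, threshold):
    """(theoretical, observed) pairs for an exponential Q-Q plot of excesses over `threshold`."""
    excesses = sorted(x - threshold for x in data if x > threshold)
    n = len(excesses)
    theoretical = [-math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, excesses))


# Synthetic data (NOT the Moscow record): a few values below the threshold,
# plus 20 values whose excesses over 1.0 are unit-exponential quantiles times 2.
threshold = 1.0
tail = [threshold + 2.0 * -math.log(1.0 - (i - 0.5) / 20) for i in range(1, 21)]
data = [-1.0, 0.0, 0.5] + tail

points = exponential_qq_points(data, threshold)

# With an exponential tail the points fall on a straight line through the
# origin, and the slope of that line estimates the scale of the tail.
slopes = [observed / theoretical for theoretical, observed in points]
print(min(slopes), max(slopes))
```

With real data only the right-hand part of the plot is expected to straighten out, and the choice of threshold matters.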

  3. Philippe Chantreau

    Very interesting and informative. Thanks.

  4. Timothy Chase

    You can’t really “remove” the global warming signal from the Normal Q-Q plot (I would presume – then again, with you I wouldn’t be too surprised), but what happens to the plot if you remove the last 30 years, or in other words, where do the last 30-5 years (the modern period of global warming) show up in the distribution? Is it closer to a straight line if you remove the influence of the modern period upon the distribution?

    Alternatively, what happens if you first break up Moscow’s trend in global warming into linear segments then subtract the linear trend value from the actual values then recalculate the Normal Q-Q plot? Would this be any closer to a straight line? And if not, can you still say how probable July’s heatwave would have been in the absence of global warming?

    One other question: is 1/f noise a better model for Moscow’s temperature record?

    [Response: Actually I tried removing the approximate signal from the July-only data by using a lowess smooth, then studying the residuals. The bend in the normal QQ-plot persists, and the normal distribution is simply not applicable for extreme values.

    I haven’t examined the entire record, but for July-only the autocorrelation isn’t very strong, so a 1/f noise model doesn’t seem appropriate.]

  5. Jonathan Gilligan

    A related question, which I don’t know how to think about clearly: If we were talking about phenomena with very little spatial correlation, you’d also have to ask about framing the question correctly regarding what’s the sample from which you’re drawing: we’re only looking at Moscow ’cause that’s where the extreme temperature occurred, so should we be asking “what’s the recurrence time of this kind of temperature anomaly in Moscow” or “what’s the recurrence time of this kind of temperature anomaly anywhere in the world.”

    If we were looking at lotteries or something like that which is truly independent from one city to another, I’d know how to ask the question properly, but the heat wave was really part of a synoptic-scale event so I don’t know how to do it here. Any thoughts or pointers?

    [Response: This analysis is for Moscow only, for July only. Different locations (and different times of year) may have different warming trend rates and different extreme-value behavior, so these results don’t generalize to other locations around the globe.]

  6. Jonathan Gilligan

    Correction: I just asked, “what’s the sample from which you’re drawing?” I meant “what’s the population from which you’re drawing?”

  7. Tamino,
    Just wondering if you’ve looked at a Lognormal as a candidate. This would indeed be skewed right. In fact, any well known distribution with the possible exception of a Weibull or Normal would be skewed right. I also wonder how you feel about doing such an analysis with a nonstationary mean. It might be interesting to look at how standard deviation scales with temperature for bins of time.

    However, I applaud you for emphasizing that merely assuming Normality can lead to very misleading results. It also puts paid to the slander leveled by microWatts et al. that statistical analysis always exaggerates warming.

    [Response: Lognormal is certainly heavy-tailed (right). But its tail is “heavier-than-exponential” (HTE), whereas the data clearly indicate the tail of the Moscow July temperature isn’t. So I think a lognormal wouldn’t be appropriate.]

  8. typo: “Fundamental theorems in exreme value theory reveal “

  9. Thanks, Tamino. Quite clear.

    A slight additional typo in the second-last paragraph: “In spit of this limitation. . .”

  10. Well, I guess this event is significant, if it is happening at a large number of other places also.
    You can’t just cherry pick an extreme event and draw broad conclusions from it.
    The chance of winning a lottery may be one in ten million, but that does not mean that the person who wins has magical powers.
    According to normal distribution, no-one should win the lottery, still someone always does..

    [Response: Don’t be silly. A helluva lot of people play the lottery, so the chance of *someone* winning is way more than one in ten million.

    Nor did I draw a broad conclusion from an extreme event. I stated that this extreme event is what we expect to become far more common due to global warming — which is supported by a vast array of evidence. In fact such extremes have *already* become far more likely, due to global warming.]

    • björn — if you read for meaning beyond the first few paragraphs you might discover that Tamino points out that even in the absence of global warming this sort of event is actually much more common than people thought.

      Earlier people were saying 1 in 1000, then a more detailed analysis that nevertheless assumed a normal distribution came up with 1.5 in 100,000. To this Tamino says in essence, “Nah… more like 1 in 260. But given the global warming that we know is taking place these sorts of things are going to be much more common in the future.” If you have problems with that let me suggest that you dial 911 and tell them to hurry and to be sure that they bring the jaws-of-life.

    • “According to normal distribution, no-one should win the lottery, still someone always does..”
      EPIC FAIL.

      “Well, I guess this event is significant, if it is happening at a large number of other places also.”
      Remember the 2009 Australian heatwave? The 2003 European heatwave? The 2007 Asian heatwave? And those are just a few of the really serious ones. There have been a huge number of heatwaves already this century.

    • “Response: Don’t be silly. A helluva lot of people play the lottery, so the chance of *someone* winning is way more than one in ten million.”

      As others have asked what is the size of the population that this heat-wave is drawn from? This is highly significant to what conclusions can be made.
      Is the population:
      1 – Moscow in July 2010,
      100 – Moscow in July last 100 years
      1200 – Moscow in any month last 100 years
      120000 – 100 largest cities in any month last 100 years

      Post ante selection of data and statistical analysis is fraught with danger. One of the main problems is “Texas Sharp-Shooting”

      [Response: If you want to know what sample is the basis for this analysis, re-read the post. It’s clear.

      For those who are interested, July 2010 wasn’t just the most extreme hot July in the over-a-century Moscow record, when anomalies are scaled by their month’s typical variation it’s the most extreme hot *month* in the century or more, by a large margin.

      Your implication about the sample size required to draw conclusions is mistaken. One hardly needs 100,000 data points to deduce that certain behavior is approximately 1-in-100,000 — the whole point of extreme value theory is to make inferences about rare events without requiring an obscenely large sample.

      And your reference to the “Texas sharp-shooter” fallacy makes no sense at all.]

      • I must have not made myself clear. The population size matters for drawing any other conclusions other than estimating the probability of the event occurring. The population size matters for drawing conclusions from the “1 in 260 years” estimation not in making this estimation.

        To continue with the lottery analogy it is like looking at the winner and drawing any other conclusion other than that particular person had a 1 in 10 million chance of winning. That the estimated probability is so extreme in this case is unsurprising because we have selected the sample after the fact from a huge population (“A helluva lot of people”).

        For an example of the Texas Sharpshooter fallacy look to your first graph that highlights the Years 1972 and 2001. This is very similar to the common example given for this fallacy – circling cancer clusters, which are usually just artifacts of random data, after the fact.

      • Pete, what you are ignoring is that while we are looking at data for Moscow itself, this is actually a proxy for a much larger region that experienced an extreme heat wave. Even if you gridded the globe in blocks that size, the probability of such an event is quite low. And in any case, Tamino’s point is the change in the probability, not its absolute value.

      • How is highlighting the top outliers an example of the Texas sharpshooter fallacy? They are just outliers. They are clear before or after “the fact”, whatever that is.

        Tamino then goes on to show that they are not “just artifacts of random data”.

        Of course the scope is relevant. But Tamino made it perfectly clear what he was talking about. If you want to extrapolate, you will have a lot of work to do first.

        Don’t let us stop you. There’s plenty of other heatwave data available. Expand Tamino’s exercise, and tell us how heatwaves are changing globally.

      • “How is highlighting the top outliers an example of the Texas sharpshooter fallacy? They are just outliers. They are clear before or after “the fact”, whatever that is.”

        The problem is deciding which outliers to highlight occurred after looking at the data. Why choose 3 outliers to highlight, why not 2 or 4 or 5 or 10? Would the outliers have been highlighted if they were from early in the 20th century? Why not highlight left tail outliers? Sure there might be reasonable answers to any or all of these questions but they will just be post-hoc rationalizations.

        [Response: As for why to highlight years other than 2010, you’ll have to ask Climate Central — it’s their graph. I used theirs (rather than making a similar one of my own) because I liked their use of colors.]

      • Pete, The Texas sharp-shooter fallacy is essentially a problem of sampling. However, when one is dealing with extreme events, one does not in fact have a choice. They come when they come, and all you can do is try to figure out what they tell you about the distribution. To do that, you need to look at the distribution you think is the parent distribution–or at least the other samples from that distribution. This is what Tamino did.

      • Pete, Tamino didn’t actually use the highlighted years as the basis of his analysis. Your whole complaint is just silly.

        You have no idea when they decided what to highlight. To me, it looks like outliers above 2 sigma (and below -2 sigma [empty set]). Nothing exciting about that, is there?

        But no matter what you think, the whole issue is uninteresting and irrelevant, so please drop it already.

    • Gavin's Pussycat

      Actually it’s not just Moscow… it was most of European Russia (and Finland was along too I can tell you). Some would even include Pakistan…

  11. So if I have this right if Moscow warms up another 3 degrees the frequency becomes roughly a once-in-15 year event. That’s not good. And the once-in-50 year events at that point don’t bear thinking about either.

    [Response: I based my projection on the warming rate for Moscow July temperature since 1975 (as per the question), which is 7.6 deg.C/century. But there’s large uncertainty in that figure. Even so, it’s not gonna be good.]

  12. Sorry about the “Earth” sig above — had in mind a joke of sorts, not here though.

  13. What would happen if you used this on, say, the cold snap they had in South America? Just a thought, to even out the overall temp.

    [Response: I never heard any South American meteorological officials describe the cold snap as a “once in a thousand years event,” nor have I seen any analysis indicating it’s a 1.5-in-100,000-years event. Do you have data indicating such conditions, or did you mention a “cold snap” just because you want to be contrary?

    “Cold snap” happens all the time — just because climate changes doesn’t mean we won’t still have weather. My guess: the South American “cold snap” was nothing at all like the extremity of the Moscow heat wave.

    Please no “snappy retort.” If you have data, put up — otherwise shut up.]

  14. I think this may be to what he’s referring:


    [Response: With such choice phrases as:

    In the Chilean district of Aysen, the snowstorm was said to be worst in 30 years

    Mendoza, a region known for its wine, not snow, had snow said to be the heaviest in a decade.

    one town of Tucuman had snow for the first time since 1921.

    On July 16, the cold blast brought Buenos Aires its lowest temperature, -1.5 C, or 29 F, since 1991.

    It sounds like it’s not even close to as extreme as the Moscow heat wave.]

  15. It’s interesting that there was an anomaly in terms of circulation:

    “The cold front that ushered the cold northward did something few such fronts ever do: it slipped across the Equator with a noticeable drop in temperature in southeastern Colombia and northwestern Brazil before dissipating amidst tropical warmth.”

    In terms of the cherry pick involved, at the time of the cold outbreak this was one of the few spots on the globe where you could find a cold anomaly, IIRC.

  16. Tamino, I apologise if you’ve covered this or if it’s come up in another post, but I am fatigued after a night arguing with HIV denialists and my eyes and brain won’t behave themselves!

    I’m interested in the sharp angle in the Q-Q plot. If you look at the points above the bend, is there any temporal pattern to their occurrence, or do they occur randomly throughout the dataset?

  17. Really great post, Tamino, thanks.

  18. Steve O'Connor

    Really nice and clear, thank you.
    I was looking for one of your older posts to reread (‘how long’) but can’t find it. Will it be reposted in the future?

  19. Hi Tamino,

    Sorry for the off-topic question. Are you aware of updated data on Antarctic mass balance? There is an update on Greenland mass balance (through August 2010) on Skeptical Science. I passed it to Jim Hansen and he would be interested in an update for Antarctica.


  20. Thanks, very useful post.

    In the beginning you showed all the steps, but how exactly did you derive the probability of occurrence in the end?

    [Response: The limiting distribution of values over a high threshold will approach a “generalized Pareto distribution”. By using values over a selected threshold, one can estimate the parameters of that distribution.]
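To illustrate the threshold approach described in the response above, here's a minimal sketch of how a return period could follow from a fitted tail. It is not the calculation behind the 1-in-260 figure: the data and threshold are invented, the helper names are hypothetical, and the fit uses the simplest special case of the generalized Pareto distribution (shape 0, an exponential tail, where the maximum-likelihood scale is just the mean excess). The survival function itself accepts a nonzero shape for anyone who fits one by other means.

```python
import math


def gpd_survival(excess, scale, shape=0.0):
    """Survival function of the generalized Pareto distribution at `excess` >= 0."""
    if shape == 0.0:
        return math.exp(-excess / scale)
    base = 1.0 + shape * excess / scale
    # For negative shape the distribution has a finite upper end point.
    return base ** (-1.0 / shape) if base > 0.0 else 0.0


def fit_exponential_tail(data, threshold):
    """Return (exceedance_fraction, scale), with scale = mean excess (MLE when shape = 0)."""
    excesses = [x - threshold for x in data if x > threshold]
    if not excesses:
        raise ValueError("no values above threshold")
    return len(excesses) / len(data), sum(excesses) / len(excesses)


def return_period(level, threshold, exceedance_fraction, scale, shape=0.0):
    """Average number of observations between exceedances of `level` (years, for one value per year)."""
    p = exceedance_fraction * gpd_survival(level - threshold, scale, shape)
    return math.inf if p == 0.0 else 1.0 / p


# Hypothetical July anomalies (deg C), invented for illustration only.
july = [-2.0, -1.2, -0.8, -0.3, 0.0, 0.2, 0.6, 1.0, 1.5, 2.0, 2.5, 3.5]
fraction, scale = fit_exponential_tail(july, threshold=1.0)
period = return_period(6.0, 1.0, fraction, scale)
print(fraction, scale, period)
```

With so few points above the threshold the estimate is very rough, and the answer is sensitive to the threshold chosen.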

  21. Fielding Mellish

    “In the Chilean district of Aysen, the snowstorm was said to be worst in 30 years…Mendoza, a region known for its wine, not snow, had snow said to be the heaviest in a decade…one town of Tucuman had snow for the first time since 1921.”

    This sleight-of-hand reporting seems most of the time to dupe the weather knowledge-impaired into inferring a deep freeze. HEY SHEEPLE–SNOW IS NOT THE SAME AS BITTER COLD! I heard the same ignorance in relation to the unusual snow in north Texas in early 2010. For as long as temperature records have existed there, the region has experienced sufficiently cold winter temperatures to support precipitation in a solid state. What they usually don’t get is the confluence of humidity and barometric pressure conditions to match. Nonetheless, to the average dolt, white means “arctic,” and coincidentally the total absence of AGW–which probably explains how the petrolocrats came to rule the state. “SAID to be the worst?” That weighty evidence leads me to remark that every single lingerie model I ever dated said I was the best thing walking the planet. Nevermind that my dog can count to that lofty number, hehe.

  22. The driver of the event was the jet stream. The jet stream is powered by a heat engine that uses the Arctic as a heat sink. The Arctic heat sink got a redesign and rebuild from 1997 to 2007.

    It has taken a couple of years for the effects of that rebuild to work through the system, but we can see it in the July, 2010 temps for Moscow. Other special effects were (3) days (in 3 different months) when Nuuk, Greenland was warmer than San Francisco, CA and NYC got extra snow.

    I would say that under current Arctic sea ice/jet stream conditions, it is as likely for Russia to get two consecutive extra hot years as it is for NYC to get two consecutive extra snowy years. Climate statistics is not much good when the sea ice is changing and forcing the whole NH circulation pattern.

    Welcome to a new chapter in Earth’s climate. It will be a short chapter. We are about ready for another rebuild of the Arctic heat sink, and then we can start another chapter. I grew up in Cheyenne, Wyoming, and we had changeable weather, but nothing like what I expect to see in future.

  23. Pingback: NOAA: Monster crop-destroying Russian heat wave to be once-in-a-decade event by 2060s (or sooner) – Exclusive: NCAR’s Trenberth challenges the attribution analysis, “Many statements are not justified and are actually irresponsible.”

  24. Nice new paper on the 2010 heat wave…

    “The Hot Summer of 2010: Redrawing the Temperature Record Map of Europe”

    “The summer of 2010 was exceptionally warm in eastern Europe and large parts of Russia. We provide evidence that the anomalous 2010 warmth that caused adverse impacts exceeded the amplitude and spatial extent of the previous hottest summer of 2003. ‘Mega-heatwaves’ such as the 2003 and 2010 events broke the 500-yr long seasonal temperature records over approximately 50% of Europe. According to regional multi-model experiments, the probability of a summer experiencing ‘mega-heatwaves’ will increase by a factor of 5 to 10 within the next 40 years. However, the magnitude of the 2010 event was so extreme that despite this increase, the occurrence of an analogue over the same region remains fairly unlikely until the second half of the 21st century.”