I was wrong

Not long ago I posted about results from Prashant Sardeshmukh of a study he and others did regarding the frequency of hot days, based on reanalysis data.

I considered the results so implausible that I concluded they had made a “rookie mistake” in using a different cutoff limit for the two time spans they compared, a mistake which would completely invalidate their results.

I haven’t seen the data, but I’ve heard enough testimony from credible sources that Sardeshmukh et al. are both honest and competent. I am now convinced that the mistake was mine.

I was wrong.

I’ll also take this opportunity to apologize for any and all unnecessarily snarky content. I’m OK with being snarky on one’s own blog, but if you’re going to do that then get the facts right first. I didn’t.

45 responses to “I was wrong”

  1. Martin Smith

    Do you know what your error was yet?

    [Response: The stated result was extremely implausible. But not impossible. I shouldn’t have made the leap from “extremely implausible” to “wrong” without first acquiring the data and finding out. It’s not like I’m unable to do that.]

  2. Bravo, Tamino, for being forthright about this!

    And thank you for continuing to do what you do.

    [Response: In my opinion, when someone admits a mistake, based on assuming without first investigating the data, that’s not the time to praise their forthrightness.]

    • arch stanton

      In my opinion, anytime a person admits a mistake and/or apologizes for unjustified snarkiness after careful consideration (one’s own blog or not) praise of their forthrightness is justified.

      Keep up the good work Tamino. Thanks for the years of helping me with my statistics homework, even if it was too late to help my grades.

      • Bernard from Australia

        I’ll second that – honest apologies without accompanying “…but…” clauses are rare enough that they should be treasured for what they are when encountered.

      • W Scott Lincoln

        How often do we see this kind of forthright-ness from some of the other climate science bloggers? You really don’t. This is how you tell the honest ones from the fake ones.

  3. Even the smartest and most honest can get caught up in lines of inquiry that are stupid.

    While it may be possible to find normal distributions of temperature in a time of global warming, I expect modern distributions of temperature to be fat tailed right — as in, we see more record high temperatures than record low temperatures. We see record precipitation events, but very few events of sustained exceptional cold. These days our “cold waves” are more about snow than cold.

    Sardeshmukh et al’s analysis is at odds with the number of recorded heat waves and recent record high temperatures.

    I expect that your initial instincts about problems with the reanalysis data will prove correct. It suggests that working with reanalysis data requires a careful review of assumptions.

    I had a chemistry professor who began every exam question, “Drawing on your knowledge of the world and chemistry, explain in excruciating detail . . .” His point was that all of chemistry (and physics) must be considered in the context of a defined and bounded system. In this case, there are several big black boxes in this system. Thus, the system is not defined and is not bounded. As long as they remain black boxes, we cannot be sure of what is going on.

    • That’s not at all what I would expect from ‘fat tailed right’, nor would I expect that from global warming.

      I would expect the distribution to become narrower with global warming, what with the idea of the largest effects being warmer winters and nights. I would not have expected the rate of heat waves to not increase. Or maybe I just don’t understand the point, as I’m just a spectator here.

  4. Does this mean there will be no effort to check with the same data?

    I still find the results “extremely implausible” myself and was looking forward to a check with the same data. I almost did a partial check myself but found that their source data is not available in a convenient format.

  5. The person who doesn’t make mistakes or admit to them will never learn anything.

    I’m sure someone must have said that once but if they haven’t, I’ll say it now.

  6. Do you still think there is likely a problem with the data used by Sardeshmukh et al (and hence with their conclusions), as suggested in your subsequent post (‘Actually..’)?

  7. Well done Tamino. Admitting to mistakes is a sign of strength not of weakness.

    tonyb

  8. I’m wrong every day, or so my wife tells me. Not to worry. :-)

    [edit]

    [Response: I agree with the rest of your comment, but this is not the time or place to criticize others.]

  9. It’s both important and ethical to admit error, apologize (if the words or tone used call for it), and get back to focusing on the problems at hand. This applies for fields outside of science, of course.

    [edit]

    [Response: I agree with the rest of your comment. But this is not the time or place for that discussion.]

  10. Everybody goofs sometimes. It’s best to apologize and move on, as you have done. Very good.

  11. Steven Mosher

    Good job. well done

  12. Horatio Algeranon

    I was wrong…once. :-)

  13. Martin Smith

    But is Prashant Sardeshmukh sending his data? Your explanation made sense to me, so I still wonder how he computed a 0.001 probability and you computed 0.014. If it’s just a math error, fine. Everybody does that.

    [Response: He is by no means obligated to do the work required to satisfy my data request, just to satisfy a critic.]

  14. Kind of been expecting a post like this. Prashant is one of the smartest people I’ve ever met in this field – the type of person that gives you imposter syndrome. I was pretty skeptical that he’d make such a simple mistake. Thanks for the update.

  15. Kudos. It still seems extremely implausible that increasing the mean and standard deviation somehow wouldn’t increase the probability of events above a fixed temperature threshold. If true, that seems to imply a bizarre change in the shape of the distribution which isn’t being captured by the first two statistical moments: mean and variance. It seems odd that 3rd order effects could consistently dominate 1st and 2nd order effects. Or am I misunderstanding something?

    [Response: I think you’ve got it. But in a global sample, the chance of finding at least a few locations with such unexpected behavior is much greater than for a single location. I’ve emphasized that often enough; I should have heeded my own advice.]

  16. I live in the Chicagoland area. We have had several of the “warmest” months/years/decades “ever” in the past 10-15 years. Yet, we rarely set daily high or daily high average temperature records recently (over about 140 years.) Surprisingly, we have (by memory here – somebody will probably prove me wrong, too) more “lowest” daily average temperatures throughout the year than “highest” average daily temps over this period. Granted, 140+ years is a small universe to generalize about and we measure our regional temperatures officially at a point within 15 miles of our natural “air conditioner” (Lake Michigan) but I am not surprised by the result. For one thing, we experience a continental climate most of the time (except when the wind is blowing “in” at Wrigley Field in the baseball season). This means winds from about 320 – 120 degrees are likely responsible for cooler warm weather “LDA” and warmer “HDA” in cooler weather – unless Lake Michigan freezes over in the wintertime at Chicago (like, the past two winters…) which were also among the 10 warmest years here.

    If you are still with me, local site conditions for measurement of temps are probably responsible for some of the differences noted (the historic temp measurement point for Chicago has moved inland over the period of record). And, most historic points of recording of local weather phenomena are located near large bodies of water (navigation needs dictated the necessity of observation points in ports and coastal regions over time and these historic sites tend to have a much longer record of observations than airport-based sites used more recently). These local conditions bias the observations in favor of fewer HDA and LDA temperatures – as I had hammered into me as a graduate student in environmental science many years ago.

    Could the reanalysis of the historical record have missed this (or many other…) subtle local site biases in the result? I’d bet an MS or PhD research topic (or two) against a brute-force statistical reanalysis of the entire record any day…

    [Response: This is a meaningful discussion of the subject, which deserves more meaningful discussion. But that’s not the purpose of this post.

    Rest assured, there’ll be plenty of opportunities for such discussion in future posts.]

  17. All credit for the admission of error. I also thought that it had to be a mistake, so I am genuinely interested in more detail. I’m still confused as to how the mean and standard deviation can increase without increasing the probability of extreme events; not because I think it can’t be the case, but because I’d like to understand how it can be the case.

    [Response: Me too. But that is the purpose of future posts, not of this one.]

  18. Tamino, you are still a class act.

  19. I attended a talk he gave recently that’s similar to what you’re discussing. One of his biggest emphases was that atmospheric parameters (850 hPa temperature, 250 mb vorticity, etc.) don’t follow a normal distribution at all but a distinctly heavy-tailed distribution, which changes the tails quite a bit from a normal distribution. He fitted the reanalysis data to Stochastically Generated Skewed (SGS) Distributions (never heard of them before myself, maybe tamino or others here have — he said they’re associated with linear Markov processes). A few in the audience objected to this, saying that using extreme value statistics was better. Sardeshmukh’s claim was SGS fitting had the benefit of using all the data. In that talk, he looked at reanalysis data from 1871-2011, comparing the first half of the time period with the second. For daily 850 mb temperatures, he found the standard deviation overall decreased somewhat and the shape changed while the mean increased.

    Hard to say much more, talks go fast. I’m curious for more detail as well.

    • arch stanton

      >>He fitted the reanalysis data to Stochastically Generated Skewed (SGS) Distributions (never heard of them before myself, maybe tamino or others here have — he said they’re associated with linear Markov processes)..<<

      +1 question imho

      Thanks for the help

    • Neil wrote:

      For daily 850 mb temperatures, he found the standard deviation overall decreased somewhat and the shape changed while the mean increased.

      One thought that occurs to me is that heat waves are made more extreme when soil is dried out, and this is something that we should expect at higher temperatures. If this is the case, then there will be less moist air convection. This will have the effect of insulating to some extent higher altitudes from the heat waves below.

      I still think that it is a mistake to focus on 850 mb as opposed to the surface, to use reanalysis data when direct measurements at the surface are available, and when studying heat waves, to focus on land plus ocean or on what corresponds to Northern Hemisphere winters. The Northern Hemisphere has the most land, and given the ocean’s thermal inertia, it has warmed the most. The positive feedback that results from soil drying out will be limited to the drier, hotter months.

      I realize Sardeshmukh is widely cited and well-respected, but even for someone with my quite limited background, his approach seems highly problematic.

      • Bernard from Australia

        I still think that it is a mistake to focus on 850 mb as opposed to the surface

        I’m going to go out on a limb of ignorance here, and suggest that’s not the case if the point of the study is to look at atmospheric dynamics, rather than surface conditions.

        Not knowing any more details of the study or its purpose, I await Tamino’s further articles on the topic. My scientific curiosity has been awakened!

      • I wouldn’t necessarily expect an increase of soil drying out. The reverse could happen; wetter summers could cancel the effect of an increasing mean, leading to no increase in heat waves. I think that might be the case for the Northeast US.

        And I think the interest is in atmospheric dynamics as well.

        Is the discrepancy with changes in standard deviation and mean not matching his result only in a few spots of the globe or lots of spots?

      • Horatio Algeranon

        It might be the case that there are processes happening at 850 mb (eg, low level jet streams) that impact the 850mb reanalysis data distribution more significantly than thermometric data at the surface.

        I don’t know anything about this stuff, but would assume that Sardeshmukh does.

      • Horatio Algeranon

        There is also apparently a low-level jet stream (~1.5 km high, which is right in the ballpark of the 850 mb level) over the Indian Ocean. The fact that it is most pronounced in summer months during the monsoon may be why Sardeshmukh chose the dead of winter for his analysis months (rather than summer, when the stream would be much more pronounced).

        It is known that jet streams can have a pronounced effect on temperature and pressure so it’s not unreasonable to assume that they might affect the distribution (specifically skew and kurtosis)

      • Neil writes:

        I wouldn’t necessarily expect an increase of soil drying out. The reverse could happen; wetter summers could cancel the effect of an increasing mean, leading to no increase in heat waves. I think that might be the case for the Northeast US.

        And I think the interest is in atmospheric dynamics as well.

        I don’t know the specifics of the Northeast US. However, as I understand it, land warms and cools more quickly than water. It has less thermal inertia. And this is true not simply of global warming but throughout the year. Consequently land will cool more quickly during the winter, warm more quickly during the summer.

        With more moisture in the atmosphere as the result of global warming, one may expect increased precipitation to fall during the winter, less precipitation to fall during the summer – the latter due to the drop in relative humidity as moist maritime air warms while moving inland. What precipitation does fall will be more concentrated along the coastlines. Furthermore, when rain falls there will tend to be fewer although more intense events. Thus you should expect intense flooding.

        Given the higher temperatures during the Spring, places that rely upon snowpacks during the dry, summer months will tend to dry out since those snowpacks will melt out early. Furthermore, even what moisture reaches a place during the hotter months will evaporate more quickly, and simultaneously, there will be greater demand for water by plants and animals due to the increase in summer temperature resulting in more evaporation from skin and during respiration by plants. Finally, precipitation will tend to be more concentrated along the coasts. Continental interiors will tend to dry out.

        Now I am not saying that these are rules that always apply. However, I believe these are general rules of thumb and largely a matter of basic thermodynamics. Furthermore, assuming land dries out, as one would tend to expect during a heat wave, this will result in less moist air convection, and as a consequence, less heat will be lost at the surface and transferred to higher altitudes.

        Given all this, if one wishes to study heat waves when and where heat waves will show a clearer trend, you should typically study them in the northern hemisphere (which has more land), during the northern hemisphere’s summer months, and at the surface. Especially if you wish to see a trend when it first begins to rise above the noise. This will be particularly important if you are focusing on identifying a trend in the rarer extreme weather events.

        Neil writes:

        Is the discrepancy with changes in standard deviation and mean not matching his result only in a few spots of the globe or lots of spots?

        I don’t know. I had assumed that he was studying something of more general interest than a few spots, assuming he was trying to demonstrate no increased tendency towards more extreme heat waves.

    • Horatio Algeranon

      That’s interesting..but raises another question.
      Does ground thermometer data show a similar skewing of the distribution?

  20. Best title of a blogpost in the history of the intertubes, ever.

  21. Ah, well, it can happen to the best of us. You are certainly not diminished by this in my eyes, Tamino.

  22. I understand things better with a good picture: is there any chance of having this result illustrated graphically? Something a bit like the IPCC figures showing consequences of change of mean and/or standard deviation for a gaussian, but in this case showing the changes in shape as well. That is, I’d like to see a graph of the two distributions, with mean, standard deviation and probability of exceeding the fixed threshold value all shown. If I understand this, for this study there’s a second distribution with different shape, greater mean, greater standard deviation and yet reduced area in the region above the chosen threshold.

  23. The right thing to do… and therefore appreciated.

    For the record.

  24. 1) I know, probably not the place to ask, and
    2) forgive my ignorance but this may be relevant, not to Tamino’s gracious apology but to this business of extremes.

    A couple of years ago I looked at the frequency of daily maxima in individual states and territories (there are 6 states and 2 territories) in instrumental data for Australia over the past 100 years.

    I noticed an upward trend in the frequency of maxima above arbitrary levels (e.g. 30C, 35C, anything) that was much more distinct in the south (e.g. Tasmania) but much less distinct in the tropical north, e.g. Northern Territory.

    Not being a scientist, I have no idea why this might be, or whether it’s a universal phenomenon. (I couldn’t find any reference to it anywhere.)

    Could it be that there’s something that suppresses extreme daily maxima as the mean rises, say increased cloudiness, or energy being used for evaporation, or generating turbulence rather than heating the actual air, or heat being transported more efficiently upward in hotter climates?

  25. Hey Tamino,

    You did what you had to do, and everyone involved seemed to handle it all well. We acknowledge Dr. Sardeshmukh’s cool response to the criticism posed at the Climate Etc site, so it’s in the past now.

    Looking forward, I would say a lot of good has come from your attempt to reverse engineer Sardeshmukh’s results. In particular I found the non-parametric procedure you gave in “Actually” using thermometer data for calculating the number of hot days to be very useful and informative, and it has led to interesting variations which I have tried on both ECA&D and USHCN station data. Also there are still a number of questions and implications from the “Something Ain’t Right” and “Actually” posts that need to be addressed and I look forward to discussions on those points.

    Finally and most importantly what gives this blog its character and vitality is that it never backs down from addressing or posing the difficult questions in climate science statistics. Don’t let this incident do anything to change that.

  26. Now, if only other commenters on climate science could be as open, honest and gracious.

    And yes, I’m talking about some who occasionally drive by here.

  27. It happens. We should all be warned by this about certain kinds of ‘overenthusiasm’ wrt hot subjects.
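
An illustrative aside on the question raised in comments 15, 17 and 22: how can the mean and the standard deviation both increase while the chance of exceeding a fixed threshold falls? The sketch below is a toy example only. It does not use the reanalysis data or the SGS distributions mentioned in comment 19; the shifted-exponential shapes, the parameter values and the function names are all assumptions, chosen so that the tail probabilities have a closed form. The early period is given a long right tail and the late period a long left tail, so the change in skewness outweighs the changes in the first two moments.

```python
import math

# Toy model (an assumption for illustration, not the distributions used in the study):
# E ~ Exponential(1) has mean 1 and standard deviation 1, so
#   X = mean + sd * (E - 1) is right-skewed with the given mean and sd,
#   Y = mean - sd * (E - 1) is left-skewed with the given mean and sd.

def exceed_right_skewed(threshold, mean, sd):
    """P(X > threshold) for the right-skewed variable X."""
    z = (threshold - mean) / sd + 1.0  # X > threshold  <=>  E > z
    return 1.0 if z <= 0 else math.exp(-z)

def exceed_left_skewed(threshold, mean, sd):
    """P(Y > threshold) for the left-skewed variable Y."""
    z = 1.0 - (threshold - mean) / sd  # Y > threshold  <=>  E < z
    return 0.0 if z <= 0 else 1.0 - math.exp(-z)

THRESHOLD = 3.0                    # fixed "hot day" cutoff, arbitrary units
EARLY = {"mean": 0.0, "sd": 1.0}   # hypothetical early period, right-skewed
LATE = {"mean": 0.5, "sd": 1.2}    # hypothetical late period, left-skewed

early_p = exceed_right_skewed(THRESHOLD, **EARLY)
late_p = exceed_left_skewed(THRESHOLD, **LATE)

print(f"early: mean={EARLY['mean']}, sd={EARLY['sd']}, P(>{THRESHOLD})={early_p:.4f}")
print(f"late:  mean={LATE['mean']}, sd={LATE['sd']}, P(>{THRESHOLD})={late_p:.4f}")
```

In this toy case the late period has the larger mean and the larger spread, yet its values can never exceed mean + sd = 1.7, so the probability of passing the cutoff at 3.0 drops from about 0.018 to zero. Real data would be far less clear-cut, and nothing here says which shapes the reanalysis data actually follow.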