Uncertain T

What’s the present trend in global surface air temperature? Good question.


We can estimate it from observed data, but there’s uncertainty in our estimate. Lots of uncertainty. One thing we’ll want to do is use recent data in order to get the recent trend, so let’s use data from some starting time to the present, and apply the best-known and most-used method, linear regression. The more recent the data on which we base our estimate (the later our starting time), the more it will be relevant to what’s happening now. But — the later our starting time, the less data we’ll have to work with so the more uncertain will be our estimate. That’s just the price we pay.

Let’s not cut corners on uncertainty. We’ll use the data from Cowtan & Way, and try all starting times from 1975 through 2005. We’ll estimate the standard error of our regression using the method of Foster & Rahmstorf (2011). We’ll also construct 95% confidence intervals, not using the “plus-or-minus two sigma” approximation which is rooted in the normal distribution, but using the t distribution because as we use smaller and smaller time spans the number of effective degrees of freedom decreases. This will give us larger uncertainty ranges (for 95% confidence intervals), as it should. Although for earlier start times the number of degrees of freedom will be large enough that the normal-distribution and t-distribution methods will be indistinguishably close, for later start times they will diverge.
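To give a feel for how much the choice of distribution matters, here is a minimal sketch in Python (standard library only). It computes the two-sided 95% critical value of the t distribution by numerical integration so it can be compared with the normal-distribution value of about 1.96. The function names are illustrative, and the degrees of freedom in the loop are arbitrary examples, not the effective values from this analysis.

```python
import math

def t_pdf(x, dof):
    """Density of Student's t distribution with `dof` degrees of freedom."""
    # lgamma avoids overflow for large dof.
    log_norm = math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
    norm = math.exp(log_norm) / math.sqrt(dof * math.pi)
    return norm * (1 + x * x / dof) ** (-(dof + 1) / 2)

def t_cdf(x, dof, steps=2000):
    """CDF for x >= 0: 0.5 (by symmetry) plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, dof) + t_pdf(x, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 + total * h / 3

def t_critical(dof, confidence=0.95):
    """Two-sided critical value, found by bisection on the CDF."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    for dof in (5, 10, 30, 100, 1000):
        print(dof, round(t_critical(dof), 3))
```

The critical value shrinks toward the normal value as the degrees of freedom grow, which is why the two methods agree for early start times and diverge for late ones.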

We can then plot, for each starting time, the estimated rate of warming together with its 95% confidence interval. And here it is:

[Figure: estimated rate of warming with its 95% confidence interval, for each starting time]

It should be borne in mind that the “raw statistical uncertainty” shown by the 95% confidence ranges is simply what meets the eye; there’s more uncertainty still. For instance, there are a lot of start years tested, so it’s far more than 5% likely that at least one of the given 95% ranges does not include the true value (see this). There is uncertainty in the data itself, both possible bias due to unknown factors and sheer uncertainty in the measured values even apart from bias. Therefore we should look upon the ranges shown in the graph as reliable but uncertain indicators of the true trend.
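As a rough illustration of the multiple-start-years point, the sketch below computes the chance that at least one of n intervals misses the true value if the intervals were independent. That independence is an assumption made only for the arithmetic: the start years here share most of their data, so the real figure is lower than this, though it cannot be below 5%. The function name is illustrative.

```python
def chance_of_at_least_one_miss(n_intervals, coverage=0.95):
    """Probability that at least one of n *independent* intervals misses."""
    return 1 - coverage ** n_intervals

if __name__ == "__main__":
    # 31 start years: 1975 through 2005 inclusive.
    print(round(chance_of_at_least_one_miss(31), 3))
```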

Yet despite those shortcomings, the given calculation includes enough consideration of uncertainty, treated in sufficiently rigorous fashion, to be a useful guide to what we can reliably say about the present trend in global average surface air temperature.

I would draw two main conclusions from this analysis. First, there really isn’t reliable evidence that the genuine trend (apart from the short-term fluctuations) is different from its value for 1975 to the present. Second, there really isn’t reliable evidence of a nonzero trend since 1997, in a purely statistical sense. These two conclusions might seem to some to be contradictory, but they’re not. Their difference serves to emphasize that for time spans less than about the 30 years which meteorology has settled on empirically, the uncertainty in trend estimates is big enough that we’re not able to distinguish between alternatives in a purely statistical sense.

One could argue that my last sentence isn’t right. After all, linear regression isn’t the last word in trend analysis, and one might make a strong case that a Bayesian approach would distinctly favor one conclusion over the other. But in my opinion, the impressive level of uncertainty in trend analysis over time spans of less than three decades or so is the real take-home message from careful study of the data.

That may make some people uneasy — that there’s no ironclad smack-down of either claim just from basic analysis of bare numbers. But again, that’s the price we pay for making an honest attempt (albeit a necessarily incomplete one) at actual rigor. It’s truly great that being a statistician allows me to play in everyone’s backyard — but it comes at the price of occasionally having to be the voice of sobriety who throws cold water on others’ heat of passion.

Finally, for those interested in what I consider a good visual portrayal of the situation, here’s my now-“standard” display of the trend from 1975 through the end of 1999, projected into the future, with dashed lines 1 and 2 standard deviations (of the residuals) above and below the projected trend, compared to the data from 2000 to the present:

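[Figure: trend from 1975 through 1999 projected forward, with dashed lines 1 and 2 standard deviations above and below, compared to the data from 2000 to the present]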

If that’s what you call a “pause,” then it’s not a very impressive one.
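For anyone who wants to build a similar display, here is a minimal sketch of the construction described above: fit a straight line to a baseline period, take the standard deviation of the residuals, and express each later value as a number of standard deviations from the projected line. The function names and the tiny data set are made up for illustration; this is not the code or the data behind the graph.

```python
import math

def fit_line(times, values):
    """Ordinary least-squares slope and intercept."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean

def residual_sd(times, values, slope, intercept):
    """Standard deviation of residuals about the fitted line (n - 2 dof)."""
    resid = [v - (intercept + slope * t) for t, v in zip(times, values)]
    return math.sqrt(sum(r * r for r in resid) / (len(resid) - 2))

def band_position(t, v, slope, intercept, sd):
    """Signed distance of (t, v) from the projected line, in residual SDs."""
    return (v - (intercept + slope * t)) / sd

if __name__ == "__main__":
    # Made-up numbers purely for illustration (not real temperature data).
    base_t = [0, 1, 2, 3, 4]
    base_v = [0.1, 0.9, 2.1, 2.9, 4.1]
    slope, intercept = fit_line(base_t, base_v)
    sd = residual_sd(base_t, base_v, slope, intercept)
    for t, v in [(5, 5.1), (6, 5.8)]:
        print(t, round(band_position(t, v, slope, intercept, sd), 2))
```

Points whose band position stays mostly within plus or minus 2 are behaving as the earlier trend plus noise would lead one to expect.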

74 responses to “Uncertain T”

  1. Does the final graph also use Cowtan and Way?

    [Response: Yes.]

  2. To anyone not taking time to understand what your first graph shows the superficial impression might be of a steady warming trend for the first 20 years then a decreasing trend for the last 15 years. I’d be interested to see what the same graph would look like but with just the first 20 years (trends up to 1995) shown.

  3. Horatio Algeranon

    “Heat of Pausion”
    — by Horatio Algeranon

    Statisticians
    Prone to caution
    Throwing ice
    On heat of pausion

  4. Tamino,
    Some of the uncertainty in the trend estimate from a given start date must be associated with uncertainty about the intercept, right? By ignoring all the data prior to a given start date X, aren’t you missing an opportunity to reduce the uncertainty about the intercept- and therefore the post X trend?
    If we are willing to impose a continuity constraint (i.e. no step changes), then I think we can rule out non-zero trends for start dates up to ~2005.

  5. Halldór Björnsson

    Tamino,
    The second plot is great.
    Could you redo it, with three steps:
    1. The data and trend lines up to 2000
    2. The continuation of the trend (solid and dashed)
    and
    3. The data since 2000 added (as plot above)

    This could then be shown in sequence in a presentation. Should make a nice visual story.

    [Response: Indeed it would. OK. No specific promise when, but it’ll happen.]

  6. I suppose the trend from the data set where the exogenous factors have been removed (El Nino, TSI, volcanos) is pretty close to the above trend line. Can you perform this same analysis on that data set or would it be inappropriate because it was already manipulated to establish a linear trend?

    [Response: It would be appropriate, and would also be valuable, but this post is specifically about the trend in the basic data.]

  7. I think a particularly revealing demonstration of the failure in logic with respect to the “pause” would be to do a second figure with an identical approach to the first, but this time vary the finishing time rather than the starting time. I suspect that the result would look very close to a mirror image of your existing first figure.

  8. >> confidence area
    > deviations (of the residuals) above and below

    So going toward more recent years, the confidence area spread gets wider but the deviation of the residuals stays the same — how does that work?

    [Response: The confidence interval applies to the trend, which necessarily widens as the number of data on which it’s based shrinks. But the range of individual values is dominated by the variance of the data (including the noise as well as the trend), so I’ve simplified by just plotting constant-sigma intervals.]
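The widening can be made explicit for the simplest case. Assuming ordinary least squares with uncorrelated noise of standard deviation σ (a simplification; the intervals in the post also allow for autocorrelation) and n evenly spaced points one time unit apart, the standard error of the fitted slope is:

```latex
\mathrm{SE}(\hat\beta)
  = \frac{\sigma}{\sqrt{\sum_{i=1}^{n}(t_i-\bar t)^2}}
  = \sigma\sqrt{\frac{12}{n(n^2-1)}}
```

So the uncertainty in the trend grows roughly as n^(-3/2) when n shrinks, while the scatter of individual values about the line stays near σ regardless of n.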

  9. I’ve already seen people mentioning a news article talking about “the pause” where it’s no longer “no warming since 1998” but instead “no warming in the last 10 years.” I guess we’ll have to start using 2003 as the cut-off in some of our plots. Then in 5-10 more years it will be “no warming since 2013” or something like that.

    • Most of those are taking the all time record temp as one data point and taking the most recent (and not record breaking, therefore by definition lower) temperature and saying “The temps today are lower than that year!” Ergo, cooling.

      Such deniers will shut up in any year where there is a record, therefore this “trick” (in the fraudulent sense of the word) doesn’t work, but they’ll come back in a few years and do it all over again.

  10. Here’s how even a dyed-in-the-wool frequentist can reject the hypothesis of no warming since 1998 (and indeed any year up to 2002).

    yearX=1998 #choose year when warming putatively “ended”
    timec=time-yearX #centre time on that year
    before=ifelse(timec>0,0,timec) #dummy variable for time before yearX
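    after=ifelse(timec<0,0,timec) #dummy variable for time after yearX
    mod=glm(y~before+after)
    summary(mod)

    Coefficients:
    Estimate Std. Error t value Pr(>|t|)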
    (Intercept) 0.355422 0.031340 11.341 9.57e-13 ***
    before 0.018065 0.003050 5.924 1.35e-06 ***
    after 0.016071 0.004018 4.000 0.000351 ***

    Yes, the ‘after’ slope is significantly different from zero!

    # Is the slope after yearX significantly different from the pre-yearX slope?
    mod=glm(y~timec+after)
    summary(mod)

    Coefficients:
    Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.355422 0.031340 11.341 9.57e-13 ***
    timec 0.018065 0.003050 5.924 1.35e-06 ***
    after -0.001994 0.006330 -0.315 0.755

    #No, the post 98 slope is not significantly different from the previous slope!
    #NB the ‘after’ coefficient is now the difference in trends

    logical consistency averted, smack-down achieved!

    • corrections:
      the first model is fit by:

      before=ifelse(timec>0,0,timec)
      after=ifelse(timec<0,0,timec)
      mod=glm(y~before+after)
      summary(mod)

      And the final sentence should read 'logical INconsistency averted..'
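For readers without R, the first model in this comment (separate slopes before and after a chosen year, joined continuously at that year) can be sketched in plain Python. The names `solve_linear` and `fit_hinge` are illustrative. The sketch returns point estimates only; it does not reproduce the standard errors or p-values that R's `summary` reports.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_hinge(times, values, break_time):
    """Least-squares [intercept, slope_before, slope_after], continuous at break_time."""
    rows = []
    for t in times:
        tc = t - break_time                # centre time on the break year
        rows.append([1.0, min(tc, 0.0), max(tc, 0.0)])  # 1, before, after
    k = 3
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve_linear(xtx, xty)
```

Calling `fit_hinge(years, temps, 1998)` on a list of years and matching temperatures returns the three coefficients in the same order as the R output: intercept, `before`, `after`.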

  11. dikranmarsupial

    Most skeptics don’t seem to really understand statistical hypothesis testing. An important concept that is often misunderstood is that if the trend fails to reach statistical significance and we are not able to reject the null hypothesis (H0), that doesn’t mean that we reject the research hypothesis (H1). This is because there are at least two reasons why the trend fails to reach statistical significance: (i) because H0 actually is true, and (ii) because H0 is false, but there isn’t enough information in the data to show sufficiently unambiguously that it is false.

    A consequence of this is that if we exchange the hypotheses, so that now we are arguing that there has been a change in the rate of warming (H1) then the null hypothesis (H0) is that warming has continued at the same rate. If we choose a short period (e.g. 17 years) it is quite likely that there wouldn’t be enough information to reject that null hypothesis either.

    We are then at the situation of having two hypothesis tests, one of which says we can’t rule out the possibility that there has been a pause in warming, the other says we can’t rule out the possibility that temperatures have continued to rise at the same rate as before. In other words, we just don’t have enough evidence in such a short window to provide solid evidence either way.

    Or of course, you could just look at the confidence interval and see that it easily covers both a zero recent trend (i.e. a “pause”) and a constant long term trend (a Bayesian credible interval would be better for purposes of interpretation, but would probably be numerically similar).
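The confidence-interval version of this argument is easy to check mechanically. The sketch below uses made-up numbers (a short-term estimate of 0.010 with standard error 0.008, and a long-term rate of 0.017, all in deg C/year) purely to illustrate the logic; they are not results from the post, and the critical value of 2 is the rough normal approximation.

```python
def interval(estimate, std_error, critical=2.0):
    """Approximate confidence interval: estimate +/- critical * std_error."""
    half = critical * std_error
    return estimate - half, estimate + half

def cannot_reject(null_value, estimate, std_error, critical=2.0):
    """True if the hypothesised value lies inside the interval."""
    lo, hi = interval(estimate, std_error, critical)
    return lo <= null_value <= hi

if __name__ == "__main__":
    est, se = 0.010, 0.008          # hypothetical short-term trend and SE
    print(cannot_reject(0.0, est, se))    # null: warming has stopped
    print(cannot_reject(0.017, est, se))  # null: warming continued as before
```

When both calls return True, the data in that window cannot distinguish between the two hypotheses.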

    The real problem is that those with an agenda tend to interpret statistics in the way that supports their argument, while the scientist uses them as a sanity check that is a hurdle for their argument. This is demonstrated by the fact that the skeptics never seem to adopt a test where the null hypothesis (which is implicitly assumed to be correct – the test sees whether this assumption can be rejected) is that warming has stopped.

  12. dikranmarsupial

    oops the last line should of course read “is that warming has continued at the same rate as before”.

  13. Hi Tamino,

    The Premium Bond Probability Calculator (OT)

    I came across this the other day. The reference to “post-doctoral cosmology statistician” made me think of you, or if not, someone you would probably know. Am I on the right track?

    The Premium Bond Probability Calculator
    To accurately calculate the odds, you need to use something called “multinomial probability”. After all, to work out the chance of someone winning £200 a year, they could win 2 x £100, 8 x £25, 4 x £50, or a host of other variants. This multitude of probabilities means accurate calculation is hellish.
    A few years ago I set myself a challenge to do it. I failed. I got one of my team with a top maths degree to try. He failed. We contacted an LSE Professor of Financial Mathematics; she knew how to work it out, but she needed a specialist to do it for her.
    Eventually we tracked down a post-doctoral cosmology statistician (someone who calculates star movements) who had the requisite skills, and he wrote us an algorithm to build PremiumBondCalculator.com. This allows you to plug in how many bonds you have, and it will predict your likely winnings. It proves that at every value someone with typical luck will earn less than the quoted prize rate.
    (Martin Lewis, moneysavingexpert.com)

    [Response: I’m skeptical. For one thing, cosmology is not the study of star movements. For another thing, I’ve never heard of anything even close to a “post-doctoral cosmology statistician.”]

    • Horatio Algeranon

      Speaking of tracking star movements

      The Barry Bond Probability Calculator

      This allows you to plug in how many steroids you have, and it will predict your likely winnings. It proves that at every value someone with typical playing ability will earn less than the quoted prize rate.

  14. would you explain/compare the differences btw the Cowtan & Way graph here and the Cowtan & Way one in your Post 1998 Surprise post on 30 Jan?

    [Response: That one was based on projecting the pre-1998 trend into the future, to illustrate the folly of the “no warming since 1998” idea. This is based on projecting the pre-2000 trend, since 2000 is a nice round number.]

  15. Tamino,

    Thanks for the explanation. I just found it interesting that changing the ‘pre-‘ boundary by two years (98 to 00), changes the ‘post-‘ points ratio above-to-below the trend line so significantly.

  16. “for time spans less than about the 30 years which meteorology has settled on empirically, the uncertainty in trend estimates is big enough that we’re not able to distinguish between alternatives in a purely statistical sense.”

    Can you give a reference to what you are calling “empirically” settled from meteorology or climatology? I had thought the technique was similar to this, but I may be confusing memories of what I have read.

    • Please don’t say “empirically”! Fake skeptics love to talk about “empirical evidence”. Of course they reserve the right to ignore evidence they don’t like…

      • Take it up with the Big Guy; I was just quoting him. And asking what exactly it referred to– I’ve always thought the 30 years came from a similar statistical analysis.

    • Method 1:

      Take the annual readings. Take the interannual differences. Average them. Compare that to the trend you wish to discriminate with your method. Remember that the average difference will affect the change based on the square root of the number of samples (in this case, years).

      [Response: The average difference depends only on the very first and very last values. As such it’s far less than optimal.]

      Turns out generally 20 years to discriminate a trend of ~0.015C/year

      Method 2:

      Discover the longest reasonable period for any known cyclic periodicities. Double it. Averaging over two cycles will cancel out most of the false trend introduced by selecting extremes of one cycle.

      Turns out generally 30 years includes most of the decadal oscillations twice or more.
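The moderator's objection to Method 1 above can be written out in one line. For annual values x_1, ..., x_n, the interannual differences telescope, so their average uses no information from the years in between:

```latex
\frac{1}{n-1}\sum_{i=1}^{n-1}\left(x_{i+1}-x_i\right) = \frac{x_n - x_1}{n-1}
```

A least-squares trend, by contrast, uses every value in the series.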

      • re: response, yes, the method is extremely crude, but it gives *an* empirical method of determining a useful climate definition.

      • Yes, that’s the analysis I was referring to that I had seen previously.
        I thought perhaps there was something different I wasn’t aware of because of the term “empirically”.

      • Barton,
        I don’t follow the logic of your demonstration… if there is a trend in the data, then the standard deviation will increase with the length of time- it will never stabilise (to see this, start at the beginning of the time series and incrementally add data, rather than going backwards as you’ve done). Obviously that doesn’t mean you can never detect a trend.

        Statistically, the length of time needed to detect a trend depends on the magnitude of the trend and the amount of noise in the data. I don’t think there are any purely statistical grounds for making a blanket statement that “you need X years of data to detect a trend”….X will always depend on the magnitude of the trend you are trying to detect.

        Of course if there are known cycles, then there would be reason to specify some minimum time period required to be sure what you are detecting is not just part of a cycle.
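The dependence on trend size and noise level can be sketched numerically. The code below assumes annual data, uncorrelated noise, and a rough threshold of 2 standard errors; real temperature series are autocorrelated, so actual requirements are longer. The function names are illustrative.

```python
import math

def slope_std_error(n_years, noise_sd):
    """OLS slope standard error for n evenly spaced annual values, white noise."""
    return noise_sd * math.sqrt(12.0 / (n_years * (n_years ** 2 - 1)))

def years_to_detect(trend, noise_sd, critical=2.0, max_years=200):
    """Smallest n with |trend| / SE >= critical, or None if never reached."""
    for n in range(3, max_years + 1):
        if abs(trend) / slope_std_error(n, noise_sd) >= critical:
            return n
    return None

if __name__ == "__main__":
    # Hypothetical inputs: trend in deg C/year, noise SD in deg C.
    for trend, noise in [(0.017, 0.1), (0.017, 0.2), (0.0085, 0.1)]:
        print(trend, noise, years_to_detect(trend, noise))
```

Halving the trend or doubling the noise has the same effect on the required length, since only their ratio enters.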

  17. mgardner, “empirically” for meteorology means by looking at the actual data.

    For annual data, there is always some up and down that’s just “noise” from year to year — but when there’s no change happening other than the noise, over enough time, that averages out to zero.

    How much time? When you get one data point each year, for example:

    The less noise, the fewer annual data points you need to say statistically whether there’s likely a longterm change hiding in the up-and-down.

    The more noise, the more annual data points you need to say whether there’s likely a longterm change.

    Robert Grumbine worked through this at a level intended for high-school students: check his topics: https://www.google.com/search?q=“Robert+Grumbine”+”trends”

    • “mgardner, “empirically” for meteorology means by looking at the actual data.”
      Are you saying Tamino is making stuff up?

      But seriously, I just thought there might be some physical phenomenon that I had missed, because of the term “empirically”.

      • “Are you saying Tamino is making stuff up?”

        No.

        Are you reading from a different universe?

      • mgardner, please also note how a question much like yours addressed to Hank when addressed to you HAS ABSOLUTELY NO UTILITY AT ALL.

        Please remember this next time you wish to ask a ridiculous rhetorical question in place of asking for enlightenment.

  18. Now that I’m confident I didn’t miss something, I’d like some feedback on this statement:
    The 17 year surface temp data is not significant with respect to the question of whether CO2 is continuing to add energy to the climate system.
    but
    It *is* significant with respect to the question of whether Ocean Heat Content has been increasing over that same time period.
    ?

    • Some feedback for you on the statement: it’s grammatical nonsense.

      Please rephrase it without the mental leaps between “with respect to” and “the question of”.

      • Wow,

        That’s the kind of silly response I expect from a Denialist. What exactly has you so upset?
        If you don’t understand my question, explain what it is you don’t understand.

        As far as I know, what I said is common usage. “A isn’t useful when answering B, but it is useful answering C.”
        Is it too hard a question for you?

      • dikranmarsupial

        I agree with Wow that the statement as written is at best grammatically ambiguous, as it is not clear what the “It” at the start of the second sentence refers to. The most natural grammatical interpretation would be that “it” refers to the “17 year surface temp data [not being] significant”; however, that doesn’t make scientific sense with respect to whether OHC is increasing, so I rather doubt it means that. The most sensible scientific interpretation is that the first sentence is about whether GMST trends are statistically significant and the second is about whether OHC trends are statistically significant.

    • ???

      Here are the last 17 years of GISS data aggregated annually. There certainly seems to be a trend there (ignoring any autocorrelation).

      Y
      [1] 0.36 0.62 0.41 0.41 0.53 0.62 0.61 0.52 0.66 0.60 0.63
      [12] 0.49 0.60 0.67 0.55 0.58 0.61
      D
      [1] 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007
      [12] 2008 2009 2010 2011 2012 2013
      fit=lm(Y~D)
      summary(fit)

      Call:
      lm(formula = Y ~ D)

      Residuals:
      Min 1Q Median 3Q Max
      -0.12118 -0.06397 0.00500 0.06552 0.12934

      Coefficients:
      Estimate Std. Error t value Pr(>|t|)
      (Intercept) -18.460956 8.041489 -2.296 0.0365 *
      D 0.009485 0.004011 2.365 0.0319 *

      Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

      Residual standard error: 0.08101 on 15 degrees of freedom
      Multiple R-squared: 0.2716, Adjusted R-squared: 0.223
      F-statistic: 5.593 on 1 and 15 DF, p-value: 0.03193

      • Perhaps I’m misinterpreting what Tamino said originally:

        “I would draw two main conclusions from this analysis. First, there really isn’t reliable evidence that the genuine trend (apart from the short-term fluctuations) is different from its value for 1975 to the present. Second, there really isn’t reliable evidence of a nonzero trend since 1997, in a purely statistical sense.”
        (Which is what I’ve understood all along.)

        If you disagree with Tamino, please take it up with him.

        My question is whether I can apply that data set from 1997 to the question of whether there has been an increase in OHC. The (ARGO) OHC data is of a similar duration if not exactly the same. So, whether there is *no* evidence of a non-zero trend (Tamino), or even *some* trend (Garland), I think I should be able to use it. I’m asking people who have far more expertise in statistics than I do if they agree. Why is this so confusing?

        [Response: All the conclusions drawn in this post are about the surface air temperature (SAT) data only. Conclusions about ocean heat content should be based on independent analysis of OHC data.]

      • dikranmarsupial

        mgardner wrote “So, whether there is *no* evidence of a non-zero trend (Tamino)”

        I don’t think Tamino has said any such thing. The trend is vanishingly unlikely to be precisely zero, so there is virtually always some evidence that the trend is non-zero.

        A lack of a statistically significant trend does not mean that the trend is zero, basically it just means that you can’t rule out the possibility that the trend might actually be zero (i.e. in isolation it doesn’t necessarily mean very much at all). Statistical tests *are* confusing, because they don’t give a direct answer to the question you actually want to ask. Fortunately I can recommend a very good introductory book on statistics:

        http://www.lulu.com/gb/en/shop/grant-foster/understanding-statistics-basic-theory-and-practice/paperback/product-20680689.html

    • dikranmarsupial

      I don’t see anything unusual or surprising there. GMSTs and OHC are measurements of two different aspects of climate, so there is no problem with the trend in one being statistically significant while the other isn’t.

      Perhaps if you were to give a link to the original statement so we could see the context in which it was made, that would help. It would be easier to understand what kind of feedback you were after if you were to explain what *you* find unusual about these statements.

    • Yes, now I see what the problem is.

      I am talking about using the *conclusion*…
      “Second, there really isn’t reliable evidence of a nonzero trend since 1997, in a purely statistical sense.”
      …in discussing the OHC phenomenon, because that conclusion was arrived at exclusively by evaluating the 17 year period of MST data.

      In other words, the 17 years is *independent* of the longer-term data. It isn’t about whether it’s “a pause”; it’s “17 years with no indication of a non-zero trend in those 17 years”.

      At least that’s how I’m reading it.

      • mgardner:
        I’m presuming that you didn’t do the search that Hank suggested yesterday, for “Robert+Grumbine”+”trends”? If you had, you should have come up with this link: Results on deciding trends.
        The amount of data required for a trend to be significant is not a characteristic or property of regression methods: it is a characteristic or property of the data. Regression is the tool, not the result. Apply the tool to different data, and the result will be different. You can only determine the length of period required for OHC by doing the analysis on OHC, not on surface air temperature.

      • Bob Loblaw,

        I see this as a communication problem, combined with a degree of reflexive tunnel vision for some. My first comment was unfortunate because I used the term “significant” rather than “useful”. But here I explained that I want to use the *conclusion* in *discussing* OHC. OHC increase is a physical phenomenon.

        I didn’t suggest that Tamino’s conclusion about the statistical characteristics of MST data said *anything* about the *statistical characteristics* of the OHC data. Why would you or anyone think that? I may not reason quite as linearly as some would like, but I’m not insane.

        Now, maybe I have misunderstood the conclusion:

        “there really isn’t reliable evidence of a nonzero trend since 1997, in a purely statistical sense.”

        Does it mean that the trend is zero, or does it mean that it could be anything from vertical up to vertical down? John Garland says that there *is* a trend, and shows the code.

        So John Garland is completely wrong that there is statistical significance, or Tamino is saying that there is a zero trend.

        Maybe you guys could sort that out instead of picking on me for making a simple observation about using data *specific to a time span* for reasoning *about that specific time span*.

      • ““Second, there really isn’t reliable evidence of a nonzero trend since 1997, in a purely statistical sense.”
        …in discussing the OHC phenomenon,”

        Since the data is the trend of AIR TEMPERATURES and “OHC” stands for “Ocean Heat Content”, where on earth do you get the idea that what you quote is about discussing the OHC phenomenon?

        Indeed, why do you think it is correct to phrase it “the OHC phenomenon”? You don’t have “the boiling kettle phenomenon” do you?

        Convoluted phrasing means that you’re trying to push across some thought that isn’t conveyed by more straightforward language. But here we have no idea what that is, so please explain it explicitly.

      • You really need to take stats 101. I am correct that by–quite admittedly–cherrypicking a known local min, I can generate a “significant” trend. Tamino is right in that such cherrypicking does not provide reliable evidence.

        Why do you find this so hard?

      • dikranmarsupial

        mgardner, as you had difficulty communicating your questions effectively at SkS as well, can I suggest that perhaps, just perhaps, the problem doesn’t lie with “reflexive tunnel vision” on the part of others, but in your use of language? I would suggest that a better approach would be to try being polite, rather than rude, to those trying to understand your questions and provide the answers.

        It might also help if you actually read other people’s replies to your questions. For instance, you ask:

        ““there really isn’t reliable evidence of a nonzero trend since 1997, in a purely statistical sense.”

        Does it mean that the trend is zero,”

        No, it doesn’t, as I pointed out earlier:

        “A lack of a statistically significant trend does not mean that the trend is zero, basically it just means that you can’t rule out the possibility that the trend might actually be zero (i.e. in isolation it doesn’t necessarily mean very much at all).”

      • John Garland,

        “You really need to take stats 101. I am correct that by–quite admittedly–cherrypicking a known local min, I can generate a “significant” trend. Tamino is right in that such cherrypicking does not provide reliable evidence.”

        Reliable evidence of *what*, in this case? Here’s a thought exercise:

        Let’s say that all the data pre-1997 is non-existent. I’m trying to construct a physical model of energy balance and internal energy exchanges in the climate system. I don’t expect any great precision in my predictions; I’m looking for coarse quantitative relationships.

        So, my question: Is this 17-year MST data useful for that purpose, or is it equivalent to the output of a random number generator?

        I can’t think of any clearer way to say it.

      • I really cannot help you. If you want to throw away a century and a half of work and pretend to start from scratch, do so. That’s just not how science works.

        Let’s pretend that all biology knowledge is nonexistent: How long will it take to re-establish evolution from scratch? Answer: Some while and a lot of work.

      • dikranmarsupial

        mgardner wrote: “So, my question: Is this 17-year MST data useful for that purpose, or is it equivalent to the output of a random number generator?”

        Neither, if you want to “construct a physical model of energy balance and internal energy exchanges in the climate system”, then the 17 year trend may be useful in evaluating that model, but is of no use in constructing one. However that doesn’t mean it is equivalent to the output of a random number generator either – it is the output of the physical system for which you wish to construct a model.

        I’m sorry, but yet again you are not communicating your questions very effectively. I suspect this is because you are trying to run before you can walk and need to spend more time reading and studying before trying to construct models of your own. There is no shame in this, we all have to start somewhere.

      • dikranmarsupial

        mgardner if you are interested in constructing physical models (which is a great thing to be interested in), there are a fair few books that you might find useful, such as:

        Kendal McGuffie and Ann Henderson-Sellers, “The Climate Modelling Primer”, Wiley-Blackwell, ISBN-10: 111994337X (I have the 3rd edition, but there is now a 4th).

        and

        Raymond T. Pierrehumbert , “Principles of Planetary Climate”, Cambridge University Press, ISBN-10: 0521865565

        I think reading either of these books cover to cover would be more useful than asking questions on blogs at this stage, and would be money well spent, and it will save you a lot of time and aggravation.

      • John Garland et al.

        I’ve now stated the question very clearly. And having done that, I observe that you suddenly no longer wish to discuss Statistics 101, indicating that my ‘naive’ interpretation of Tamino’s evaluation holds, but you are uncomfortable admitting it.

        [Response: This is an extreme example of “non sequitur.”]

        As for actually starting at some date like 1997– are we so lacking in confidence in our theory that we think it couldn’t stand that test? Maybe the Denialists really have effectively hijacked the framing of the discussion.

        [Response: Why not just start with data since last Thursday?]

      • We’re, I think, beyond the bounds of rational discourse. And, I’m unsure as to what your game is, but it is clearly a game.

        That said I am totally comfortable saying there is a statistically significant trend in some series in both 1997 and 1999 (my point) that are also not something to generalize to climate over the long haul (Tamino’s point). I would say the same about the nonsignificant trend in most or all series starting in 1998.

        Much OTHER work has shown that climate varies such that using specific short periods is simply not a valid procedure. If this bothers you, I cannot help that.

        That is all and should be sufficient.

      • mgarland,
        I think you need to distinguish whether you are talking about a “statistically significant trend” or a “climatologically significant trend”.

        If you are asking whether 17 years is sufficient time to achieve a statistically significant trend, then the simple answer is “it depends”: it depends on the magnitude of the trend relative to the amount of unrelated variation (noise). If you need a more specific answer you need to ask a more specific question, e.g. “should we be able to detect a trend of X deg. per year in a 17 year time period, given the historical level of background variation?”

        But before you ask that question, you should probably ask whether a 17 year trend would be significant (i.e. relevant) climatologically – i.e. would it tell us anything about the long term trend in the climate? The answer is probably not. You can pick out short periods (e.g. < 20 years) where there are statistically significant negative “trends” in the historical record, and there is a 35 year period from 1940 with no trend at all. But the climate has definitely warmed over the past century. And in climate model runs, periods of 20 years with apparent cooling occur within longer periods of overall warming. So whether the trend since 1997 is “statistically significant” has no real bearing on the question of whether the long term climate is responding to CO2 as expected. AGW theory does not state that there will be a monotonic rise in global surface temperatures, so it cannot be tested with short term trends in surface temperatures.

        BTW I think the main reason John Garland gets a significant trend since 1997 and Tamino doesn’t is that Tamino adjusts for autocorrelation (non-independent errors; the standard model used by JG assumes the errors are independent). The adjustment leads to wider confidence intervals about the trend estimate. But both Tamino’s and JG’s mean estimate of the trend since 1997 is ~0.01 deg C/year. That is the best (albeit highly uncertain) estimate of the answer to “what is the trend since 1997” (if you must ignore all the previous data- which in this case almost certainly leads to an underestimate of the trend).
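
        The effect of an autocorrelation adjustment on the width of the confidence interval can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a simple AR(1) inflation factor, sqrt((1 + rho) / (1 - rho)), which is a common approximation and simpler than the Foster & Rahmstorf (2011) method named in the post. The data are synthetic, and the function names are made up for this example.

```python
import math
import random


def ols_trend(t, y):
    """Ordinary least-squares slope, its white-noise standard error, and residuals."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    return slope, math.sqrt(s2 / sxx), resid


def lag1_autocorr(resid):
    """Sample lag-1 autocorrelation of the residuals."""
    m = sum(resid) / len(resid)
    num = sum((a - m) * (b - m) for a, b in zip(resid[:-1], resid[1:]))
    den = sum((r - m) ** 2 for r in resid)
    return num / den


def ar1_adjusted_se(se, rho):
    """Inflate a white-noise standard error for AR(1) noise (requires -1 < rho < 1)."""
    return se * math.sqrt((1 + rho) / (1 - rho))


# Synthetic monthly anomalies (made up, NOT real data): 17 years of a
# 0.01 deg C/year trend plus AR(1) noise with coefficient 0.6.
rng = random.Random(0)
t = [m / 12.0 for m in range(17 * 12)]
noise, prev = [], 0.0
for _ in t:
    prev = 0.6 * prev + rng.gauss(0.0, 0.1)
    noise.append(prev)
y = [0.01 * ti + e for ti, e in zip(t, noise)]

slope, naive_se, resid = ols_trend(t, y)
rho = lag1_autocorr(resid)
adjusted_se = ar1_adjusted_se(naive_se, rho)
print(f"slope {slope:.4f}  naive SE {naive_se:.4f}  rho {rho:.2f}  adjusted SE {adjusted_se:.4f}")
```

        With positively autocorrelated residuals the adjusted standard error is larger than the naive one, so the same slope estimate can be "significant" under the independent-errors model and not significant after adjustment.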

      • oops- my reply meant for mgardner (not the non-existent mgarland)

      • Jim, A-plus. Sounds like you teach.

        “AGW theory does not state that there will be a monotonic rise in global surface temperatures, so it cannot be tested with short term trends in surface temperatures.”

        I’ve been saying that since I started paying attention to this issue–quite a while. That was based on a coarse view of the magnitudes of the energy content of the parts of the system. Hence my *very first question* about the 30-year figure arrived at “empirically” which nobody was able to answer. (I think of it mostly in terms of spanning quasi-periodicity the existence of which I find convincing, like ENSO and PDO. Once they get past 10 years my confidence in such things diminishes.)

        But the question of “why not start yesterday” is a good one. I think we have entered into a new regime of climate science, where measurement (what you guys call “data”) is catching up with computational ability, and shorter term prediction of *some* phenomena (parts of the system) will be possible.

        I also think we have entered into a new regime simply because the climate *has* changed.

        Isn’t it possible that this obsession with ‘defending’ the MST long-term trend might really be counterproductive in terms of the public education effort? Many others have pointed out that if we now see 5 years of clearly rising MST, Denialists will simply start shouting “internal variability!”, but of course without providing a physical explanation.

        So, I’m looking at the OHC measurements so far, and some other things, and asking why anyone would *want* to argue that the MST for the last 17 years *isn’t* “kinda flat”, and that we should ignore it as “not significant”? As best I can tell, you can make a pretty good argument from an ensemble of shorter term observations that climate theory is doing very well indeed.

        Hence my interest in the Statistics 101 of this 17 year period, which you explain very clearly, thank you.

        The only disagreement I have is when you say “if you must ignore all the previous data- which in this case almost certainly leads to an underestimate of the trend.” That sounds like the reflexive defensiveness I am talking about. The exercise is “starting in 1997”; it stands on its own.

      • mgardner: “Isn’t it possible that this obsession with ‘defending’ the MST long-term trend might really be counterproductive in terms of the public education effort? Many others have pointed out that if we now see 5 years of clearly rising MST, Denialists will simply start shouting ‘internal variability!’, but of course without providing a physical explanation.”

        Unfortunately, there are ways to spin any condition into something that supports one’s interests with regards to public education and the shaping of public opinion. The group of people who identify themselves as climate “skeptics” have shown a complete disregard for the actual science. What they do instead is start with what their audience is probably willing to believe (given the level of education and the willingness to spend the time to think critically about a proposition). They then take single studies, observations, or statistical results and publish interpretations fitted to the interests of their employers. There is nothing that prevents them from misleading the public unless solid evidence can be found that reveals a program designed to misinform. They can always claim that “some scientists say” or “well we did say that ‘using this trend’ shows that . . .” or the Curry route “well, there’s always uncertainty.”

        The public–in general–sees only what it is shown. The public also understands only in an intuitive way how belief is shaped. The subtleties are generally invisible. For example, when a climate scientist and Anthony Watts appear on PBS, many people might say, “Well, I don’t believe that Watts guy. He’s not a scientist.” Score 1 for climate science? No. Watts has been given authority far beyond his merit, and it’s possible that even though a person might go with the climate scientist in that head to head situation, their belief in the science is weakened slightly.

        The problem is not the focus on this or that GMST trend. The problem is that people are willing to mislead the general public in the interests of a relative handful of people. If the GMST claims are not addressed, “skeptics” will say “climate scientists are unwilling to address these issues.” Yes, it’s a never-ending battle. It doesn’t matter what approaches, strategies, and/or techniques are used re the communication of the science. Liars have the entire book of tricks open to them. There are two primary ways to fight these misinformers, short of violence: 1) teach people how to think and 2) use the rule of law. Neither is a perfect solution. Each has a range of problems that the doubtful will claim renders the methods useless.

        Some people might say that what Tamino does and SkS does is useless, because the only people who read these sites are those who have read the science and those, like Watts, who are engaging in “know thy enemy.” I see the value of this site and similar sites as places where antibodies are created. I teach critical thinking. I use the progression of arguments on these sites as examples of good critical thinking. I use the comment streams of sites like WUWT as examples of failed critical thinking. The primary difference between the two is the willingness to explain choices. Here, the explanation is fully available. There, animosity and suspicion occur when one asks for such explanation. If I knew nothing about climate science, that would be enough to divvy up trust points, but I’m trained to look for subtext and the general public is not.

        It’s important to me, then, as a teacher changing the thought process of 40-80 students each semester, to have a discussion of GMST trends, to be able to say, “Look: people who express real power, who shape public opinion, are willing to show you this graph and tell you it means that global warming has stopped.” GMST is especially useful because it’s in one’s face, to some extent, every day.

      • bananastrings,

        Thanks for a polite and thoughtful reply. You say:

        “GMST is especially useful because it’s in one’s face, to some extent, every day.”

        All too true. And it’s there because right now it’s useful to the Denialists, and when it trends up they will change the subject yet again. So, while these sites do a great service, they are primarily reactive, allowing the other side to frame the debate.

        Now, you have your experience of teaching and I have mine. Mine tells me that if someone finds some of the Denialist arguments reasonable, he is not likely to be swayed by some of these more abstract concepts and arguments. Why? Because most of those arguments operate through distortions, misrepresentations, and misconceptions, *at a very basic level*. It’s like trying to convince someone that an SUV is actually less safe than a sedan– I’m sure you can articulate the psychology behind that kind of thing better than I can. “Umm. Masculine square lines! Can’t flip over!”

        You can show that graph, nicely done by Tamino, and perhaps your kind of student will ‘get it’. But the kind of student I’m thinking about will still see that “kinda flat” area at the end, and say “you want me to believe you, Mr G, or my lyin’ eyes?”

        My counter to that– and it is perfectly scientifically correct– is not to act as if I’m on the whiny defensive and say “but the heat is going into the oceans” or “but it’s volcanoes”, or, worse yet, “see, it’s that statistical significance you’re not getting”. That puts me on the same footing as the Denialists. I would demote MST to its proper place as being *determined* by all these other phenomena– the volcanoes, the melting ice, the ocean heat transfer, and so on. We are fast acquiring “skill”, as they say, with all these concrete, *easily relatable* phenomena. Let’s put that together as a package and show how you get the ups and downs of average temperatures. Those students I’m talking about, believe it or not, will find that it makes sense.

        Well, I’ve offended the statistics people, so if you are offended as well by my incursion onto your turf, feel free to (attempt to) slap me down. But I am interested in your tactical opinions.

      • bananastrings, I like your comment about critical thinking in relation to GMST.

        But I do have one slight cavil about this sentence:

        “There is nothing that prevents them from misleading the public unless solid evidence can be found that reveals a program designed to misinform.”

        The cavil? I think that there’s ample evidence of just such a program. Sadly, it hasn’t made a huge difference, as far as I can tell. I wrote about one instance here; there are of course many others (such as “Merchants of Doubt”):

        http://doc-snow.hubpages.com/hub/Climate-Cover-Up-A-Review

      • Jim,

        OK, I will be more precise.

        You are defending the *choice* not to ignore the pre-1997 data. But you aren’t really defending it except to say that “if you ignore it the trend will be lower than it really is”, which is simply circular reasoning.

        I don’t know how carefully you read my last two comments, but to me what you are doing is a reflection of the Denialist argument. If the earlier data so strongly informs the later, then we must assume that the system hasn’t changed all that much physically. Then, anything that happens “happened before”, and of course for the same physical reason. It borders on the truly naive picture of a nice gradual uniform temperature increase, which sometime in the future might *cause* some negative consequence. Completely wrong.

        For me, there’s absolutely nothing about seeing a .01/year (rather than .017) change over 17 years that shakes my acceptance of the science. In fact, it reinforces it, because it fits with other phenomena that I’m now aware of.

        You correctly pointed out that we should not expect a monotonic increase in MST, I assume because you understand the nature of a complex system of this order. So, what, now we’ve slammed it with an enormous slug of (unevenly distributed) energy– is that supposed to smooth things out?

        (How this works in the public debate I will discuss if bananastrings or anyone replies on that topic.)

        [Response: Please don’t.]

    • Yes, gardner, it’s a silly question and so you get what you consider a silly response.

      However, look at the question YOU ACTUALLY ASKED and you’ll see my answer is 100% relevant and applicable.

      It just wasn’t the answer you wanted; hence, to retain your own sense of utility, you have described it as “silly”.

    • mgardner
      Re: “The only disagreement I have is when you say ‘if you must ignore all the previous data- which in this case almost certainly leads to an underestimate of the trend.’ That sounds like the reflexive defensiveness I am talking about. The exercise is ‘starting in 1997’; it stands on its own.”

      I’m not “defending” the pre-1997 trend, I just think there is information in the pre-1997 data that can inform our best estimate of the trend since 1997. One way would be to treat the estimated pre-1997 trend (including its uncertainty) as a prior distribution for the post-1997 trend in a Bayesian analysis.
      Another way is to use the pre-1997 data to constrain the intercept for the post-1997 trend, so that the overall model of mean temperature is a continuous function of time.
      If you do that you get a much higher estimate of the short term trend (since 1997):

      yearX=1997
      timec=time-yearX #centre time on that year
      before=ifelse(timec>0,0,timec) #dummy variable for time before yearX
      after=ifelse(timec<0,0,timec)
      […]
                  Estimate Std. Error t value Pr(>|t|)
      (Intercept) 0.355422   0.031340  11.341 9.57e-13 ***
      before      0.018065   0.003050   5.924 1.35e-06 ***
      after       0.016071   0.004018   4.000 0.000351 ***

      The before trend has no influence on the after trend in this analysis, except in influencing the shared intercept (endpoint).

      When you ignore all the pre-1997 data, you are effectively adding an extra intercept (or a step change) to the overall trend model, i.e.

      step=ifelse(timec< […]
                  Estimate Std. Error t value Pr(>|t|)
      (Intercept) 0.291404   0.042000   6.938 8.78e-08 ***
      before      0.013140   0.003684   3.567 0.0012 **
      after       0.009879   0.004770   2.071 0.0467 *
      step        0.128001   0.059389   2.155 0.0390 *

      So you would now be saying that the trends pre- and post-1997 were similar, but there was a sudden, and permanent, jump in temps in 1997. I don’t know if that’s a physically plausible model, but even if it is, it still leads to the conclusion that on average ~0.017 degrees per year has been added to global temps since 1996 (and since I think of a trend as the average rate of change, I prefer to estimate that directly with the continuous model. I also prefer to do it in a Bayesian context, averaging over models with trend changes in all possible years, but the conclusion is much the same).
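
      For anyone who wants to experiment with the three models discussed above (continuous, continuous plus step, and a free fit to the post-1997 data only), here is a rough, dependency-free sketch. Everything in it is an assumption made for illustration: the data are synthetic (a steady 0.017 deg/year trend with one made-up warm spike in 1998), the step is taken to switch on at 1997 itself, and `least_squares` is a small helper written for this example, not a library routine.

```python
def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    A = []
    for i in range(p):
        row = [sum(r[i] * r[j] for r in X) for j in range(p)]
        row.append(sum(r[i] * yi for r, yi in zip(X, y)))
        A.append(row)  # augmented matrix [X'X | X'y]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


years = list(range(1975, 2014))
timec = [yr - 1997 for yr in years]                 # time centred on 1997
before = [min(t, 0) for t in timec]                 # time before 1997, else 0
after = [max(t, 0) for t in timec]                  # time after 1997, else 0
step = [1.0 if t >= 0 else 0.0 for t in timec]      # assumed form of the step term

# Synthetic annual anomalies (NOT real data): steady trend plus a 1998 spike.
temp = [0.35 + 0.017 * t + (0.2 if t == 1 else 0.0) for t in timec]

# Continuous model: one shared intercept at 1997, separate slopes either side.
continuous = least_squares([[1.0, b, a] for b, a in zip(before, after)], temp)

# Step model: as above, plus an extra intercept from 1997 onward.
with_step = least_squares(
    [[1.0, b, a, s] for b, a, s in zip(before, after, step)], temp)

# Free fit: a straight line through the 1997-onward data only.
post = [(t, y) for t, y in zip(timec, temp) if t >= 0]
free = least_squares([[1.0, t] for t, _ in post], [y for _, y in post])

print("continuous 'after' slope:", continuous[2])
print("step-model 'after' slope:", with_step[2])
print("free-fit slope          :", free[1])
```

      In this toy setup the step model and the free fit give the same post-1997 slope, which is the sense in which ignoring the earlier data amounts to adding a step. The early warm spike pulls that slope down further than it pulls the slope of the continuous model, whose intercept is pinned by the earlier data.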

      • mgardner,
        “You are defending the *choice* not to ignore the pre-1997 data. But you aren’t really defending it except to say that “if you ignore it the trend will be lower than it really is”, which is simply circular reasoning.”

        What’s circular? I do think that “if you ignore it (previous data) the [estimated] trend will be lower than it [the actual trend] really is”. So I think you should use all the data *to avoid being wrong*. It’s not the ‘lower than’ part that bothers me, it’s the *than it really is* part. I want to get the right answer. There will be times when ignoring the previous data will lead to an overestimate of the trend, and I will still be arguing that it’s the wrong way to estimate the trend. Even if you are comfortable with sudden, persistent jumps (step changes) in the overall trend, you would need all of the data to estimate their magnitude and timing (unless you know a priori when they should occur?). Otherwise your model will be missing some ‘heat’ (in the case of 1997). See also Gerg’s comment below (2nd para).

  19. Horatio Algeranon

    “Pausography”
    — by Horatio Algeranon

    A pause is like obscenity
    I know it when I see it
    Statistics are inanity
    To see it’s to believe it

  20. mgardner, just checking — you haven’t taken Statistics 101, right?

  21. Richard Simons

    “17 years with no indication of a non-zero trend in those 17 years”.
    Who are you quoting here? That statement is contradicted by the figure Tamino gave in the OP. To say that a trend is non-significant is not the same as saying that there is no trend.

  22. Mgardner, the reason I ask: “you haven’t taken Statistics 101, right?”

    Is that the questions you’re asking and the rephrasings you keep suggesting are very familiar — this is the tussle people go through when they have started to understand but don’t yet grasp what statistics can and can’t be used for.

    But if you’re not already in a Stat 101 class, you ought to read the book Tamino wrote as a basic introduction. See the top of the page.

    If you are in Stat 101, or completed the course — review would help.

  23. Horatio Algeranon

    “Pausing on the Slippery Slope”
    — by Horatio Algeranon

    When standard tests are tossed aside
    Science rests on slippery slide

    • Perhaps Horatio, but observe the weakness of standard tests of trend significance in this application. As the window shortens the error bars on its “free-fitted” trend become so large as to render almost any conclusion possible.

      Statistics is the discipline by which we attempt to constrain the most powerful pattern recognition engine in the known universe. The little chunk “standard tests” can be puny. Whereat Bayes?

      • dikranmarsupial

        That [the broadening of the width of the error bars as the window shortens] is not a weakness of standard tests, it is exactly what they should do. The more evidence you have, the stronger the conclusions you can draw, and to have more evidence, you need more data. If you reduce the amount of data, you also reduce the amount of evidence available and hence can’t draw as strong conclusions from the observations. Bayesian tests are no different in this respect (in fact for a suitable choice of priors, the credible interval on the slope for Bayesian regression will coincide with the confidence interval for the trend from a frequentist analysis). The Bayesian framework on the other hand does a better job of quantifying the uncertainty in a way that directly answers the questions we most want to ask.

        The real problem with frequentist hypothesis tests is that people using them either choose H0 to be the hypothesis that they want to be true, and/or ignore the statistical power of the test.
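
        How quickly the error bars broaden as the window shortens can be sketched with the textbook formula for the standard error of a least-squares slope on equally spaced annual points. This is a simplified illustration only: it assumes white noise with a known standard deviation (the value of `sigma` below is hypothetical), and it ignores autocorrelation and the t-distribution correction, both of which widen the interval further.

```python
import math


def slope_se(n_years, sigma):
    """White-noise standard error of an OLS slope fitted to n_years annual points."""
    # Sum of squared deviations of 1..n from their mean is n(n^2 - 1)/12.
    sxx = n_years * (n_years ** 2 - 1) / 12.0
    return sigma / math.sqrt(sxx)


sigma = 0.1  # hypothetical noise level in deg C, chosen only for illustration
ses = {n: slope_se(n, sigma) for n in (40, 30, 20, 17, 10)}
for n, se in ses.items():
    print(f"{n:2d} years: slope SE = {se:.4f} deg C/year")
```

        Because the standard error scales roughly as the window length to the power -3/2, halving the window nearly triples the uncertainty in the trend.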

      • The test is weak because it fails to notice that recent data are prima facie evidence for a warming acceleration, not pause. That’s right, acceleration (21st century monthly residuals skew above the long term trend, though unlikely significantly so).

        A specific weakness I had in mind here lies with the “free-fitted” comment. The windowed trend is fitted without reference to what went before, and so will in general be offset from that. The actual offset for a post-2000 trend is substantial and positive. Effectively, the combined function* being fitted comprises a pre-2000 linear trend (the null), then a positive step change, followed (unsurprisingly!) by a much flatter post-2000 trend (which we test).

        What we call standard statistics is rooted in the first half of last century, or earlier. The discipline needs to move on. Has, I think; just not the bit we use.

        (* A total of 6 parameters there BTW: three for each trend, including the bounds. Parsimony?)

      • dikranmarsupial

        Gerg writes “(21st century monthly residuals skew above the long term trend, though unlikely significantly so)”

        Unless they actually are “significantly so”, you shouldn’t be claiming that there *is* an acceleration based on this evidence, just as others shouldn’t claim that there *is* a pause in warming unless there is significant evidence that such a pause exists (using a reasonable test).

        As I said, the result of the test is correct – we can’t rule out the existence of a pause based on 17 years’ worth of data, because there is insufficient data (given the expected trend size and noise level). We can’t rule out the possibility that there has been no change either (or for that matter a modest acceleration).

        BTW, I doubt there is any meaningful skew in the trends based on the evidence you have presented (and I suspect autocorrelation may also be an issue here).

  24. “The only disagreement I have is when you say “if you must ignore all the previous data- which in this case almost certainly leads to an underestimate of the trend.” That sounds like the reflexive defensiveness I am talking about.”

    This sounds like a faked problem evoked to allow you to ignore the statement without having to explain why.

    Sounds like evasiveness.

  25. Horatio Algeranon

    “Uncertain Tease”
    — by Horatio Algeranon

    When something is uncertain
    It begs for speculatin’
    ‘Bout what’s behind the curtain
    Though naught may be there waitin’