Big Difference

Tim Curtin’s paper in TSWJ isn’t the first time he’s mis-applied the Durbin-Watson test in order to justify rejecting a regression of physical variables (he substituted regression of the differenced values, after which he found the regression not statistically significant). He also does so in this precursor, in which he explicitly states (regarding his first regression)

… the Durbin-Watson statistic at 1.313, which is well below the benchmark 2.0 …

I don’t see any other interpretation than that he was using two (in fact, two point zero) as his critical test value, which we have already mentioned is completely wrong.
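
For reference, here is the standard textbook definition of the statistic (not something quoted from Curtin's paper), with e_t the regression residuals and n the sample size; the symbol d is introduced here for illustration:

d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}.

It can range from 0 to 4. Values near 2 are what uncorrelated residuals tend to give, values below 2 point toward positive lag-1 autocorrelation, and values above 2 toward negative. So 2 is where the statistic sits when nothing is wrong, not a threshold. Whether a given departure from 2 is significant is judged against tabulated lower and upper bounds (usually written d_L and d_U) that depend on the sample size and the number of regressors.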

Curtin claimed that the absence of autocorrelation is required for valid regression, which is also wrong. Nonetheless he uses that claim to justify requiring regression be performed on differenced variables. Curtin isn’t the first (and won’t be the last) to claim that regression of climate variables like global temperature should be done using differenced variables. For example, a recent commenter on RealClimate by the handle “t.marvell” did the same, justifying it by insisting that global temperature was not a stationary time series. Of course it’s not stationary — it shows a trend!

Neither of those individuals seems to understand the impact that first-differencing has on regression analysis, especially when the causal relationship we’re interested in has to do with the trends which are present. Let’s give that some consideration.

First let’s consider the simple case in which some variable of interest, say y (it might be global temperature), is related to some other variable of interest, say x (which might be climate forcing due to CO2). Suppose there’s a strict linear relationship between them, but the variable y also includes random noise

y_t = \beta_o + \beta_1 x_t + \epsilon_t.

The coefficients \beta_o and \beta_1 are the intercept and the slope of the line relating y to x. The quantities \epsilon_t are random noise, which may or may not be white noise, i.e., it may or may not show autocorrelation. If we regress y on x, we get estimates of the coefficients \beta_o and \beta_1. We can also compute the residuals, which are estimates of the noise values \epsilon_t, with which we can test whether or not they show significant autocorrelation. We can further test whether the regression itself is statistically significant — but if the noise shows autocorrelation that has to be taken into account when testing for such significance.

Let’s create some artificial data for an example. Let the x variable follow a straight-line trend plus just a little bit of noise:

We’ll define the variable y as a linear function of x plus white noise (with no autocorrelation)

y = 0.01 x + \epsilon.

which looks like this:

These two variables are strongly correlated, as is clear if we “normalize” them and plot them on the same graph:

Take note that the principal source of correlation is the fact that they both show a strong upward trend.
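
Readers who want to play along can build similar data themselves. The sketch below is a minimal stand-in, not the code behind the figures: the sample size, the two noise levels, and the random seed are all assumptions, and only the slope 0.01 comes from the definition of y above.

```python
import random
import statistics

# Illustrative assumptions, NOT the values used for the post's figures.
N = 100            # number of time steps (assumed)
TRUE_SLOPE = 0.01  # slope in y = 0.01 x + noise (from the post)
X_NOISE_SD = 0.3   # "just a little bit of noise" on the trend in x (assumed)
Y_NOISE_SD = 0.2   # white-noise level in y (assumed)


def make_data(seed=1):
    """Return (x, y): x is a straight-line trend plus small noise,
    y is TRUE_SLOPE * x plus white noise."""
    rng = random.Random(seed)
    x = [t + rng.gauss(0.0, X_NOISE_SD) for t in range(N)]
    y = [TRUE_SLOPE * xi + rng.gauss(0.0, Y_NOISE_SD) for xi in x]
    return x, y


def normalize(v):
    """Subtract the mean and divide by the standard deviation."""
    m = statistics.fmean(v)
    s = statistics.pstdev(v)
    return [(vi - m) / s for vi in v]


def correlation(a, b):
    """Pearson correlation as the mean product of normalized values."""
    za, zb = normalize(a), normalize(b)
    return statistics.fmean([p * q for p, q in zip(za, zb)])


x, y = make_data()
r = correlation(x, y)
print(f"correlation between x and y: {r:.2f}")
```

With these assumed settings the shared trend dominates the noise, so the correlation should come out high; change the seed or the noise levels to see how much it moves.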

Of course, correlation isn’t causation. With this artificial data, we know that x causes y because the data were designed that way. With real climate data, we know that climate forcing causes temperature change because of the laws of physics.

If we do a linear regression of y on x we get this:

The slope of the best-fit line is 0.0103 +/- 0.0008 and the fit is certainly statistically significant. We can test for autocorrelation in the residuals, either by computing the sample autocorrelation function (which indicates no statistically significant autocorrelation, as we already knew), or by performing the Durbin-Watson test. The DW statistic is 1.606, which for this sample size likewise does not indicate statistically significant autocorrelation. Bottom line: the fit is significant and the slope is correct within its error limits. In other words, we got the right answer.
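
The regression and the Durbin-Watson statistic are both simple enough to compute by hand. Here is a minimal sketch on stand-in data (the sample size, noise levels, and seed are assumptions, so the numbers will not match those quoted above). Note that the standard error computed here is the ordinary white-noise one, which is only appropriate because this noise really is white.

```python
import math
import random

# Stand-in data; sample size, noise levels and seed are assumptions.
rng = random.Random(1)
x = [t + rng.gauss(0.0, 0.3) for t in range(100)]
y = [0.01 * xi + rng.gauss(0.0, 0.2) for xi in x]


def ols(x, y):
    """Least-squares fit of y on x with an intercept.
    Returns (intercept, slope, slope_se, residuals); the standard
    error assumes uncorrelated (white) noise."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)  # residual variance
    return intercept, slope, math.sqrt(s2 / sxx), resid


def durbin_watson(resid):
    """Sum of squared successive differences of the residuals,
    divided by the sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den


intercept, slope, se, resid = ols(x, y)
dw = durbin_watson(resid)
print(f"slope = {slope:.4f} +/- {2 * se:.4f} (2-sigma)")
print(f"Durbin-Watson = {dw:.3f}")
```

The DW value printed will bounce around 2 from one seed to the next, which is the point: landing a bit below 2 is ordinary sampling scatter, not evidence of autocorrelation.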

Now suppose we applied Tim Curtin’s methodology. Since the DW statistic is less than 2, the regression has to be rejected in favor of regressing first-differenced variables. Therefore we define

X_t = x_t - x_{t-1},


Y_t = y_t - y_{t-1}.

Now let’s regress Y on X. That gives this:

This fit certainly doesn’t look very good. We find that it’s not statistically significant. The slope of the line is 0.03 +/- 0.09, which is actually bigger than the real slope! But the probable error is so large the result is meaningless. In other words, we got the wrong answer.
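
To see how much precision is lost, one can fit the same stand-in data both ways and compare the standard errors. This is a sketch under assumed settings (sample size, noise levels, seed), not a reproduction of the figures; the standard errors are the ordinary white-noise ones, which are only approximate for the differenced fit.

```python
import math
import random

# Stand-in data; sample size, noise levels and seed are assumptions.
rng = random.Random(1)
x = [t + rng.gauss(0.0, 0.3) for t in range(100)]
y = [0.01 * xi + rng.gauss(0.0, 0.2) for xi in x]


def ols(x, y):
    """Least-squares fit with intercept; returns (slope, slope_se).
    The standard error assumes white noise."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def diff(v):
    """First difference: v[t] - v[t-1]."""
    return [v[t] - v[t - 1] for t in range(1, len(v))]


slope_lev, se_lev = ols(x, y)                # original variables
slope_diff, se_diff = ols(diff(x), diff(y))  # first-differenced variables

print(f"levels:      slope = {slope_lev:.4f}, se = {se_lev:.4f}")
print(f"differenced: slope = {slope_diff:.4f}, se = {se_diff:.4f}")
print(f"se ratio (differenced / levels) = {se_diff / se_lev:.0f}")
```

The reason is visible in the formula for the standard error: it scales inversely with the spread of the regressor, and differencing throws away the trend that supplied nearly all of that spread.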

The sample autocorrelation function now indicates that there is autocorrelation in the residuals (at least at lag 1), but it’s negative. The Durbin-Watson statistic is 2.886, which this time is significant, but what it indicates is negative lag-1 autocorrelation. But by Tim Curtin’s criterion we would conclude that there’s no autocorrelation, and by the logic he used in his paper we should conclude that there’s no relationship between x and y because there’s no statistically significant relationship between X and Y. Again, that’s the wrong answer.
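
A rough way to read the two DW values is the usual large-sample approximation relating the statistic (call it d) to the lag-1 sample autocorrelation of the residuals (call it r_1); both symbols are introduced here for illustration:

r_1 \approx 1 - d/2,

d = 1.606: \quad r_1 \approx 1 - 0.803 = 0.197,

d = 2.886: \quad r_1 \approx 1 - 1.443 = -0.443.

These are back-of-envelope values implied by the approximation, not the sample autocorrelations actually computed for the figures. They do show why "less than 2" and "more than 2" are not the question: the sign tells you the direction, and the size relative to the sample tells you whether it matters.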

The essential problem is that when we first-differenced the variables we removed the trend from each. But there’s another problem too. Suppose we model Y as a function of X using a straight line:

Y = \beta_o + \beta_1 X.

If we “integrate” this model (i.e., reverse the first-differencing step), we get

y = C + \beta_o t + \beta_1 x,

where C is some constant. In other words, by including an intercept (\beta_o) for the straight-line fit of the first-differenced variables, we automatically include a time trend in the undifferenced variables. But, since the variable x very nearly follows a straight line, we now have the problem of collinearity, that two of our regressors are very nearly equivalent. That can wreak havoc with regression.
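
For anyone who wants the integration step spelled out, here is a short derivation. It assumes the differenced model holds at each time step s = 1, ..., t, and writes x_0 and y_0 for the (assumed) starting values of the undifferenced series. Summing both sides from s = 1 to t:

\sum_{s=1}^{t} (y_s - y_{s-1}) = \sum_{s=1}^{t} \left[ \beta_o + \beta_1 (x_s - x_{s-1}) \right].

Both differenced sums telescope, leaving

y_t - y_0 = \beta_o t + \beta_1 (x_t - x_0),

so that

y_t = (y_0 - \beta_1 x_0) + \beta_o t + \beta_1 x_t,

which identifies the constant as C = y_0 - \beta_1 x_0.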

If we start with our original model (which is the correct one because that’s how we designed the data), and apply first-differencing to it, we get this:

Y_t = \beta_1 X_t + \epsilon_t - \epsilon_{t-1}.

Notice: there’s no intercept term in this model. Notice also that the noise is no longer white noise, it’s first-differenced white noise (a.k.a. MA(1) noise), which is why there’s (negative) autocorrelation in the residuals of the fit of Y on X. By the way, we can fit this model (with no intercept term) to our first-differenced data. Then we get a slope estimate of 0.013, which is much closer to the right answer, but the 2-sigma uncertainty is +/- 0.015 and the fit is not statistically significant. That’s because when you first-difference the variables, but don’t confound things by inserting a spurious intercept, the uncertainty in your result is increased dramatically.
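
Both claims can be checked on stand-in data. For white noise with variance \sigma^2, the differenced noise \epsilon_t - \epsilon_{t-1} has variance 2\sigma^2 and lag-1 covariance -\sigma^2, so its lag-1 autocorrelation is -1/2. The sketch below fits the no-intercept model and looks at the residuals; the sample size, noise levels, and seed are assumptions, and the standard error shown is the ordinary white-noise one, which is only a rough guide here precisely because the noise is no longer white.

```python
import math
import random

# Stand-in data; sample size, noise levels and seed are assumptions.
rng = random.Random(1)
x = [t + rng.gauss(0.0, 0.3) for t in range(100)]
y = [0.01 * xi + rng.gauss(0.0, 0.2) for xi in x]

# First-differenced variables.
X = [x[t] - x[t - 1] for t in range(1, len(x))]
Y = [y[t] - y[t - 1] for t in range(1, len(y))]


def fit_through_origin(X, Y):
    """Least-squares fit of Y = slope * X with no intercept.
    Returns (slope, slope_se, residuals); se assumes white noise."""
    sxx = sum(Xi * Xi for Xi in X)
    slope = sum(Xi * Yi for Xi, Yi in zip(X, Y)) / sxx
    resid = [Yi - slope * Xi for Xi, Yi in zip(X, Y)]
    s2 = sum(e * e for e in resid) / (len(X) - 1)
    return slope, math.sqrt(s2 / sxx), resid


def lag1_autocorr(e):
    """Lag-1 sample autocorrelation of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((ei - m) ** 2 for ei in e)
    return num / den


slope0, se0, resid0 = fit_through_origin(X, Y)
r1 = lag1_autocorr(resid0)
print(f"no-intercept slope = {slope0:.4f} +/- {2 * se0:.4f} (nominal 2-sigma)")
print(f"lag-1 autocorrelation of residuals = {r1:.2f}")
```

The residual autocorrelation should land in the neighborhood of -0.5, which is the MA(1) signature that differencing itself put there.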

There are situations in which analyzing first-differenced variables is extremely useful, sometimes even necessary. If, for instance, the noise had a “unit root” then we might want to difference the variables. Or, we might be primarily interested in the effect of the short-term fluctuations of x on those of y, in which case we could first-difference for the specific purpose of removing the trend from each. But to know the overall impact of our artificial x on our artificial y we want to avoid differencing, because we know how the data were constructed. To know the overall climate impact of greenhouse-gas forcing on global temperature, we likewise want to avoid differencing, because of the laws of physics.

As I said, lots of folks want to first-difference climate variables, but there’s no justification for doing so. More to the point, I doubt that many of them (or perhaps even any of them) really understand the impact of what they’re doing. But they sure seem to like the answers it gives ’em.

30 responses to “Big Difference”

  1. Meaty, especially for one such as meself whose statistical smarts are pretty limited. I’ll be coming back to this one.

  2. Over and over again, it seems like contrarians could really benefit from the exercise of applying their methods to artificial data.

    For the Tim Curtins of the world, this is easy, because it is just a global temperature trend plus noise.

    But for the more sophisticated skeptic, this may require actual climate model data. Here, I think of the various skeptics who get confused by the fact that changes in the rate of CO2 growth from year to year are correlated with temperature, and jump to the conclusion that therefore CO2 growth is natural*. Some actually try to dig into the isotope differences to add further fuel to their mental fire. But none of them take a real carbon cycle model** to see if, just maybe, this same behavior exists in computer models where we _know_ what is happening because we programmed it that way…

    *Off the top of my head I can think of Spencer, Salby, Essenhigh, all of whom have done real science in their own fields and shouldn’t have fallen for something so simple.

    **And I mean a real model, not the Bern cycle approximation, even if people like Eschenbach think that the IPCC actually bases its projections on a 4 exponential sum.

  3. Ah, but when we apply the D-K test to Tim Curtin, we find the result very significant.

  4. But if you test your analysis by applying it to artificial data, and then somebody hacks into your computer and steals the files, they can trumpet it as evidence that you are “fabricating data.”

  5. tamino-sama,

    What would happen if you applied the traditional three levels of the ADF test to the data here? Could you use that as an example of how to apply the said test? (<== hidden agenda alert)

    [Response: I don’t know, but we do know that there’s no unit root! If you want, I suggest downloading the “cadf” package for R and running it — I’m sure you can generate your own artificial data.]

  6. Three models, I mean–no intercept, intercept, and intercept plus time term.

  7. Good debunking!

    Tim Curtin is in major denial. He doesn’t understand the basic realities of the physical world and tries to manipulate data to suit his delusions. I have argued with him before about his egregiously incorrect statements about increased agricultural production from the past 30 years being solely due to CO2 increases. He fails to understand that CO2 is not the limiter of plant growth, water is. He likes to use the trope of CO2 is plant food, despite the fact that it only improves water use efficiency when all other nutrients are unlimited.

    Basically, Tim Curtin’s claims don’t pass peer review, doesn’t surprise me his maths doesn’t either.

  8. Horatio Algeranon

    “Significant Difference”

    — by Horatio Algeranon

    The difference between the science and math
    Can not be seen from just a graph
    The science requires reality’s touch
    And mathematics, not so much.

    The difference between the scientist and fake
    Cannot be seen from what they bake
    The scientist requires confirmation
    The pseudo-skeptic, mathturbation.

  9. Tom Passin

    One can put the argument into slightly different terms, which might be more familiar for some folks. Differencing is similar to differentiation, and both act as a high pass filter. This either exaggerates high frequency noise or reduces (or eliminates) low frequency signals, whichever way you like to look at it. If you try to relate the filtered signals to each other, you cannot see much if any low frequency behavior, because it’s been filtered out already.

    You can’t come to any legitimate conclusions about low frequency behavior if you filter it out beforehand!

  10. From the paper: “This result validates the null hypothesis of no statistically significant influence,” and, “The regression results in the previous Section confirm the first null…”

    I’m not a statistician, but just try to be careful in my own statistical analysis. My impression was that it’s impossible to confirm/validate a null hypothesis with respect to any time series, distribution, etc. There’s always the alternative hypothesis that there’s some unknown, purely deterministic source for whatever signal you’re measuring. Alternatively, you can add unbiased noise to any data set to make it impossible to reject a hypothesis, which seems to be what’s effectively going on here.

  11. Hank Roberts


    “… a statistical test cannot prove the null hypothesis.”

  12. Good grief … did he really take the first difference of both sides of the equation?

    The point of taking differences is to undo an integration … if you differentiate both sides, you haven’t accomplished anything, other than to amplify noise and render the slope meaningless!

  13. Thanks for this post, it’s helped to further my own understanding. I tried to get Tim to understand these issues via gentle prodding here. Despite my own inexperience with time-series analysis of this nature, I knew that differencing couldn’t be a correct analysis because similar simulations to those in this post led me to the understanding that differencing is a low frequency filter (a poor one at that – further investigations yielded that de-trending [ideally with splines] is more useful, and in Tim’s case, a more honest technique for his purposes). I then tried pointing Tim towards independent expert advice here with yearly CO2 and temp time series unlabelled to try and remove possible prejudice from knowing the data set, but still no understanding of the issues from Tim. Although I doubt he read the stats.stackexchange post.

  14. Tom Passin

    Here’s a nice bit about how and why you can’t prove the null hypothesis by statistical methods:

    tl;dr: to calculate a p-value you assume that the null hypothesis is true. Because it’s assumed true, you can’t turn around and use a low p-value to establish that it’s true.

  15. Hank Roberts

    somewhere out there I noticed TC now says he got something wrong but his conclusion is still correct. TC, say it here?

    [Response: Tim Curtin is no longer allowed to comment here.]

  16. OK, thanks. Deltoid will sort him out if anyone can, or tell him if he’s gotten something right; not holding breath tho’.

    It’s amazing what sometimes does get published before statisticians weigh in. My Statistics teacher, almost 40 years ago, said there was one thing we should all take away from the class: consult a statistician _before_ collecting data and trying to analyze it, not afterward.

    Because afterward, they will tell you you did it all wrong, quite likely.
    Retraction Watch

    “The statistical principles underlying the analysis [1] are literally universal. Apart from genetics, they apply to the behaviour of tiny particles (e.g. mass-velocity of atoms) and galaxies (e.g. Doppler shifts), and to analyses of the extremes of time (e.g. the speed of light and the slowest radioactive decay). An exception to these mathematical principles would shake the basis of most of modern scientific knowledge and understanding.”

  17. Tamino, I’m wondering if you can comment on this post that I stumbled upon while looking for R help on a completely unrelated subject. I don’t have sufficient statistical expertise to evaluate the merits of this:
    “The motivation is the observation that the proxies are only weakly correlated with the temperatures during the observation period 1850-2000. Furthermore, standard Lasso regression which has been used in the past for temperature reconstruction is sensitive to spurious correlations due to autocorrelated inputs. And some of the over 1200 proxies are autocorrelated. Thus, in case the relation between proxies and temperatures is indeed nonexistent, that is if proxies had no predictive power about past temperatures, the lasso would still find “relevant” predictors and will happily fit a model to the observations.”

  18. oh, dear:
    claims to be peer reviewed and based on astrophysics.

  19. Hank, that article was hilarious. Spelling mistakes and odd terms used in a peer reviewed paper?? Not likely. Completely missing the basic principles of excitation and energy dispersion. Lols!

  20. thomas marvell

    If one regresses one non-stationary variable on another, one gets a spurious regression (the standard errors are much too small). In that case, one must difference the variables, unless the two variables are cointegrated. The latter means that the residual in the regression is stationary – i.e., that the two variables tend not to move too far apart over the long term. In the example in the post, the two variables would probably be cointegrated, although one would have to do a cointegration test to determine whether that is the case. Neither correlations nor cointegration establish the causal direction, and one cannot assume that temperature changes do not cause changes in greenhouse gas levels. On all these points, see Kaufmann, Kauppi, and Stock, Emissions, Concentrations & Temperature: a Time Series Analysis. Climatic Change (2006) 77: 249-278.

    [Response: Neither temperature nor CO2 is a stochastic process, and the evidence I see is pretty strong that the stochastic component is stationary. But you (and your reference) are among those who think they can get to the heart of the matter by completely ignoring the most relevant scientific discipline. It’s called “physics.”

    What’s that ringing sound I hear? It’s the “clue phone.” It’s for you.]

  21. Tim is still spouting his numerology (his fundamental lack of understanding of the physics), this time over at RealClimate:

    “More generally, why is it that the expert econometricians here and at Lambert’s and Tamino’s never themselves undertake and report regressions rejecting the nul that increases in GHGs since 1958 do NOT explain temperature anomalies?”

    [Response: Is that a “triple negative”?]

    • I dunno… I get four negatives, but maybe there’s a definitional issue.

      Naive question: is it proper to define your null (or “nul,” as TC calls it) however you want to?

      (And, for that matter, why 1958?–though I have a guess…)

    • I have to say, going by the current extraordinary nonsense that he’s spouting at Deltoid, there’s a certain ugly beauty to Curtin’s bizarre mangling of basic science.

      The guy is either a magnificent idiot or a magnificent Poe.