Still Not

Those who can’t bear to believe that the laws of physics govern global temperature still want to maintain that it’s a random walk. They base this on the fact that the ADF (Augmented Dickey-Fuller) test doesn’t reject the presence of a unit root, provided you refuse to use the BIC (Bayesian Information Criterion) for model selection and you’re willing to ignore the Phillips-Perron unit root test.

One of the weaknesses of the ADF test in the presence of a trend is that it assumes the trend is linear. But it isn’t. How might we overcome that problem? There’s more than one way to skin this cat. One is to use the CADF (Covariate-Augmented ADF) test, and supply a covariate to represent the trend. Namely: climate forcing. That gives us a much better picture of the right trend since 1880 than a straight line does, and eliminates that weakness of the ADF test. Fortunately, the “CADFtest” package for R implements the covariate-augmented ADF test.

Another way is to use only the data since 1975, the time period during which the global temperature trend is (at least approximately) linear. Of course, we’ll be trading one weakness for another — it will eliminate the weakness of the ADF test due to nonlinear trend, but will weaken the test simply because there’s less data. The ADF test is already known not to have a lot of statistical power. But, let’s try that anyway.

We’ll start with the data from 1975 to the present. Using annual averages, that gives us a paltry 35 data points. That’s gonna make it awfully hard to reject the presence of a unit root! But when we run the ADF test in its barest form


we get

ADF test

data: x1
ADF(1) = -4.6042, p-value = 0.004328
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected. Resoundingly.

But wait! We might need to allow lots of lags for autocorrelation of the annual increments. Let’s allow up to 5 (a ridiculously high number with only 35 data points)


This gives

ADF test

data: x1
ADF(0) = -4.654, p-value = 0.004469
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected.

But wait! There’s more! I used that nasty nasty BIC (Bayesian Information Criterion) for model selection. How could I be so mean? Let’s use AIC instead:


ADF test

data: x1
ADF(4) = -3.8971, p-value = 0.02525
alternative hypothesis: true delta is less than 0
sample estimates:

This time, instead of using no lags (like BIC did), it allows 4 lags. But the result is the same:

Unit root rejected.

Suppose we try HQC?


ADF test

data: x1
ADF(0) = -4.654, p-value = 0.004469
alternative hypothesis: true delta is less than 0
sample estimates:

HQC agrees with BIC in allowing no lags. But the result is still:

Unit root rejected.

We’ve got 1 more criterion to try: MAIC


ADF test

data: x1
ADF(0) = -4.654, p-value = 0.004469
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected.

We don’t even have to restrict to variants of the ADF test. We can use the Phillips-Perron (PP) test:


Phillips-Perron Unit Root Test

data: x1
Dickey-Fuller = -5.2335, Truncation lag parameter = 3, p-value = 0.01

Unit root rejected.

Using only data since 1975 makes it more difficult to reject the presence of a unit root, even when it should be rejected, because of the paucity of data. Yet we resoundingly reject the presence of a unit root, regardless of how we structure the test, whether or not we allow a lot of lags, whether we do model selection by AIC or BIC or HQC or MAIC. No unit root.

And if we use the Phillips-Perron test? No unit root.

But hold on — maybe there is a way we can make the ADF test fail to reject the unit root. But before we do that, let’s look closely at the structure of the ADF test. It’s a regression of the form:

\Delta y_t = a + bt + \delta y_{t-1} + \epsilon_t + \lambda_1 \Delta y_{t-1} + \lambda_2 \Delta y_{t-2} + ... + \lambda_p \Delta y_{t-p}.

The test is whether or not the coefficient \delta is less than zero. Note that if the time series is a trend plus noise

y_t = a + bt + \epsilon_t,

then the first difference time series will be

\Delta y_t = y_t - y_{t-1} = a + bt - y_{t-1} + \epsilon_t.

Hence for a plain old trend-plus-noise, the coefficient \delta will be -1.
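A quick numerical check of that claim, as a minimal sketch: the series length, slope, noise level and seed are arbitrary illustrative choices, not values taken from the temperature data. It simulates trend-plus-white-noise, removes the intercept and linear trend from both \Delta y_t and y_{t-1}, and regresses one set of residuals on the other. That gives the same \delta as the full regression with no lagged differences.

```python
import random

def detrend(values, times):
    """Residuals of values after an OLS fit of intercept + linear trend."""
    n = len(values)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    stt = sum((t - t_mean) ** 2 for t in times)
    stv = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = stv / stt
    return [v - v_mean - slope * (t - t_mean) for t, v in zip(times, values)]

def estimate_delta(y):
    """Coefficient on y_{t-1} when regressing dy_t on 1, t and y_{t-1}."""
    times = list(range(1, len(y)))
    dy = [y[t] - y[t - 1] for t in times]
    ylag = [y[t - 1] for t in times]
    dy_res = detrend(dy, times)
    ylag_res = detrend(ylag, times)
    num = sum(d * l for d, l in zip(dy_res, ylag_res))
    den = sum(l * l for l in ylag_res)
    return num / den

rng = random.Random(0)                  # arbitrary seed
a, b, sigma, n = 0.0, 0.02, 0.1, 2000   # illustrative values only
y = [a + b * t + rng.gauss(0.0, sigma) for t in range(n)]
delta_hat = estimate_delta(y)
print(f"estimated delta = {delta_hat:.3f}")
```

With a long enough series the estimate should sit close to -1, which is the value the algebra above predicts.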

Now let’s do the ADF test, but instead of allowing such a ridiculously large number of lags as 5 for autocorrelation of the first differences, let’s allow a super-ludicrous-ridiculously large number of lags: 9.


ADF test

data: x1
ADF(6) = -1.5506, p-value = 0.7835
alternative hypothesis: true delta is less than 0
sample estimates:

This time, the test has allowed 6 lags of the first differences in its model selection. And it has failed to reject the null hypothesis.

But note carefully: the sample estimate of \delta is -1.017345. That’s not only negative, it’s more negative than the value it would have if the time series were a trend plus noise!

By allowing so many lagged values of the first differences into the regression equation, we have seriously weakened the ADF test. So much so, that even if the time series were trend-plus-stationary-noise, the ADF test with this many lags allowed would still fail to reject the presence of a unit root.
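Here is a rough way to see that loss of power for yourself, offered as a sketch rather than a reproduction of the analysis above. It simulates series that really are linear trend plus white noise (35 points; the slope and noise level are made up), runs the ADF regression with 0 and with 6 lagged differences, and counts how often the t-statistic on \delta falls below a 5% critical value. The critical value of -3.55 is an approximation read off standard Dickey-Fuller tables for the with-trend case at this sample size; treat it as an assumption.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def adf_t_stat(y, lags):
    """t-statistic on y_{t-1} with intercept, trend and `lags` lagged differences."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]   # dy[t-1] is Delta y_t
    rows, target = [], []
    for t in range(lags + 1, len(y)):
        lagged = [dy[t - 1 - j] for j in range(1, lags + 1)]
        rows.append([1.0, float(t), y[t - 1]] + lagged)
        target.append(dy[t - 1])
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [v - sum(c * x for c, x in zip(beta, r)) for r, v in zip(rows, target)]
    s2 = sum(e * e for e in resid) / (len(rows) - k)
    unit = [1.0 if i == 2 else 0.0 for i in range(k)]
    var_delta = s2 * solve(xtx, unit)[2]   # s2 times the (delta, delta) entry of (X'X)^-1
    return beta[2] / var_delta ** 0.5

CRIT_5PCT = -3.55   # approximate tabulated value, an assumption (see text)

def rejection_rate(lags, n_sims=300, n=35, seed=1):
    """Fraction of simulated trend-plus-noise series where the unit root is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        y = [0.017 * t + rng.gauss(0.0, 0.1) for t in range(n)]   # made-up slope and noise
        if adf_t_stat(y, lags) < CRIT_5PCT:
            hits += 1
    return hits / n_sims

rate_0 = rejection_rate(0)
rate_6 = rejection_rate(6)
print(f"rejection rate with 0 lags: {rate_0:.2f}")
print(f"rejection rate with 6 lags: {rate_6:.2f}")
```

Every simulated series here has no unit root by construction, so any drop in the rejection rate between the two runs comes from the extra lags alone.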

And that’s one of the problems with others’ application of the ADF test. Refuse to accept BIC as a model selection criterion, allow too many lags in your regression, and use only the linear-trend version of the ADF test when the trend is nonlinear, and you can weaken the test so much that it will fail to reject the presence of a unit root.

And of course, you also have to ignore the Phillips-Perron test.

But hey, let’s not totally ignore all that data before 1975 just because the trend is nonlinear! Let’s use the CADF test and supply climate forcing as a covariate. And let’s be very clear about one thing: climate forcing will affect global temperature. Unless, of course, you’re willing to deny the laws of physics.

I’ll use the climate forcing data from NASA GISS. Their climate forcing data only extend from 1880 to 2003, so that’s all the global temperature data we can use with this forcing data. Since we’re using climate forcing to represent the trend, we should not include an additional linear trend in our ADF test, so we’ll select type="none" for the ADF test, allowing no drift and no trend. When we do the simplest form


we get this:

CADF test

data: xx ~ F
CADF(1,0,0) = -3.9521, rho2 = 0.351, p-value = 7.63e-05
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected. Resoundingly.

But hey, maybe we need to allow for more lags in autocorrelation of the annual increments. Let’s allow up to 12 lags (a ridiculously high number), trying this:


It gives:

CADF test

data: xx ~ F
CADF(3,0,0) = -2.6933, rho2 = 0.44, p-value = 0.006444
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected. Resoundingly.

But hey, I used that nasty old “BIC” as my model selection criterion. Let’s try AIC


It gives:

CADF test

data: xx ~ F
CADF(5,0,0) = -2.3539, rho2 = 0.374, p-value = 0.01607
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected.

What if we use HQC?


CADF test

data: xx ~ F
CADF(5,0,0) = -2.3539, rho2 = 0.374, p-value = 0.01607
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected.

OK, give MAIC a shot


CADF test

data: xx ~ F
CADF(5,0,0) = -2.3539, rho2 = 0.374, p-value = 0.01607
alternative hypothesis: true delta is less than 0
sample estimates:

Unit root rejected.


The whole “unit root” idea is nothing but a “throw some complicated-looking math at the wall and see what sticks” attempt to refute global warming. Funny thing is, it doesn’t stick. In fact, it’s an embarrassment to those who continue to cling to it.

But hey — that’s what they do.


83 responses to “Still Not”

  1. Nice demonstration.

    I have a question from ignorance: What do the lags represent – does a lag function as a check on linearity? And (ordinarily) how do users decide on one number of lags vs. another?

    [Response: The lags allow for cases in which the first difference time series shows autocorrelation. By including lagged values in the regression, one compensates for the autocorrelation but seriously weakens the test. The Phillips-Perron test compensates for the autocorrelation without including lagged values in the regression, which is one of its strengths.]

  2. It appears to me that the main non-linear aspect of the trend is the occasional major volcanic eruption. Is there a way to “correct” for the eruptions and then do the AIC tests?

    I was thinking something like this:

    1) run your two-box model with all forcings –> T_all(t)
    2) run your two-box model with all forcings except aerosols –> T_no_volc(t)
    3) calculate Delta_T = T_no_volc – T_all
    4) calculate T_adj = T_GISS + Delta_T

    too simple-minded?
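    Steps 3 and 4 of that proposal are simple enough to sketch. This assumes the two model runs from steps 1 and 2 already exist as annual series aligned with the observations; the function name and the three-year numbers below are made up purely for illustration.

```python
def volcano_adjusted(t_obs, t_model_all, t_model_no_volc):
    """Add the modelled volcanic cooling (no-volcano run minus all-forcings run) back onto observations."""
    if not (len(t_obs) == len(t_model_all) == len(t_model_no_volc)):
        raise ValueError("all three series must have the same length")
    delta_t = [nv - al for nv, al in zip(t_model_no_volc, t_model_all)]   # step 3
    return [obs + d for obs, d in zip(t_obs, delta_t)]                    # step 4

# made-up anomalies (deg C) for three consecutive years; the middle year has an eruption
t_giss = [0.20, 0.05, 0.22]
t_all = [0.21, 0.08, 0.23]
t_no_volc = [0.21, 0.23, 0.24]
t_adj = volcano_adjusted(t_giss, t_all, t_no_volc)
print([round(v, 2) for v in t_adj])
```

    The adjusted series is only as good as the model's volcanic response, so any error in that response carries straight into T_adj.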

  3. “Sir, (a+b^n)/n = x; hence laws of physics do not exist, reply!”

  4. One minor issue; the link to NASA GISS actually goes back to this post.

    [Response: Thanks. Fixed.]

  5. I thought the “random walk” idea was a lemon. But clearly, it’s a cherry instead.

  6. One of the weaknesses of the ADF test in the presence of a trend is that it assumes the trend is linear. But it isn’t

    Oh, excellent. I’ve been pointing out to VS over at Bart’s that the trend 1880-2009 isn’t linear, which is why Bart fit an OLS to 1975-present in the first place.

    But I didn’t realize the non-linearity undermines the ADF test …

  7. Another way is to use only the data since 1975, the time period during which the global temperature trend is (at least approximately) linear. Of course, we’ll be trading one weakness for another — it will eliminate the weakness of the ADF test due to nonlinear trend, but will weaken the test simply because there’s less data. The ADF test is already known not to have a lot of statistical power. But, let’s try that anyway.

    When I asked VS to do so, his response was …

    “why throw out data?” and I repeated the non-linearity vs. linearity bit …

    and he also pointed out that he thought there wasn’t enough data to meaningfully apply the tests.

    I detected hand-waving.

    Looks like my intuition was leading me in the right direction, both in the importance of recognizing the fact that the trend is only (roughly) linear in recent decades, and VS’s smokescreen defense of not looking into it.

    Thanks for doing this, I at least am learning from the exchange. Hopefully this summer I’ll have time to poke around with R and some of the datasets and to learn a bit more so that when my bullshit detector goes off, I’ll have a better understanding of what flavor of bullshitting I’m detecting … :)

  8. OK, VS is at it again.

    Most interesting to me is that he’s totally ignoring your test using CO2 as a covariate, which tells me he wants to avoid doing any analysis informed by physics (since the goal is to prove that physics is wrong, I guess that’s understandable!). Anyway, my bullshit detector is sounding the alarm based on this alone.

    So now he claims that a Zivot-Andrews unit root test allowing for a structural break fails to reject a unit root. Therefore you’re cherry picking and “throwing away data” isn’t justified.

    I know you’re totally bored and annoyed by VS but again, your efforts are helping some of us accelerate our learning process …

    [Response: Like I said, he’s a sore loser.

    By the way, I didn’t use CO2 as a covariate but climate forcing (which includes greenhouse gases and a whole lot more).]

  9. Oh, sorry, I thought you used CO2 forcing alone (I meant to say forcing, not simply “CO2”), I should look at that dataset.

    VS is testing to see if introducing a single structural break point is sufficient.

    This guy uses an R package to explore breakpoints, and suggests that there are actually two.

    We know that early in the 20th century we saw a rise in temps associated, apparently, with increased TSI forcing and abnormally few large volcanic events (IIRC).

    Anyway, whatever … ISTM that VS should justify his assumption that testing for a single breakpoint in the data and resting his case on this is justified by the physics.

  10. CO2 forcing alone wouldn’t make sense …

    OK, you’ve used the net forcing values fed to GISS Model E then.

    Cool trick, since we’re constantly told that all this data’s hidden from other researchers and the general public! :)

  11. Okay, so can you now tell us how to perform a Phillips-Perron test? If I’m reading you correctly you don’t seem to place much faith in ADF.

    [Response: I’ve got no beef with the ADF test, but the PP test seems better. The ADF test is also known to have low statistical power when there’s strong autocorrelation. Mainly it must be borne in mind that failure of the ADF test does NOT demonstrate the existence of a unit root — contrary to the impression some give — it simply fails to reject it. When there’s little data and a nonlinear trend, that happens all too easily.

    The PP test does the same kind of regression as the ADF test, but instead of including extra regressors to account for autocorrelation, it computes a correction factor, not unlike what is done for linear regression (as discussed here).]

  12. Is it possible that there is a reasonable “excluded middle” fallacy that both of you are suffering from? For instance that (based on the data) it could be consistent with a random walk or with AGW? After all we’ve had a limited amount of time and only one parallel Earth. Still given the increase, along with physical arguments, the reasonable Bayesian belief is in AGW (and the tendency to not want to believe in it is wishful thinking)?

    [Response: I don’t see any evidence at all for a random walk. Zero. There’s overwhelming evidence of a trend. Overwhelming. The only explanation that makes sense is what the laws of physics say must have an effect: greenhouse gases. That’s my opinion.]

    • Essentially VS’s entire argument is meant to exclude treating modern warming as fitting a linear model, so it can then be dismissed.

      He’s essentially parroting Beenstock and Reingewertz, but I haven’t found a comprehensive rebuttal of this, either because I’ve not looked hard enough or because the physical science community looked, laughed, and moved on without bothering much with it.

      • B&R isn’t even published…it’s a paper someone put on their website (with a rather hopeful filename, containing “nature”) and which is now waved around in the deniosphere as “the final nail in the coffin of AGW”.

  13. Philippe Chantreau

    Watch out Dhog, there is that trick word again…

  14. I’m still VERY much a newbie to the math and climate science, but have you tried the test for random walks with climate model output? Just a thought, given that we know how they work.

    • Do you mean use a relatively small sample and silly parameters to show that you can “find” a unit root anywhere if you mistreat the data enough?

      That’s a sensible idea, but it would be easier (and more compelling) to do the same test on entirely synthetic data (as Tamino has done before to make a point).

      But Tamino has wasted more than enough time on something that is daft however you look at it. Perhaps someone should suggest to VS that he rerun all his tests on some synthetic data, and see how many false positives he can get? He’s the one pushing this silly idea – so I don’t see why he shouldn’t do the work….

      • I didn’t know that tamino tested if the output from climate models acted like random walks

        my thought is that climate models are getting close enough to reality that they should be used as “dummy data” before yelling how this test or that test shows global warming is a hoax.

  15. Why not use deseasonalized monthly data from 1975? More autocorrelation sure, but also more data.

  16. Tamino, I think the following might be amenable to one of your patented statistical beat-downs.

    If I understand Roy’s new UHI theory, it’s turned around – that small population increases in sparsely populated areas cause more temp gain than actual urban areas experience from a proportionate population increase. So instead of urban heat islands, he’s talking about exurb/rural warming donuts. But the point of UHI is that the areas are small and should be discounted. But this new definition of the places showing “spurious” warming might be a large fraction of the country. I figured someone with real stats knowledge could say if that’s the case, or if there’s something more fundamentally wrong.

    • Wouldn’t one simple counterargument be that HadCRUT/GISS etc all correlate very well with satellite based observations which are unaffected by UHI?

      • Yes! I keep hammering that point home. What do people not get about the fact that satellite trends are comparable to surface trends? Why do they persist with this nonsense?

        Could it be that they know but are trying to confuse the average person?

    • Daniel the Yooper


      As a lay reader of this and other quality climate sites for the past several years, my impression is that Spencer is saying this:

      That the existence of SOME more rural temperature stations in PART of the country IMPLIES that not only MIGHT the GHCN data for PART of the US be SUSPECT, but that it then directly follows that the ENTIRE global set is ALSO SUSPECT.

      Some quotes that the Spencerites are intended to pick up on:

      “…the intensity of the warmest years is actually decreasing over time.”

      “…extrapolation of these results to zero population density might produce little warming at all!”

      “…there has been essentially no warming in the U.S. since the 1970s.”

      “…the GHCN data still has a substantial spurious warming component…”

      “There is a clear need for new, independent analyses of the global temperature data…the raw data, that is. As I have mentioned before, we need independent groups doing new and independent global temperature analyses — not international committees of Nobel laureates passing down opinions on tablets of stone.”

      The fluff quotes above will be picked up by the deniers as the Gospel, which is no doubt what this propaganda piece intends. Needless to say, I’m sure the last 2 words in the post will receive the least scrutiny: “Caveat emptor.”

      Which is the greatest shame of all, for they provide the needed context for the informed reader to keep in mind while digesting the post.

      Thanks for your forbearance,

      Daniel the Yooper

      • Why, yes. Yes. No people equals no warming. Why? No emissions.

        Good luck explaining this to Spencer et al.

      • Didactylos, to be fair, I think Spencer’s point about “extrapolation to zero population density” means “extrapolation to local areas with zero population density, on a planet where people elsewhere are still emitting CO2”.

        Also, Nir, as I mentioned elsewhere on this site recently, at least some “skeptics” cite the fact that models show that the lower troposphere should be warming 1.2X faster than the surface. So if you start by assuming that the UAH LT trend is correct, then the surface trend ought to be even lower … and the fact that the surface trend is slightly higher than UAH LT is “obviously” due to UHI.

        The big problem with this is that there’s no UHI in the ocean. But ocean temps are still rising! So if you assume UAH LT is good, and the SSTs are good, and there has to be a 1.2:1 ratio of LT trend:surface trend, then the land must be cooling.

        Now, even if we assume that all our land observations are wrong and the land is actually cooling, I don’t see how you get a planet where the oceans and lower troposphere are warming but the land is cooling. That planet would have some interesting weather.

    • carrot eater

      Spencer’s treatment is oddly presented, and woefully incomplete.

      It’s interesting that he’s using a different source of raw data. But why is he only using the CRU product for comparison?

      What he should do is go to the original source, the USHCN (not the GHCN). At that point, he should compare his results to the three available USHCN datasets: raw, adjusted only for TOB, and total adjusted. This should then be done for each of his population density classes. Only then could you begin to see what is happening.

      We’ve known for a long time that adjustments are relatively large in the US data set. That Spencer finds that raw data gives a lower trend in the US than the adjusted data should not be surprising. Instead, he acts surprised by this, and seems to imply that it’s only due to the adjusted set having too many UHI-affected stations.

      He ascribes the entire difference in trend between population density classes to UHI. He needs to also look at TOB and equipment changes. Spencer may prefer simplicity, but he should not do so at the risk of missing confounding variables.

    • carrot eater

      One other note on Spencer: in his original work, he looked at the entire land Northern Hemisphere. To his surprise, his raw data trends matched the CRU’s trends exactly.

      That didn’t seem to allow for grand proclamations about the need for new independent bodies looking at temperature records, or starting over from scratch.

      So naturally, he moved on to the US, where we already knew that there is some mismatch between raw and adjusted. He re-discovered this mismatch, and then makes those grand proclamations.

      Whatever happened to the bigger picture in the Northern Hemisphere? Too inconvenient?

    • That one’s not original. They are recycling.

      And, oh yeah, at least in North America and Europe, small towns have shrunk.

      • So has anyone looked at the periodicity of denialist zombie argument attacks?

      • Small towns have shrunk especially in upstate NY, region of my favorite “reality based” institution.

      • Andrew Dodds

        That’s not entirely fair on zombies; technically it is possible to ‘kill’ (deanimate?) zombies; merely a well aimed bullet to the brain, apparently.

        Now, for zombie denialist arguments, [metaphorically] staking them down, drawing, quartering and burning the remains in public can, if performed carefully, keep them down for a few years; although as we see with, for example Irreducible Complexity in evolution, zombie arguments can never be completely eliminated. But merely decapitating a zombie argument with facts and logic has little effect. Zombie arguments rarely interact with reality and hence have little use for sensory equipment.

      • Andrew Dodds,
        Yes, the lack of central nervous system in denialist zombie arguments seems to be an advantage when it comes to longevity.

        Nick, that sounds like an interesting project. Could you get date and attribution for the arguments?

  17. I am not sure if Tamino needs to address this with statistics. How does Dr. Spencer account for all of the OBSERVED effects of global warming with his new ‘simpler’ statistical method? Are the glaciers and ice sheets melting, the seasons changing, the migratory patterns of many species adjusting (etc. etc.) just for shits and giggles?

    This is a fundamental flaw in ANY argument that tries to show that GW (in this particular case, not AGW – just GW) is not happening. Why is it that a tiny fraction of just one species is failing to recognize the obvious?

    • As Carl Sagan liked to say, “Outlandish claims* require extraterrestrial evidence” (or something like that).

      *”essentially no warming in the U.S. since the 1970s”, “most (1.71/1.98 = 86%) of the upward trend in carbon dioxide since CO2 monitoring began at Mauna Loa 50 years ago could indeed be explained as a result of the warming, rather than the other way around.”, etc

      Then again, Sagan was at one of those “reality based” institutions (Cornell), which has not only monitored temperatures continuously since the 1880’s (at Geneva Research Farm Station in NY, finding a warming trend of about 0.7C over the twentieth century) but also recognizes the reality of “biological” indicators of warming in the region: the lengthening of the growing season in New England by about 10 days from 1965 to 2000 (Climate Change And Northeast Agriculture), for example.

      But hey, who needs the birds and the bees when you’ve got UFO…er, UHI’s, right?

      (Horatio actually used to think his poems were goofy, but they ain’t got nothin’ on the claims of some of these folks)

      • Now you mention it…you might be on to something there. The genuine climate scientists are gradually being replaced by pod-people.
        My gosh, who is next? Gavin is already at NASA…………Where’s my tin-foil?

  18. Rattus Norvegicus

    OT, but your comment on McLean et al. has finally been accepted. The funny thing is, it will be published w/o a response from McLean et al. Apparently they couldn’t write a response which could get past review!

    • “Apparently they couldn’t write a response which could get past review!”

      Oh, snap!

    • James Annan says he knows it to be a fact …

      Amusingly, the comment will be published alone, without the customary Reply. Why? Because…McLean et al couldn’t muster a reply that was publishable (and not for want of trying, either – it was simply rejected).

      That’s just great.

      [Response: I too know it to be a fact.]

  19. I’m interested in the discussion on Spencer.

    I looked at the UAH trendline for the 48 states and it showed +0.22C/decade. RSS shows the same trendline. Spencer says the UHI adjusted trendline for the surface is +0.09C/decade. Even if the LT is supposed to be 1.2 times larger, it does seem to suggest a material discrepancy, and that Spencer has overstated the UHI bias.

    I appreciate Carrot Eater’s comments on what kind of data Spencer is using. I was wondering, but wasn’t clear on whether Spencer was using adjusted or raw data. If he is using data that doesn’t adjust for time of observation bias (TOB), then that would be a remarkable flaw.

    I have found this study useful in understanding what adjustments are made by USHCN and GISS to the raw US temperature data.

    Hansen et al. 2001 (2001_Hansen_etal.pdf)

    Between 1973 and 2000, the USHCN adjustments are about +0.17C (see p.18 (B)(i)). Urban adjustments are negligible on USHCN for this time period, but shouldn’t be considered since Spencer’s results include his own urban adjustment (don’t want to double count!). GISS urban adjustments should also be ignored.

    On a decadal trend, this +0.17C has an impact between 0.04C/decade and 0.05C/decade. So instead of +0.09C/decade, it should be +0.13C/decade or +0.14C/decade.

    • carrot eater

      Todd F:

      I could be wrong, but I am fairly certain that Spencer’s data source is raw. I highly doubt there are TOB adjustments in there. At most, there is some quality control for outliers.

      The reason I latched on to TOB is that Peterson (2003) suggests that the apparent differences between rural and urban can be more than just UHI. They can include TOB, differences in instrumentation, proximity to bodies of water, altitude, the environment very close to the site, and so on. Not all of these factors change over time, so they won’t all affect trends, but TOB certainly can.

      [Response: Isn’t Spencer using hourly data, and averaging the 00h/06h/12h/18h data? Wouldn’t that eliminate TOB effects?]

      • carrot eater

        Dear me, you’re absolutely right, Tamino.

        There is no TOB, the way Spencer is doing it. Egg on my face; that was bad on my part.

        That said, I still think he needs to compare his raw data to the USHCN’s raw data. Apples to apples. He’s assuming the difference between CRU adjusted and his raw is due to CRU not being careful in removing UHI, but that’s a terrible assumption, in the US.

    • Tamino, I would think that would eliminate the TOB effects, so that would leave only station adjustments as not accounted for, which according to Hansen’s study, would be negligible over the 1973-2000 time period (probably not more than 0.01C per decade).

      I’ll have to do a bit of research as to why a TOB bias exists, if hourly data is available.

      • carrot eater

        TOB bias exists in the GHCN and USHCN because neither uses hourly data. The GHCN is just working with the monthly averages at each location.

        I don’t know how extensive the Spencer ISH set is. It could possibly provide a direct test-bed for testing some of the TOB adjustments that have been made. I think Vose (2003) did exactly this, but somebody could always do it again.

        But Spencer still needs to un-homogenise the CRU data, in order to compare apples to apples. Which is why he should go straight to the USHCN, which gives you the raw or TOB data needed to do the apples/apples comparison. After that, you can decide whether the homogenisation procedure is a good one, as well as look at the population density dependence of the USHCN.

        I’m not sure if even apples and apples would match in the US. They appear to, in the NH. But in a smaller subset like the US, error due to not homogenising is not thought to be random.

      • Just an aside here, but from what I understand, Dr. Spencer’s research into UHI is a side job that is not funded, so it tends to be very simple.

        Over at wuwt, where it is posted, there have been some very critical comments, and of course a lot of helpful (read very expensive and time consuming) responses also.

        IMO, which is very basic, I still think that using anomalies knocks UHI out of the game.

  20. Clark Lampson

    On the topic of Spencer’s UHI analysis, I suspect there is a problem with his pairs’ relative altitudes. I think he uses a very simple correction for this, but I’d be shocked if there is not a bias to the altitude distribution (lower population density stations will be higher in altitude than nearby higher population density stations). If there is such a bias in his station pairs, then I think one needs to look a little more closely at the simple altitude correction method.

  21. To get back to the subject, Eli went and looked at the GISS data for forcings. First thing you see is that they are VERY smooth. Then you drill down, and as expected many of them do not have annual resolution, others are bare assed estimates and so on.

    The smooth nature of the estimates means that differentiation and differencing are pretty close to equivalent. The non-annual forcing records mean that, well, at least to Eli, that some more stat fiddle is needed even to start.


  22. Not sure how to do replies here – this is intended as a reply to Ray Ladbury’s comment March 17, 2010 at 9:03 pm

    I haven’t yet looked at periodicity of zombie argument attacks, but if a proposal currently under review isn’t funded, I’m intending to put one in to look at the percolation of obviously false talking points through the denialosphere, using web-crawling, text-mining and agent-based modelling techniques!

  23. OT, but I need help from those of you with research grant experience. I have already heard from several scientists but it would be nice to hear from a few more. I originally posted this at RC last week and am now branching out to this blog and others.

    I have a thread on my blog titled Taking the Money for Grant(ed) – Part I that responds to the following two claims:

    1) Scientists are getting rich from research grants!
    2) Scientists holding an anti-AGW viewpoint cannot get funding!

    I used my own recent grant experience to debunk claim #1. In a future post called Part II, I want to show examples of how grant money is spent at other institutions, especially the larger research institutions. Essentially, tell me why you are also not getting rich from your grants. You can comment on my blog or send me a private email.

    My email address is

    You can give me as much or as little detail as you think necessary to dispel claim #1. Before I post Part II, I will send a draft copy to any person whose information is being used and you will have carte blanche to edit what I had planned to post. Nothing will appear in my post that you do not confirm.

    I appreciate all the help you can offer!

  24. > See “Reply” in the time line.

    Ah, it shows up by the timestamp in faint gray type, but it’s only available at the top level comment (you can’t ‘reply to a reply’).

    So it means scrolling back up until you find it if you want to add to a branch instead of to the main topic.

    • Oh, wait, I’m wrong. I can “reply to a reply” but that’s the limit. I can’t “reply to a reply to a reply.”
      (I had to change the display colors a bit just to see the “reply” link appear clearly at all, I’d been wondering how this had changed).

      And at this level I see in a different color underscored “Click here to cancel reply.”

      • Wrong again, I am. Well, dagnabbit, some people’s timestamps have “reply” showing and others don’t. Maybe it allows only _two_ levels of reply to reply? Or I’m going blind ….
        Pardon the experimentation (sigh)

      • No, you’re not going blind.

      • Yes, WordPress has a nesting limit. (A sign of poor design?) By default it is 5, I think. Tamino seems to have it set to 3.

  25. Another way to test this (and, to me, easier to follow than the math) would be with a simulation. For example, you could simulate a Newtonian cooling process with “weather noise” and a stable equilibrium temperature, and see if it can produce 30-year trends on the order of 1.5C/century in, say, 100,000 years. You can try several different Newtonian cooling coefficients, based on a range that makes sense given observations.

    I know, Newtonian cooling is not a random walk, so we’re a priori assuming VS is wrong. But I think the point is, beyond the idea of “random walk”, to test if the accumulation of “weather noise” can produce steep trends over time, despite a physical process that tends to correct the trend.
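
    The simulation proposed above could be sketched roughly as follows. This is only a minimal illustration, not anyone’s actual analysis: the annual time step, the cooling coefficients k, the noise level sigma and all function names are assumptions chosen for the example. In discrete form the process is just an AR(1) series with coefficient 1 - k, and the sketch reports the steepest 30-year trend it produces.

```python
import random


def simulate_cooling(n_years, k, sigma, t_eq=0.0, seed=0):
    """Annual Newtonian cooling toward t_eq plus Gaussian 'weather noise'.

    k and sigma are illustrative assumptions, not fitted to observations.
    """
    rng = random.Random(seed)
    temps = [t_eq]
    for _ in range(n_years - 1):
        prev = temps[-1]
        # Relax toward equilibrium, then add this year's random kick.
        temps.append(prev - k * (prev - t_eq) + rng.gauss(0.0, sigma))
    return temps


def ols_slope(y):
    """Least-squares slope of y against the time index 0, 1, ..., len(y)-1."""
    n = len(y)
    x_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def max_window_trend(temps, window=30):
    """Largest absolute trend (deg C per century) over all sliding windows."""
    return max(
        abs(ols_slope(temps[i:i + window])) * 100.0
        for i in range(len(temps) - window + 1)
    )


if __name__ == "__main__":
    # Try a few hypothetical cooling coefficients (per year).
    for k in (0.1, 0.3, 0.5):
        temps = simulate_cooling(n_years=10_000, k=k, sigma=0.1)
        print(k, round(max_window_trend(temps), 2))
```

    Raising n_years toward 100,000 and comparing the printed maxima with 1.5C/century would give a first impression of how often noise alone can mimic the observed trend for a given k.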

  26. One thing I don’t get about this is, even if it were some kind of random walk, then so what? Why would any kind of statistical result suspend physics?

    Let’s suppose that the height of the roof of my house exhibited some kind of fluctuation which VS could characterise as a random walk in the same sense as he attempts for climate (and why not? like most things in life, the position of your roof even depends on the climate, as a colleague of mine found out when a hurricane hit his house in the 80s).

    Such a result would hardly mean that if I take a sledgehammer to the supporting walls the roof could be expected to levitate above my head.

  27. I posted this on Bart’s blog

    The statistical analysis performed by econometricians leads to two major statements. First, several tests suggest that the global surface temperature time series has an integration order I(1). Secondly, several anthropogenic radiative forcings (ARFs) have an integration order I(2). I think that these findings do not necessarily contradict the current understanding of AGW.

    1. I(1) for global surface temperature
    This finding does not mean that global T is a pure unbounded random walk; e.g. a deterministic linear trend also has integration order I(1). In fact this would actually fit with a linearly increasing forcing, e.g. log(CO2) in the last 50 years. Because the global T dataset is relatively short and is dominated by an increasing trend, it is more likely to find a unit root; if a longer time series were used, the integration order would be I(0). The temperature over longer time scales is bounded and therefore stationary (temperature cannot run away because of energy conservation).

    2. The integration order of global T (I(0) or I(1)) is not the same as the integration order of the ARFs (I(2)), therefore global T cannot be determined by ARFs.

    At first sight this statement seems valid; however, the variability in global T is not only determined by anthropogenic forcings but also by natural variability like ENSO, volcanic eruptions, solar variation etc. If variability in global T (first order difference) were purely determined by ARFs, the integration orders should be the same.

    To investigate the above statement I obtained and normalized the time series for ENSO and ARFs
    sum of anthropogenic forcings :

    Using these two datasets I have created artificial temperature series using T = (1-f)*E + f*F, where I varied the relative contributions of ENSO and ARFs from pure ENSO (f=0) to pure ARF (f=1) and checked the integration order for each T-series using the matlab ADFtest allowing up to 2 lags. I also obtained and normalized the GISS global T dataset and for comparison plotted them together with the artificial T-series. Using 2 lags I also obtain an integration order I(1) for the GISS time series (with no lags I obtained I(0)).

    The results are plotted in the figure linked below:

    The results show that for f=0-0.5 the integration order is I(0), for f=0.6-0.9 the integration order is I(1), and only for f=1.0 (pure ARF) is the integration order I(2). This clearly demonstrates that mixed time series will give a mixed integration order. Moreover, adding only a little bit of noise to ARF already lowers the integration order from I(2) to I(1) (note the remark by Eli Rabett). Hence the conclusions by Beenstock & Reingewertz are premature and are not supported by a more detailed analysis of the different sources of variability.

    Furthermore I’d like to note that on the time scale of decades CO2 should show at least an integration order I(1), because humans are adding CO2 incrementally to the atmosphere. This notion is consistent with global T being close to I(1), given that I(1) is not a reason at all to assume global T is a pure non-deterministic random walk.
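
    The mixing experiment described above can be imitated with synthetic data. What follows is a rough sketch under stated assumptions, not the MATLAB analysis itself: E is plain white noise standing in for ENSO, F is a doubly integrated noise series standing in for the I(2) forcing, the test is a bare Dickey-Fuller regression with a constant and no lagged differences, and -2.86 is the usual asymptotic 5% critical value for that case. All function names are made up for the example.

```python
import math
import random


def normalize(y):
    """Shift and scale a series to zero mean and unit standard deviation."""
    m = sum(y) / len(y)
    sd = math.sqrt(sum((v - m) ** 2 for v in y) / len(y))
    return [(v - m) / sd for v in y]


def cumsum(y):
    """Running sum; integrating a series once raises its order by one."""
    out, total = [], 0.0
    for v in y:
        total += v
        out.append(total)
    return out


def diff(y):
    """First difference of a series."""
    return [b - a for a, b in zip(y, y[1:])]


def df_stat(y):
    """t-statistic on rho in: dy(t) = c + rho * y(t-1) + e(t)."""
    dy, x = diff(y), y[:-1]
    n = len(dy)
    xm, dm = sum(x) / n, sum(dy) / n
    sxx = sum((a - xm) ** 2 for a in x)
    sxy = sum((a - xm) * (d - dm) for a, d in zip(x, dy))
    rho = sxy / sxx
    resid = [(d - dm) - rho * (a - xm) for a, d in zip(x, dy)]
    s2 = sum(r * r for r in resid) / (n - 2)
    return rho / math.sqrt(s2 / sxx)


def integration_order(y, crit=-2.86, max_d=3):
    """Differences needed before the DF test rejects a unit root (None if never)."""
    for d in range(max_d + 1):
        if df_stat(y) < crit:
            return d
        y = diff(y)
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    n = 130  # roughly the length of the annual instrumental record
    E = normalize([rng.gauss(0, 1) for _ in range(n)])  # stand-in for ENSO
    F = normalize(cumsum(cumsum([rng.gauss(0, 1) for _ in range(n)])))  # I(2) stand-in
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        T = [(1 - f) * e + f * x for e, x in zip(E, F)]
        print(f, integration_order(T))
```

    The exact orders printed depend on the seed and on the missing lag terms, so they should not be expected to match the f ranges quoted above; the point is only that the apparent order is a property of the mixture and of the test, not of the forcing alone.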

    • MP,
      That looks like a nice analysis. Any chance you can publish it in an econometric journal and educate some economists?

    • MP, very nice, thanks – looking over a range of fs seems crucial and your analysis really brought some of the discussion over at Bart’s blog into focus for me.

      Could you post a link to your comment at Bart’s? Somehow I can’t find it…

    • Gavin's Pussycat

      MP, simple, beautiful and convincing. I don’t know these techniques at all, but apparently you not only do but you have a functioning intuition on them.

      Get this out somehow!

      • I agree, it really should be elevated to the level of a post – perhaps Bart V would be interested?

  28. Wow, really impressive, must admit I’m totally lost.
    Congratulations, well done!

  29. Igor Samoylenko

    Zorita has joined the discussion at Bart’s blog starting here. A few of his latest comments are posted under “Anonymous” user and are particularly illuminating: see for example this one and this one.

  30. Still trying to wrap my head around this unit root issue.

    A unit root doesn’t necessarily mean it’s random; VS has long since backpedaled from claiming it’s random to claiming the temp series contain a unit root, but are not random (i.e. there *is* a deterministic trend).

    In a comment, Alex wrote:
    “This is because the unit root is in the deterministic part of the equation and not in the random part.”
    So a unit root means that the deterministic part of the timeseries has a dependency on past values (rather than that the random part/variability has such a dependency)?

    That sounds similar to the effect of a positive feedback:

    If the climate forcing (i.e. the driving factor for the deterministic part of the timeseries) goes up, the temperatures would go up, which, in the case of a dependency on past values would cause subsequent temperatures to also go up more than they would otherwise have. (Of course, this would work in both directions; ‘up’ could be replaced by ‘down’).

    And another stats clarification question: I assume that the ‘lag’ refers to the number of timesteps that the dependency holds? (i.e. in the random walk equation Y(t) = Y(t-2) + E, the lag is 2?) Depending on the lag time and the underlying deterministic trend, it could also cause cyclical behavior.

    But such a mathematical positive feedback in the timeseries would have to have a physical basis: A physical positive feedback. Otherwise energy balance considerations would dictate that the forcing reverses as a result.

    So just thinking out loud here, the presence of a unit root seems consistent with positive feedbacks in the climate system. Though that really depends on whether the deterministic trend already captures those feedbacks: If it does, there shouldn’t be such a dependence (mathematical positive feedback) left in the deterministic part of the timeseries. So perhaps if one finds a unit root, it signifies that the deterministic trend used to test for its presence is missing some (positive feedback) mechanism/contribution.

    More in general, if the unit root is a characteristic of the deterministic part of the timeseries, it makes it all the more important to get that deterministic part right: Use the net climate forcings and all known causes of internal variability (ENSO, etc). Since the forcings don’t translate into temperature directly, another option would be to use the modeled response to forcing instead. That may have other issues though, since models create their own weather related random variability (although I don’t quite understand how?)

    MP and Tamino in this post have taken a stab at using net forcing (which still omits known sources of variability such as ENSO). VS has contested their results, but AFAIK hasn’t used the net forcing or model output himself in his tests, whereas the choice of deterministic trend is clearly important for the test results. That, plus other stats choices to be made which are only marginally clear to me, means that there is more ambiguity about unit roots than VS’s latest comment appears to allow for.

    (part crossposted at
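
    On the “dependency on past values” question, a small sketch may help. It assumes the simplest possible model, Y(t) = phi*Y(t-1) + E(t), and the function name is made up for the example. Stationary series (phi below 1) also depend on their past; what sets the unit root case (phi = 1) apart is that the effect of a single shock never fades. The ‘lag’ setting in the ADF test is a separate matter: it is the number of lagged differences added to the test regression to absorb autocorrelation.

```python
def shock_response(phi, horizon):
    """Effect of a one-off unit shock at t=0 on later values of
    Y(t) = phi * Y(t-1) + E(t), with all other shocks set to zero."""
    response = [1.0]
    for _ in range(horizon):
        response.append(phi * response[-1])
    return response


if __name__ == "__main__":
    # phi < 1: the shock decays; phi = 1 (unit root): it persists forever.
    for phi in (0.5, 0.9, 1.0):
        print(phi, [round(v, 3) for v in shock_response(phi, 10)[::5]])
```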

    • Good grief, VS has really gone off the deep end. I particularly like his argument that there’s nothing unusual about 2008, if you ignore all the other datapoints leading up to it.

      In the meantime, the annual global mean temps simulated by GISS ModelE contain a unit root, according to the same tests that VS has been using.

      • That whole thread’s gone off the deep end, mostly taken over by some very familiar denialist names.

        Oh, well, it’s Bart’s blog …

      • Yes, and Willis Eschenbach sucked me in with this comment:

        We have seen no change in the rate of sea level rise (in fact the rate of rise has slowed lately), we have seen no anomalous warming, we have seen no change in global droughts, we have seen no change in global sea ice (not Arctic, global), we have seen no change in global precipitation, temperatures have been flat for fifteen years … so why do you believe that CO2 is causing changes in the climate?

        Does he really believe this despite the mountain of data that shows otherwise? Or is he trying to mislead?

      • I think Willis and Curtin are just dishonest as hell.

  31. MartinM,

    Where is it described that the GISS modelE output contains a unit root? Probably seen it and missed it in the storm.

  32. Yeah, Willis came with some utter nonsense. I’ve so far decided to let all comments through, unless it was mere personal slander. I’m reconsidering the absence of any policy though.

  33. Finally some figures of what VS is getting at:

    It’s a total strawman comparison, of a linear trend (based on 1880-1935) plus noise (top fig) and of a stochastic trend (based on the same time period), to see how well they fare until 2008. The former of course fails, because the forcings didn’t remain the same as they were in the period 1880-1935, whereas the latter doesn’t seem to have any skill (not sure if that’s the right word though).

    His exposition of the graphs is here:

  34. DeWitt Payne

    I’ve done some work with synthetic data constructed to be I(0), I(1) and I(2) with no linear trend and then added white noise to see how that affected the unit root tests:

    The PP test is the most susceptible to added noise while the ADF and KPSS tests are less susceptible, but not immune. When noise is added to the I(2) series at a similar level to the high frequency variability in the temperature record, all tests reject the presence of a second unit root in the data. Any comments would be appreciated either here or there.

  35. DeWitt Payne

    I’m having trouble with your code for running the CADF test. It’s the ‘xx~F’ as the input vector. If I try to print or plot xx~F, where xx is the vector of temperature from 1880-2003 and F is the net forcing vector, it tells me that it’s a formula, not a numeric data vector. Did you mean to do a fit and test the fitted values or the residuals of the fit?

  36. DeWitt Payne

    If I use yy as the input vector where:


    Then I get results similar to yours (unit root rejected at greater than 99.8% confidence) except the test is either ADF(1) or ADF(0) not CADF(1,0,0), CADF(3,0,0) or CADF(5,0,0)

  37. DeWitt Payne

    More thinking, less posting.

    The problem is that the CADF test requires that the covariates are stationary according to this paper:

    But while F does not have a unit root according to the ADF test, it isn’t stationary either according to KPSS so I’m not sure that it can be used as a covariate in the CADF test.
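
    For readers who have not met KPSS: it reverses the roles of the hypotheses, taking stationarity as the null, which is how a series can fail to show a unit root under ADF and still be flagged as non-stationary. Below is a minimal sketch of the level-stationarity version only, with a Bartlett-weighted long-run variance; the function name and the lag choice are assumptions for illustration, and about 0.463 is the commonly tabulated 5% critical value for this version. A series with a trend, such as the forcing, would more properly call for the trend-stationary variant, which is not shown here.

```python
import random


def kpss_level_stat(y, lags):
    """KPSS statistic for the null of stationarity around a constant mean."""
    n = len(y)
    mean = sum(y) / n
    e = [v - mean for v in y]
    # Sum of squared partial sums of the demeaned series.
    partial, total = 0.0, 0.0
    for r in e:
        partial += r
        total += partial * partial
    # Long-run variance: lag-0 variance plus Bartlett-weighted autocovariances.
    lrv = sum(r * r for r in e) / n
    for k in range(1, lags + 1):
        gamma = sum(e[t] * e[t - k] for t in range(k, n)) / n
        lrv += 2.0 * (1.0 - k / (lags + 1.0)) * gamma
    return total / (n * n * lrv)


if __name__ == "__main__":
    rng = random.Random(2)
    noise = [rng.gauss(0, 1) for _ in range(124)]
    walk, level = [], 0.0
    for v in noise:
        level += v
        walk.append(level)
    # Large values (above roughly 0.463) count against stationarity at the 5% level.
    print("white noise:", round(kpss_level_stat(noise, lags=4), 3))
    print("random walk:", round(kpss_level_stat(walk, lags=4), 3))
```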