# A pause or not a pause, that is the question.

UPDATE: A new post at RealClimate is very relevant, and well worth the read.

One day, a new data set is released. The rumor runs rampant that it’s annual average global temperature since 1980.

Climate scientist “A” states that there is clearly a warming trend (shown by the red line), at an average rate of about 0.0139 deg.C/yr. She even computes the uncertainty in that trend estimate (using fancy statistics), and uses that to compute what’s called a “95% confidence interval” for the trend: the range within which we expect, with 95% confidence, the true warming rate to lie. It can be thought of as the “plausible range” for the warming rate. Since 95% confidence is the de facto standard in statistics (not universal, but by far the most common), nobody can fault her for that choice. The confidence interval is from 0.0098 to 0.0159 deg.C/yr. She also adds that there’s no sign of any slowdown in the rate of warming.

Along comes a reporter for a London newspaper who proclaims, “Global warming has stopped! Call it a ‘pause’ or ‘hiatus’ or whatever you like, but the proof is in the fact that if we just look at the data since 2005 we find that the temperature in the most recent year (2014) is actually lower than it was in 2005!!!”

Climate scientist A and most of her fellow climate scientists (97% of them, in fact) declare that to be nonsense. You see, you can’t draw conclusions about a trend based on such a short time frame. Besides, the reporter didn’t even compute a trend, he just drew a line from one data point to another and drew a grandiose — and mistaken — conclusion.

Meanwhile, a popular website which focuses on denying global warming announces “Bombshell! Global Warming has stopped!! Hiatus lasts at least 10 years!!!” A U.S. senator from Oklahoma hails it as proof that the entire global warming thing is a hoax, a fraud perpetrated by Hollywood liberal elitists to bring about socialism based on world government. Besides, the whole global warming thing is just a fraud so scientists can get more money for their “research”.

Numerous U.S. politicians say, “I’m not a scientist, but the pause in global warming shows that doing anything about it will only ruin the economy … especially since China and India won’t do anything.”

Along comes econometrician “B” who states that based on rigorous statistical analysis, the “pause” (or “hiatus” if you prefer) in warming hasn’t lasted 10 years, it’s lasted 12 years! Here’s his argument: test each possible start year from 1990 through 2010, use just the data from then on to estimate the trend (the warming rate), and estimate the uncertainty in the trend. Then compute a 95% confidence interval. Find out whether there’s some start year for which the plausible trend range includes zero. Here’s a graph of the result of applying that test; the dots (one for each choice of start year) show the estimated trend from that moment to the end, while bars extending above and below mark the 95% confidence interval. If any of the bars includes zero in the “plausible range” …

Note that starting in 2003 the “plausible range” includes zero. That means that you can’t prove it was warming using only the data since 2003 (“prove” meaning meet the standard of statistical significance). The same is true for every start year from 2003 to the final year tested. Voila! Pause since year 2003, has lasted 12 years. Heck, he even publishes his analysis in an obscure scientific journal.

The popular website lauds his efforts, pointing out that the result has actually been published in a scientific journal, and besides, econometrician B has a Ph.D. degree.

What do you think?

Perhaps you’re thinking, “Just because you can’t prove it’s been warming since 2003, that doesn’t prove it hasn’t been warming. From the look of the graph, the warming rate could plausibly be the same as it ever was. You certainly can’t prove there’s a pause at all, let alone claim that it has lasted 12 years.”

Enter statistician “C”, who announces, “Actually, those aren’t temperature data. They’re artificial data which I created with a known warming trend of 0.012 deg.C/yr plus random noise (which in this case is the simple kind known as ‘white noise’). The trend didn’t change; I’m sure of that because the data were constructed with a constant, unchanging trend.”
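For readers who want to try this themselves, here is a minimal sketch of the start-year test applied to artificial data of the kind statistician C describes. It is not the code behind the graphs in this post. The noise level (`NOISE_SD`), the random seed, and the function names `ols_trend` and `start_year_scan` are assumptions made for illustration; only the 0.012 deg.C/yr trend, the 1980 to 2014 span, and the 1990 to 2010 start years come from the story above. It also uses a plain normal approximation (1.96 standard errors) for the 95% interval, which is somewhat too narrow for the shortest spans.

```python
import math
import random


def ols_trend(t, y):
    """Least-squares slope and its standard error, assuming white-noise residuals."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Residual sum of squares, then the usual standard error of the slope.
    rss = sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def start_year_scan(years, temps, first_start, last_start, z=1.96):
    """Fit the trend from each start year to the end of the data.

    Returns rows of (start, slope, lower, upper), where lower/upper are
    slope -/+ z standard errors (normal approximation to a 95% interval).
    """
    rows = []
    for start in range(first_start, last_start + 1):
        t = [yr for yr in years if yr >= start]
        y = [temp for yr, temp in zip(years, temps) if yr >= start]
        slope, se = ols_trend(t, y)
        rows.append((start, slope, slope - z * se, slope + z * se))
    return rows


# Artificial "temperatures": a constant trend plus white noise.
rng = random.Random(0)   # fixed seed so the run is repeatable (assumption)
TRUE_TREND = 0.012       # deg.C/yr, the value given in the post
NOISE_SD = 0.1           # deg.C, an assumed noise level
years = list(range(1980, 2015))
temps = [TRUE_TREND * (yr - 1980) + rng.gauss(0.0, NOISE_SD) for yr in years]

for start, slope, lower, upper in start_year_scan(years, temps, 1990, 2010):
    note = "includes zero" if lower <= 0.0 <= upper else ""
    print(f"{start}: {slope:+.4f}  [{lower:+.4f}, {upper:+.4f}]  {note}")
```

Changing the seed changes which start years happen to have intervals that include zero, but the underlying trend in the data never changes, because it is fixed by construction.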

Now what do you think?

Perhaps you’re thinking, “Nothing even remotely like that could ever really happen.” I wish that were true.

Ross McKitrick recently published a paper in the rather obscure scientific journal Open Journal of Statistics in which he purports to show that the “pause” in the data for lower-troposphere temperature (TLT) from Remote Sensing Systems (RSS) has lasted nearly 26 years. His procedure was just as described for econometrician B, with some added complexity. For instance, he emphasizes that when computing uncertainty in a trend estimate you have to take into account something called autocorrelation, which means the noise isn’t the simple type called “white noise.” He also emphasizes (more than once) that the mathematical model for the noise which is most often used, a model called “AR1 noise,” is insufficient. Perhaps he isn’t aware that I’ve already pointed this out (see Foster & Rahmstorf).

For the RSS-TLT data set, he tries all start years from 1979 through 2009, applies his autocorrelation correction, and shows this graph:

I did the same thing using my method to correct for autocorrelation and got this (note that my units are deg.C/yr while McKitrick’s are deg.C/decade, and that I only tested start years as late as 2005 while McKitrick goes all the way to 2009):

The results are very similar, although McKitrick did make a mistake in his estimate of the uncertainty. Nonetheless, in my analysis the lower bound of the 95% confidence interval dips below zero in 1990, while in McKitrick’s it does so in 1989. That’s how he concludes that the “pause” is going on 26 years.

So, you wonder, would I protest that the “pause” in RSS-TLT has only lasted 25 years instead of 26? No.

Let’s add a dashed red line showing the estimated trend since the beginning of the data:

Every one of the 95% confidence intervals, for every choice of start year, includes the first trend estimate. None of them provides statistically valid evidence that the warming rate since any of the start years is any different than it has been all along. Bottom line: from this analysis, there’s no valid evidence that the warming rate (the trend, not the fluctuations) has changed at all.
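The comparison can be written down as a small check: for a given trend estimate and its standard error, ask whether the 95% interval contains zero and, separately, whether it contains the long-term trend. This is only a sketch. The function name `interval_contains` and the numbers in the example are made up for illustration (they are not values from the RSS analysis), and the interval again uses the normal approximation of 1.96 standard errors.

```python
def interval_contains(slope, se, reference_trend, z=1.96):
    """Report whether the interval slope -/+ z*se contains zero and the reference trend.

    Returns a pair (zero_inside, reference_inside).
    """
    lower = slope - z * se
    upper = slope + z * se
    zero_inside = lower <= 0.0 <= upper
    reference_inside = lower <= reference_trend <= upper
    return zero_inside, reference_inside


# Hypothetical numbers, chosen only to illustrate the logic:
# a short-span trend of 0.005 deg.C/yr with standard error 0.006,
# compared against a long-term trend of 0.012 deg.C/yr.
zero_inside, reference_inside = interval_contains(0.005, 0.006, 0.012)
print("zero inside interval:           ", zero_inside)
print("long-term trend inside interval:", reference_inside)
```

When both answers are "yes", the data since that start year cannot distinguish "no warming" from "warming at the same rate as before", so they are not evidence of a pause.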

Not only is there no valid evidence that the “pause” has lasted 25 years or longer, there’s not even any valid evidence that a “pause” happened at all. Heck, there’s not even any valid evidence that there’s been a slowdown — let alone a “pause” — in this data set.

What do you think?

### 79 responses to “A pause or not a pause, that is the question.”

1. That stats journal lists their metrics as almost 1:1 ratio of article views to downloads for articles shown on the homepage. That seems insanely high and highly unlikely. Also very uniform view/download numbers for all the articles on the homepage. Compared to other journals that list metrics, this seems to be an extreme outlier. I wonder at the statistical likelihood of these metrics being real, or their distribution being normal.

2. I think McKitrick really knows what result he wants to get.

I think your demonstration is rather elegant.

And I’m beginning to think there might actually be something going on with the RSS algorithm. It’s become quite the outlier–so much so that it’s become the favorite for ‘it’s not warming’ cherry picks lately. (Even if deniers are reluctant to openly jilt their erstwhile favorite, UAH.)

Interesting to compare RSS with UAH via Mark I Eyeball, but inconclusive, IMO:

3. Dave X

Heh – Absence of evidence is not evidence of absence? Zero trend may be credible censoring the data before 1990, but similarly, so is +0.02C/century.

I like looking at those later bars sort of as in http://andrewgelman.com/2014/11/17/power-06-looks-like-get-used/ Drawing conclusions based on stats from short series, there’s a very good chance of getting the sign wrong or underestimating significantly.

4. Tamino, I’ve been fiddling around with implementing a JavaScript webpage that performs the demonstration you do here: https://tamino.wordpress.com/2014/01/30/global-temperature-the-post-1998-surprise/ , and the standard deviation limits aren’t behaving as I expect. In particular, they’re not expanding with shorter periods for the least-squares fit.

I’m not correcting for autocorrelation, and haven’t done statistics since uni. I suspect I’ve got the calculation of standard deviation of the least-squares fit wrong. The function that performs that calculation is here: https://code.google.com/p/gw-trend-finder/source/browse/page.html#189 . Would someone be kind enough to tell me what I’ve buggered up?

5. Jiminy

What do *I* think?
I think that the term hiatus is unfortunate at best.
I think that the number of papers written on the subject by people I trust makes me uncomfortable, given the number of different opinions tendered. Is there no change in trend? A probable change attributable to land based boreal winter cooling? Or perhaps to recent southern pacific trends? Or is it more significant that extremes continue evermore extreme, and unabated?

What do *I* think?
I think something’s going on, and it’s likely not good.
You have shown that the so-called hiatus consists as much as anything of a long run of temperatures after 1998 which were above the mean from the prior trend, although I guess the trend after may be lower.

What do *I* think?
I see the coarse scale fluctuations in mean annual temperatures present in the IPO/PDO and wonder if the hiatus was just the PDO cool phase failing to actually cool – and if so I’d say the news is not good.
All that a downturn in mean global temperature rates of increase would mean in any event is that the Earth is shedding heat a little slower. There are two reasons for that, it seems to me. Either inbound radiation is reduced (satellites vehemently contradict this) or inbound is constant with reduced outbound, and the difference is therefore waiting to get us (and there is that pesky 1W/sq m radiative deficit that sort of supports this).

What do *I* think?
I think that in a year or two the dog whistlers will be whistling different dogs.
I think this year is brewing bad – and hot – and since the PDO has likely switched to warm phase – I think an anti-hiatus from now on is exactly as likely as the hiatus to date was real.

What do *I* think?
I think that to point at RSS Lower Troposphere Temperatures as a justification for inaction is insane. Hiatus or not – something’s broken.

6. Actually, IIRC, Spencer had some issues with the use of RSS data vs UAH , and I haven’t seen that the problem has been corrected. Not looking too hard but I have faith in McKitrick’s willingness to ignore the problems even more completely than I might.

One might wish to factor that in as well.

7. Marco

What I think?
I think you should reference Richard Telford’s blog, on which Dikran Marsupial points out another potential statistical error (albeit one of the philosophical kind) in McKitrick’s analysis:
http://quantpalaeo.wordpress.com/2014/09/03/recipe-for-a-hiatus/
Would love to hear your opinion on that one, though.

8. Along comes skeptic D and shouts “Null Hypothesis! It’s NULL hypothesis because you MUST assume a NULL trend and test for that!!!11!!!!!”, smiling smugly, while thinking that he dealt the deadly blow to end all arguments.

P.S.: Good to see you back again, Tamino.

9. dikranmarsupial

All McKitrick needed to do to see his error is to plot the 26 year “hiatus” trend and the trend for the full dataset on the same axes to see how absurd it would be to claim there has been a 26 year hiatus:

It is pretty obvious that the trend over the last 26 years has been the same as the trend over the last 35. Some “hiatus”!

The real problem is that a failure to reject the null hypothesis does not mean the null hypothesis is probably true.

• PJKar

dikran,

That was a remarkable set of exchanges in there between Ross, yourself and the other commenters. Regrettably, he never did get around to addressing your points or for that matter related points by Richard Telford although he was asked to a couple of times. I think your comments strike at the foundation of McKitrick’s method though. Being based on the failure to reject the null it seems he has produced nothing more than the maximum time interval over which there is insufficient data to determine a statistically significant trend. As such that interval cannot be construed as a so called pause or hiatus. Or so it seems to me.

10. Lars Karlsson

From McKitrick’s paper:

“I propose a robust definition for the length of the pause in the warming trend over the closing subsample of surface and lower tropospheric data sets. The length term J_MAX is defined as the maximum duration J for which a valid (HAC-robust) trend confidence interval contains zero for every subsample beginning at J and ending at T-m where m is the shortest duration of interest. ”

That is a very weird definition of a pause.

• Dave X

That definition is of when you don’t have enough data to open your mouth.

11. Put another way, I guess it’s all about the choice of appropriate null hypothesis to test. Testing the null of non-zero trend is less interesting, informative and scientifically or societally useful than testing the null of a change in trend. Sadly, the non-zero null hypothesis is easier to communicate to the public and more headline grabbing.

12. Good post. I agree. It is interesting (for “interesting” maybe read “frustrating”) that many of the self-professed statistical experts seem to think that “we can’t rule out that it hasn’t warmed in the last decade” is the same as “it hasn’t warmed in the last decade”. Similarly – as you illustrate very nicely – you’d think they’d also be able to recognise that we also can’t rule out that the trend over the last decade has been the same as the long-term trend. This is such simple stuff, it makes you wonder why they seem so reluctant to accept it.

13. What do I think?

I think that this goes a long way to showing how, if you get in there first — especially with a lot of numbers and graphs — that you can convince anyone that black is white. And, having been convinced, it can take a lot of persuading to make folk admit that they’ve been fooled (and that, in fact, white is, err, black…). I tried much the same thing a few years back, but your offering is more on point.

What I wonder is…
… whether the so-called ‘pause’ will ever stop. The way in which the confidence interval bars extend as time goes on makes me suspect that the answer to this is ‘no’ — which would mean that this foolishness is set to continue forever, even when we’re burning up.

[Response: As we gather more data, the confidence intervals for start times we’ve already examined will narrow — but of course the confidence intervals for later start times will be very very wide.]

14. I like the post. Two things surprise me:

The uncertainty is much bigger than I assumed.

There was a comparable discussion about Hurricane trends. The question was for how long do we have to collect data to identify a valid trend. The answer was (if I remember right) 160 years. The reason for the quite long time was the big variations from year to year. Then why is there so much fight about average global temperature trend validity if you can calculate the length of this period?

[Response: One of the problems is that most scientists aren’t aware that the AR1 noise model is insufficient (McKitrick was right about that). Another is that tests like his are a subtle (and almost always unintentional) form of “cherry-picking.” See this.]

15. dikranmarsupial

Following from Peter Thorne’s comment, your choice of null hypothesis depends on the argument you are trying to make, in that it is the hypothesis that you need to be able to nullify in order to proceed with promulgating your theory. It is supposed to be a hurdle that prevents us from getting carried away with our enthusiasm for our theory.

If you are trying to assert that there has been a hiatus, your H0 should be that warming has continued at the same rate as before. For some reason skeptics never seem to want to do that.

Hypothesis tests are not symmetric, if you swap H0 and H1, quite often you can’t reject either of them, which implies that there isn’t enough data yet.

16. Yvan Dutil

Actually, to call a pause or hiatus, you should select a statistical significant change in the slope.

17. edaviesmeuk

Also, even if there was a statistically significant pause in the surface temperature rise that would not explain why the world wasn’t warming. The fact that we’ve very strong indications that the oceans are warming leaves the statistical significance of any variation in the surface temperature rate insignificant in the non-statistical sense.

In other words, the PDO or whatever could pull the surface temperatures out of the range of variation from the trend line over the preceding 30 years without telling us anything interesting about the overall heat gain of the planet.

18. Sekerob

With the world as a whole for 2014 being on track, if not de facto already, the warmest year on record. :(

19. Tor B

I think the same ‘statistics’ that say “we might be experiencing a pause” say we might be going to H… in a hand basket more quickly than the most rabid ‘warmist’ ever preached. I once looked into the Nile (twice into denial) but didn’t stick my toe in.

20. Marlowe Johnson

I don’t think there’s any real mystery here. McKitrick was likely asked to write an editorial about the pause by the usual purveyors of FUD. The paper is merely window dressing to give the meme a veneer of credibility for motivated rubes.

21. Reblogged this on mt's Science Blog.

22. PJKar

This is a great post and a timely one as well given the attention the so called hiatus has received and the misinformation it has generated.

So if I have this straight, this is the same Ross McKitrick who years ago co-authored a paper titled “Does Global Temperature Exist?” where they basically try to argue that arithmetic mean temperature is a useless statistic for estimating global mean temperature. It was discussed at Real Climate here (I think it was brought up in this forum at some point too):

http://www.realclimate.org/index.php/archives/2007/03/does-a-global-temperature-exist/

Now here we find him computing confidence intervals for what he at one time perceived to be non-existent quantities??

It probably doesn’t matter but curiosity compels me to wonder if he ever gave up on his previous notion.

• I’ve noticed that self-consistency is not a prominent virtue among denialati.

23. Random

It always puzzles me how elaborate the discussion on the ‘hiatus’ can turn, when you look at the wrong metrics. I’m damn sure that the ‘hiatus’ is just a symptom of the fact that the *global* mean does not make for a good metric in a situation where three of four values go up and one goes down.

Just look at figure 6 on page 4 in Jim Hansen’s paper from early 2013:

20140121_Temperature2013.pdf

Can anybody spot a ‘hiatus’? No? I certainly can’t. Temperatures rise consistently on the southern hemisphere in summer and winter. Temperatures rise consistently on the northern hemisphere in summer. There’s no ‘hiatus’.

Only winter in the northern hemisphere has seen a decline in temperatures – which isn’t a ‘hiatus’ in my book either.

Bottom line: Maybe it would be better to put away fancy statistics in this case, to not even argue about the length of the period they choose to look at, and just tell them that a global mean is a poor choice of metric in a situation where three indicators go up and one goes down. Because it suggests a ‘hiatus’ where clearly there isn’t one.

24. I think statistics is a total sham and global warming is bullocks. It has to be. It’s 65F/18C in my apartment right now and I’m forced to wear a sweater.

25. Tamino.

As a father of a smart kid trying to find fresh ways to make explanations interesting, thank you for your consistently clear and cogent posts.

Best,

D

26. Reblogged this on Steven S Goddard (aka Tony Heller), Exposed and commented:
Nothing more annoying than when warming alarmists go all math on you. Global warming is bullocks? How do I know. Well it’s 65F in my apartment right now and I’m forced to wear a sweater.

27. Soosoos

Magnificent to have you back.

28. Reblogged this on Hypergeometric and commented:
Very, VERY well done.

29. ;) I note a Bayesian answer was left out.

While I have dealt with “the pause” in two articles (one and two), what I think is that, apart from specialists whose job it is, people are way too focused upon the energy being manifest in one component of the climate system. Just because we live there, at the surface, does not mean you can learn everything you need by just watching that part. These aren’t separate boxes.

I also think, rumbling around under the covers, there are fundamental misunderstandings about notions like confidence interval which we statisticians just have not done a good job of educating about.

Obviously, if someone does a study with malfeasance, setting out to prove something, these comments are way too charitable. For that I think just looking at the publishing track record suffices to produce a good prior.

Anyway, nicely done!

30. Gerg

RSS-TLT has actually broken a little on the low side: http://gergs.net/?attachment_id=2839 (light blue trace, last half-dozen years). There’s likely something wrong with it given that the other seven popular series have not (including UAH-TLT). Cherries, predictably.

See any pause here: Global_monthly_temps_instrumental_closeup.png ?

I wish I did…

31. Fredt34

What do I think ?

That all deniers rush to say “It paused” before 2014 final figures come out and set a new High Record.

Soon they’ll be left to silence on this lie for 4 or 5 years, until they can start claiming again “Cooling since 2014”!

32. Fredt34

Btw, it would be interesting to meta-analyze their claims about the minimal timeframe to draw a line and claim “cooling!” In their blog notes, is 15 years enough to conclude anything, or 12, or 9, or …?

So one then can use the same timeframe to show that it’s warming… Perhaps Dr Inferno will write about this some day !

33. MIchael Hauber

Since you are asking what we think…

I think that ‘statistical significance’ is sometimes overemphasised. There seems to be an unwritten assumption among some that if someone can find a statistically significant departure from trend then that would prove Co2 isn’t really warming the planet. Or that if the difference between observation and prediction is not statistically significant then it does not exist and cannot be talked about. It appears to me that at times the difference between prediction and reality has been near the borderline of statistical significance – hence posts several years back from Lucia that the trend was ‘falsified’ and counter posts here that a correct analysis (auto-correlation etc.) puts observations back within the confidence limit. Having noted that it’s been a while since anyone has tried to put up a ‘IPCC falsified’ headline I assume that even under Lucia’s preferred analysis method observations lie within the confidence limit. I suppose this no warming in 26 years thing might be another such claim, but that is so outside of what I understand as reasonable I’m not spending any time whatsoever looking at what is actually being said there.

But of course the confidence limit is arbitrary, and I hope I understand correctly that a 95% confidence level simply means that the opposite result can only happen 5% of the time by chance. Why not a 98% confidence interval or 90% confidence interval? Let’s pretend Lucia’s old analysis was correct and observations really did lie outside the confidence interval. Pretend that a perfect analysis would put current observations at the 99% confidence interval (i.e. 1% chance of happening by chance). Does this mean Co2 is not a warming chemical anymore? Perhaps we just happened to roll a 1 on that 100 sided die. If we are so far outside the confidence interval that the odds of pure chance causing the result become astronomical then we know something needs to be explained. But if we are only near the confidence limit then it makes sense to look for an explanation. Has warming been slower due to a non-Co2 factor? Or is warming slower due to a Co2 factor that hasn’t been accounted for correctly in the models? The further outside the confidence interval we are the more we expect to be able to find an interesting answer to such a question, and the less expectation that the answer to such a question would be ‘we can’t find anything – it’s probably random noise’.

And if we are inside the confidence interval does that mean we aren’t allowed to look for an explanation? The expectation of finding something other than random noise may be a lot lower, but it may still be interesting to look, and if you do find something – well good because something interesting is learned (I hope).

Also the chart presented at Climate Progress recently was interesting. http://thinkprogress.org/climate/2014/12/03/3598698/2014-hottest-year-on-record/

Looking at the chart with the trends for el nino, neutral and la nina years I note that the last 7 years, and the year to date figure are all a little bit below the trend line appropriate to the ENSO status for each year. Seven of the previous eight years were above the appropriate ENSO trend. If we are going to call something a pause it started 8 years ago, not 16 years ago. It would seem that something a little more than pure noise is going on to cause the run of nearly 8 years in a row below trend. This is a 1 in 256 chance assuming independence. Half the odds to consider 8 in a row either above or below the line. I don’t know how much auto-correlation will further reduce these odds. And of course this is not a normal analysis, and if you do x number of unusual analysis methods you’d expect some of them to give unusual coincidences by pure chance etc. So perhaps the significance of 8 in a row is less than it feels like intuitively.
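The arithmetic behind those odds can be checked in a couple of lines. This sketch assumes, as the comment does, that each year is independent and equally likely to fall above or below its trend line; the function name `run_probability` is made up for illustration.

```python
from fractions import Fraction


def run_probability(n_years, either_side=False):
    """Chance that n independent years all land on the same side of the trend line.

    With either_side=False the side is fixed in advance (e.g. all below);
    with either_side=True a run entirely above or entirely below both count.
    """
    p = Fraction(1, 2) ** n_years
    return 2 * p if either_side else p


print(run_probability(8))                    # 1/256
print(run_probability(8, either_side=True))  # 1/128
```

Autocorrelation in the yearly values would make long runs on one side more likely than these independence-based figures suggest.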

• Of course, a necessary step of bringing those confidence ranges to interpretability is to assign a loss function to being wrong in either way. That is, if, say, the estimate is on the high side, assign that prediction the cost of mitigation minus the benefits that arise by doing it anyway. If the estimate is on the low side, assign that prediction the expected cost of the damage incurred from the phenomena that will be unleashed. At the high end of that range, assign the costs of deciding that, given that life on the planet is now intolerable, needing to remove the carbon dioxide from atmosphere artificially.

In any event, what’s wanted is a posterior density, weighted by loss function, for these projections.

34. Jim

I think testing null hypotheses of no trend for each possible start year is a stupid way to test for a pause, and anyone who does it is either ignorant of statistics, or is trying to deceive.
The logical way to test whether a trend has changed is with a changepoint analysis…e.g. http://www.realclimate.org/index.php/archives/2014/12/recent-global-warming-trends-significant-or-paused-or-what/

[Response: I updated this post (at the beginning) to link to that one. I have applied a similar analysis to all the best-known global temperature data sets. The result … wait for it …]

35. MMM

3 Thoughts:

1) I like the suggestion above of a Bayesian option: if I were to do this, I’d probably take a normal distribution centered on 0.2 degrees/decade as my prior (based on the AR4 model mean warming over the next couple decades… perhaps another option would be a flat distribution from, say, -0.1 degrees to 0.5 degrees), and then see how that changes with added data points.

2) “Heck, there’s not even any valid evidence that there’s been a slowdown”: As I think about this, I think that there are some subtleties in definition. I think your definition above is that it isn’t a slowdown until the uncertainty bar of the recent year is less than the long-term trend. Interestingly, that means this is something that ends up being defined retroactively: say we have a series of slow-warming years, such that in 2021 we see the uncertainty bar lower than the long term trend: when did the slowdown begin? Obviously, earlier than 2021. So, looking, for example, at the RealClimate change-point analysis: there may be no change-point identified for a given dataset, but add one year of data, and a change-point might appear at some year that already existed in the prior dataset.

Also, I might choose to define a slowdown as when the uncertainty bar is lower than the mean-warming-rate of some former interval, which would be a laxer definition than yours (e.g., it would identify slowdowns in some datasets where your definition would not see any).

3) Which leads me to thought three, which is that an interesting companion plot to your “Start Year” plots would be an “End Year” plot: what do trends look like starting with, say, 1979-1989, and then adding a month at a time? Doing an Excel based LINEST calculation, I get a local maximum trend from 1979-1999(January) of 0.0155 degrees/year. But you get another local max from 1979-2004(March) of 0.0168. Well, clearly that shows that a slowdown couldn’t possibly have started until at least 2004 in the RSS dataset, because the trend was still _increasing_ up until that date.*** 1979-2014 has a trend of 0.0122… I think it would be interesting to do an uncertainty analysis and see if the uncertainty around 0.0122 encompasses 0.0168 (and I’m guessing it does). But if, someday, the uncertainty spread drops below 0.0168… then do we date the start of the slowdown to March of 2004? That seems like one reasonable way to do things… of course, we won’t know if the slowdown started in 2004 until sometime in the 2020s, probably. (well, except I expect the rate of warming to pick back up again, so my guess is that there will turn out to have been no slowdown by this definition at all).

Well, those were some thoughts. Thanks for an interesting post!

-MMM

***I think this is an important point, actually. Anyone who claims that a slowdown started in a year before the year to which the largest linear trend in the dataset extends can NEVER be correct.

• Well, I estimated the first derivative of global mean surface temperature using a Bayesian state model, and I got something around 0.01 degrees Celsius per year increase (see here and here), which is consistent with the Cowtan and Way estimate. To me, that perfectly justifies Tamino’s “Heck, there’s not even any valid evidence that there’s been a slowdown”.

36. I really liked the change point analysis at Real Climate. Glad you put up the update. I was just coming over to post that.

37. murison

If someone in my field (astronomy) were to present such a bad (and invalid) argument as McKitrick, they’d be either laughed out of the room or flayed alive.

• Yes, funny how creationists don’t get to publish in astronomy journals.

38. The journal is not merely obscure, but is published by SCIRP, a long-time entry on Jeff Beall’s list of predatory publishers.

Earlier this year, Beall wrote Is Scientific Research Publishing (SCIRP) Publishing Pseudo-Science?
If you put SCIRP in the text box, Jeff has mentioned SCIRP often.

39. Reblogged this on Simple Climate and commented:
With 2014 looking set to be the warmest year ever (possibly by some way) I’ve been wondering what position the people claiming “global warming has stopped” might retreat to. This neat tale hints at one possibility, and explains why it wouldn’t be a convincing argument.

40. Reblogged this on thenoblegasbag and commented:
A nice explanation of why there is no evidence for a global warming pause

41. There’s one problem with your write-up, Tamino. You say that 97% of scientists recognize that the “pause” idea is all nonsense. Unfortunately, that’s not true. The use of this term, and the idea that it is statistically meaningful, runs rampant in the mainstream scientific community. I am looking forward to the not-ending of the not-pause, so we can get back to studying signal instead of noise.

[Response: I agree; sometimes I feel like a “lone voice” maintaining consistently that I’m not persuaded there even has been any pause (but I know I’m not the only one). Still, it’s a testament to climate scientists that they are keeping an open mind and trying to *understand* what’s happening.

Also, some of those natural variation factors (like el Nino/volcanoes/aerosols) can be treated as noise in the sense that they’re natural variation rather than long-term trend, but they’re still of physical origin so I’d say it’s of great benefit to understand them better.]

• jgnfld

Tamino…WELCOME BACK.

As for Eric’s comment, and along the lines of Tamino’s comment, I’ve never noticed that physicists are very strong at stats. Nor should they be. Their orientation is to model every last single factor. This is a worthy exercise, I suppose, but is really at a much finer grain of analysis than, “Is there a trend?”

Incidentally, if one takes out 7 months (Feb-Aug) from the 1998 GISS record–the height of the el Nino event–there is significant warming in the annually aggregated data (R code here: http://www.nfgarland.ca/Rcode.txt) using OLS. A physicist might well want to “explain” that 7 month event. However, for the purposes of identifying the existence of a trend, that 7 month period can very much be subsumed under error variance and the deniers’ attempts to START the trend analysis from there (cherrypicking) should be subsumed under introduced but unexamined bias.

If I flip a coin 100 times, find a series of 6 heads (quite likely), and then examine the 6 following tosses after that, what are my odds of seeing a sequence of 9/12 heads? Is this unexpected at all? Misunderstanding and publishing this misunderstanding is the bread and butter of deniers.

• Statistics and physicists…

Cough, Cough…Ed Jaynes…Cough, Cough

• The coin flip analogy is a good one. One could waste a lot of time “explaining” the results. Still, Tamino is certainly right that *understanding* the physics of the not-pause might result in our learning something (already has). However, the rhetoric is still weird: the thing that is really interesting is the 1998 event itself, not “cooling” after it.

• jgnfld

Yes. The “pause” in tails would consume thousands upon thousands of pages if deniers need that pause to be true!

The bias remains forever when you use a chosen biased point to start your analysis. Even taking the next 18 coin flips would lead to an expectation of 15/24 heads and 9/24 tails rather than 12 and 12. But just TRY explaining that to a denier.
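
The coin-flip arithmetic in this thread can be checked directly. The sketch below assumes a fair coin and independent flips; the function names are made up for illustration. It computes the expected number of heads in a window that deliberately starts with a selected run of heads, and the chance that a run of 6 heads turns up somewhere in 100 flips.

```python
from fractions import Fraction

def expected_heads(run_len, extra_flips):
    """Expected heads in a window that starts with a selected all-heads run.

    The selected run contributes run_len heads for certain; each later
    flip is fair and independent, so it contributes 1/2 on average.
    """
    return Fraction(run_len) + Fraction(extra_flips, 2)

def prob_run_of_heads(n_flips, run_len):
    """Probability of at least one run of run_len heads in n_flips fair flips."""
    # state[r] = probability of no completed run so far, with r trailing heads
    state = [1.0] + [0.0] * (run_len - 1)
    hit = 0.0
    for _ in range(n_flips):
        new = [0.0] * run_len
        for r, p in enumerate(state):
            new[0] += 0.5 * p          # tails resets the streak
            if r + 1 == run_len:
                hit += 0.5 * p         # heads completes the run
            else:
                new[r + 1] += 0.5 * p  # heads extends the streak
        state = new
    return hit

heads_in_12 = expected_heads(6, 6)    # run of 6 plus the next 6 flips
heads_in_24 = expected_heads(6, 18)   # run of 6 plus the next 18 flips
p_run = prob_run_of_heads(100, 6)
```

Here `heads_in_12` comes out as 9 and `heads_in_24` as 15, matching the 9/12 and 15/24 figures above. The excess of heads never averages away because the starting point was chosen for being extreme.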

• Some people seem to expect not just warming on average, but rather monotonically increasing warming.

42. dikranmarsupial

The change-point analysis is indeed interesting, but does anyone know of tools/methods for change-point analysis of auto-correlated time series? I have used some of the basic tools in R for monthly GMSTs, but they produce (obviously) spurious change-points as the least-squares regression approach underestimates the uncertainties.
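
To illustrate why least squares understates the uncertainty on autocorrelated data, here is a minimal sketch. It is not a change-point method and not the commenter’s procedure; it uses the common AR(1) effective-sample-size approximation, which is an assumption, and the synthetic series (trend 0.012 deg/yr, AR(1) coefficient 0.6, innovation SD 0.1) is made up.

```python
import numpy as np

def trend_with_ar1_se(t, y):
    """OLS slope with naive and AR(1)-adjusted standard errors.

    The adjustment inflates the naive SE by sqrt((1 + r1) / (1 - r1)),
    where r1 is the lag-1 autocorrelation of the residuals.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    sxx = np.sum((t - t.mean()) ** 2)
    se_naive = np.sqrt(np.sum(resid ** 2) / (n - 2) / sxx)
    r1 = np.sum(resid[:-1] * resid[1:]) / np.sum(resid ** 2)
    # Only inflate for positive autocorrelation below 1
    inflation = np.sqrt((1 + r1) / (1 - r1)) if 0 < r1 < 1 else 1.0
    return slope, se_naive, se_naive * inflation, r1

# Synthetic monthly data (assumed values, for illustration only)
rng = np.random.default_rng(0)
n = 420
t = np.arange(n) / 12.0
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0.0, 0.1)
y = 0.012 * t + noise

slope, se_naive, se_adj, r1 = trend_with_ar1_se(t, y)
```

With positively autocorrelated residuals the adjusted standard error is larger than the naive one, so tests built on the naive value reject too often. That is one way spurious change-points can arise.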

• Indeed, change-point analysis is tricky. My recommendation? Kim and Nelson’s State-Space Models with Regime Switching, or some of the techniques in Pole, West, and Harrison’s Applied Bayesian Forecasting and Time Series Analysis. The approach there is to both estimate an underlying state from noisy data and then base a notion of regime consistency upon coherent successive statistics of that state. It’s also interesting that the best algorithms, like Rauch-Tung-Striebel (see book by Sarkka), do a forward Kalman-like projection, and then smooth backwards. This means that some regime changes aren’t seen until after the fact, and that some changes, based upon forward projections, are rejected afterwards, based upon more data.

These also depend upon what is built in as features of the model: Does one want cyclic behavior of some kind or to just model step changes? That plot linked earlier is the result of an RTS applied to a limited temperature series, using a step change model and a Gaussian random walk innovator.

• dikranmarsupial

cheers, I’ll see if I can get either of those from the library!

• Jiminy

Following up “State-Space Models with Regime Switching” on Google Scholar I get linked to
[an http address]/A%20Commentary%20On%20The%20Gospel%20Of%20Matthew.pdf, then the page is made unavailable (I suppose by the Uni servers where I am).
I find this a little worrying – so beware!

It does come up on Amazon as http://www.amazon.com/State-Space-Models-Regime-Switching-Gibbs-Sampling/dp/0262112388

And are there other suggestions?

I ask because I’m doing a serious amount of work involving change-point analysis in climate data, the literature is fraught with issues, and I need to do some intense reading and thinking. It is certain that regime shifts in climate occur – the PDO was delineated in part using shift-point analyses, and it’s quite likely that 1997/8, being a PDO phase change year does in fact

I found this thesis from Lancaster Uni
Killick, R., Novel methods for changepoint problems. 2012, Lancaster University. p. 163
Related papers
Killick, R., P. Fearnhead, and I. Eckley, Optimal detection of changepoints with a linear computational cost. Journal of the American Statistical Association, 2012. 107(500): p. 1590-1598.
Killick, R. and I. Eckley, changepoint: a comprehensive changepoint analysis package for R. Journal of Statistical Software, 2014.
Killick, R., et al., Detection of changes in variance of oceanographic time-series using changepoint analysis. Ocean Engineering, 2010. 37(13): p. 1120-1126.

Lastly – in my own naive way let me suggest that
When using statistical methods in science,
1. Reality is mostly more complex than the statistical models used to explore it – therefore exploring reality guided by statistical models is a bit hazardous. It’s very likely that the “complete statistical model for your aspect of reality” isn’t amongst the models you are using.
2. This means you are mostly selecting “the most usefully approximately correct” models – and that’s a whole different ball game from null hypothesis testing.
3. Your focus is on some physical system and you are using multiple lines of evidence, so when one cites a t-test, it is often that the test itself has become one more datum. I know that my blood runs a little cold when I read something like page 561 of “Data Analysis Methods in Physical Oceanography – 3rd Ed” RE Thomson, WJ Emery, (Pub)Elsevier, which shows the results of Rodionov’s STARS test applied to PDO time series. There is a shift shown for 2012/3 – which represents just two observations after the event and P<0.01. Now for other reasons, I believe it may turn out to be a real event, which is why I say my bet is that in fifteen years or so we will look back and say that the rate of warming over those fifteen years was higher than the so-called hiatus.

My bet is that the hiatus is ended, mainly on the basis that it's a lot to do with variations in the ocean draw down of warmer surface water after a pacific gyre spinup circa 1995 and its subsequent working out. (See Roemmich, D., et al., Decadal Spinup of the South Pacific Subtropical Gyre. Journal of Physical Oceanography, 2007. 37(2): p. 162-173.)
Whether I'm right or wrong – that's the fun bit. Much of my reasoning depends on statistical methods which I know are not perfect.

• Sounds to me like you want some good software, for which you understand the theory from separate publications. I can suggest a few.

The problem has been discussed generally here.

One that comes to mind is BCPA which has been applied in biology. It is derived from a doctoral dissertation, which addresses Bayesian extensions. There is a software package. I’ve put it on my pile of things to extend that to a Bayesian framework, along with moving the entire thing to Python, but I have not yet started that.

A second option, although cyclicity would probably need to be accommodated, is Markov switching models. These are treated in the Kim and Nelson text.

I like this approach but AFAIK it does not have ready-made software, so new coding would be needed.

There are also the bcp and EBS packages.

What kind of time frame do you have? If it is sufficiently long, I may be able to help.

BTW, on an entirely different tack, while we’re talking about software, there’s software available for Bayesian uncertainty quantification and probabilistic solution of differential equations

• dikranmarsupial

Actually in science, we are usually just choosing between approximately correct models, so it is no different from statistics in that respect. Also this is no different in hypothesis testing, it is quite routine to use a null hypothesis that we know from the outset is not true, for instance “the coin is unbiased” or “this is a sample from a normal distribution” (which is only actually true asymptotically).

Statistics is a good tool for testing physical hypotheses, such as “there has been a hiatus in the rate of warming”, which is what changepoint detection would be performing in this case. The problems arise more when it is used for hypothesis generation.

• As you may know, I have severe reservations regarding the efficacy of hypothesis testing, even if there are sound Bayesian replacements (and see Kruschke’s t-test replacement, with video). But setting that aside, it seems to me formulating things like hypothesis tests is an incredibly roundabout way of getting to where you want to be. You want to compare models, and these days, that can be done directly. (We’re not doing stats for Frieden calculators any longer.) And, when it’s done, whether in a frequentist way or a Bayesian way, proper consideration of the number of extra parameters is incorporated. (Tamino has written on that some place, I think in connection with models of SLR.) The thing is, it’s actually harder to do this right in the frequentist frame, especially for people who aren’t trained to watch out for these things. This is why I so prefer the Bayesian way, because it takes care of these matters for the most part automatically, at the expense of some additional calculation.

Consider, too, Bill Press’ Opinionated Lessons in Statistics. While Professor Press is not a “deep Bayesian” (e.g., his treatment of contingency tables, while pretty good, ends up doing something like a hypothesis test anyway), he does have great practical and teaching experience, and his lectures are fun.

• Jim

You can use WinBUGS to fit change-point models with auto-correlated errors (applying to GISTEMP data produces results that look v. similar to Fig. 4 in the RealClimate post). The reversible jump add-on for WinBUGS makes it feasible to objectively identify (probabilistically) the number and location of change-points within almost any model structure you like, by incorporating a piecewise linear spline (with unknown no. and location of knots) into the linear predictor.
See this paper for some examples…(and some code).
http://www.esajournals.org/doi/abs/10.1890/09-0998.1

43. MMM

So, I decided I really liked my idea regarding “pauses” and “slowdowns”: as long as adding more data increases the trend from the start date, there is NO WAY that additional data should ever be considered part of a pause or slowdown. For RSS, the 1979-2004 trend is larger than any previous trend of more than 10 years, i.e., any trend between 1979-89 and 1979-2004 will be smaller than 1979-2004.

For GISS, UAH, and HadCRUT4, using the data available at woodfortrees because it was easy to get, the key year is 2007 (i.e., the 1979-2007 trend is larger than any previous trend). So, at most, regardless of what future temperatures do, the earliest any pause or slowdown could have started is 2008.

-MMM

• JCH

I’ve made a similar argument at Climate Etc. several times, though I use 2010.33 on GISS as that is the zenith of warming.

44. > lone voice
Not the only one; I’m glad to see reference to your work over at Azimuth
http://johncarlosbaez.wordpress.com/2014/06/05/warming-slowdown-part-2/
(for more, search: site:http://johncarlosbaez.wordpress.com tamino )

• Huh? Care to list some fake datasets in wide use?

45. Kevin

What do I think? — my last stats class was in 1987. There is too much rust on my brain. It is easy to understand how people who know very little of statistics but have a desire to see this issue “disproved” or “proven” are easily pleased by someone drawing graphs for them.

+1, Kevin. I know Science is about Nullius in verba, but it’s even more about not fooling yourself. Especially when the topic encompasses all of the natural sciences, why do so many non-experts feel confident that they’re right and the lopsided consensus of genuine experts is wrong?

That’s a rhetorical question, of course. FWIW, I mostly blame it on the lack of scientific meta-literacy, amplified by the Dunning-Kruger effect.

46. Bart

Clear to us, yes, but if you really wish to leave less room for bad will reading, perhaps Statistician “C” might omit:

“Actually, those aren’t temperature data. They’re artificial data which I created with a known warming trend of 0.012 deg.C/yr plus random noise (which in this case is the simple kind known as “white noise”). The trend didn’t change, I’m sure because the data were constructed with a constant, unchanging trend.“

It’s a strong enough element on its own, but aren’t there enough fake datasets already?

Incidentally, does anyone know what algorithm RSS uses to generate its data?

47. Bart R

When I look, following Trenberth’s approach of splitting the trends by seasons, I find a tale of two seasons, and two hemispheres.

The most extreme warming is in the Northern Hemisphere summer, with notable orbit of the winter trend around the mid-1990’s strange attractor.

In the southern hemisphere, the seasons have not so sharply diverged and the warming has been far less.

There can be a number of explanations for this very strong divergence: more land area in the north, differences in the Great Conveyor, stronger GHE where the sun’s influence is strongest, more volcanic cooling in the hemisphere with more volcanism; what cannot be escaped is that saying pause is needlessly inaccurate. There’s a strange attractor in a complex system. Temperatures will rise above and circle back toward it seasonally and epicyclically. It’s an object lesson in Chaos Theory, but it is not the end of the influence of fossil waste dumping.

48. Over at Quantpaleo McKitrick gets into it with Dikran and Richard Telford. He gives the game away when he states

There are any number of null hypotheses one can test, but I think the one of most interest is trend=0. At least that’s the one I was interested in.

Perhaps Eli might introduce the Marsupial-McKitrick null hypothesis that the trend has not been greater than, oh, 0.025 K/yr. Since the measured trends have pretty much all been ABOVE zero, the Marsupial-McKitrick null hypothesis cannot be rejected for longer than the raw McKitrick and global warming is out of control.

• dikranmarsupial

It is indeed giving the game away, the null hypothesis should be the one you are least interested in (being true) ;o)

• I’d go further: The null hypothesis is NEVER true. That is, it cannot be accepted, but only rejected. If you wanted to accept the hypothesis, you’d have to analyze it versus another null.

• Actually, that’s not true. It’s true if you’re doing conventional frequentist statistics. But you can accept a null if it’s posed in a Bayesian context. See https://www.youtube.com/watch?v=YyohWpjl6KU. The discussion of accepting the null begins shortly after 2:30 in the video.

49. Ken Fabian

Whichever way it’s examined, the “hiatus” or “pause” seems to disappear the closer you look, like it’s – surprise, surprise – natural variation that overlays a warming trend that’s essentially unchanged.

Tamino, I’m curious if the recent paper “Total volcanic stratospheric aerosol optical depths and implications for global climate change” – http://onlinelibrary.wiley.com/doi/10.1002/2014GL061541/abstract
has implications for the Foster and Rahmstorf – “Global temperature evolution 1979–2010” paper – due to different values for volcanic aerosols?

[Response: I haven’t looked at that paper, or their data on volcanism, but clearly it would move the most recent trend estimate upward if we increase the most recent volcanic influence. It’s probably well worth a look.]

50. John Mashey

A rough, related analysis is this graph, where I did regressions of {10, 15, 20, 25} years ending at a given year and plotted them, so that slopes turn into levels easier to eyeball. Unsurprisingly, the results are consistent with tamino’s conclusion.

51. Another pause, or “pause” — tornado lull over three years.
Is this enough time for this data set to say anything about it?
——–from the Time Magazine web page: