# Testing for Change

We got another comment from Sheldon Walker, in which he insists that there is a statistically significant slowdown in global warming, using annual averages from NASA, starting in 2002 and ending in 2013. He provides numbers. You can read his comment here.

Sheldon is mistaken. But I’ll give him credit for this: rather than calling us “warmistas” and retreating to the safety of WUWT, he came here (the “lion’s den” for climate denial) and made his case using actual numbers. And he did so in the face of, well, not a “blistering” attack, but certainly not an open-armed welcome either. Show some respect.

Furthermore, the mistakes he makes aren’t obvious. In fact, professionals make them, even in the context of global warming, even professionals who are not deniers. So, let’s see whether we can convince Sheldon Walker that he’s mistaken … using math.

There are two mistakes in his analysis. The first is the “broken trend” problem. Here’s the essential model he’s using for global temperature trend (NASA data, annual averages, leaving out everything before 1975 and after 2013):

This model allows for a trend change starting in 2002. But — and here’s the gist of it — it also allows for a value change in 2002, i.e. a “jump discontinuity.” There’s very good physical reasoning to disallow such discontinuity in the trend. But let’s set that aside, and suppose that such a sudden shift were physically “no problem.” That still leaves the question, how does it affect the statistics?

When you test for a trend change, you need to account for the extra degrees of freedom you’ve introduced by allowing one. But this model doesn’t include just one extra degree of freedom (the trend change); it includes two (the jump discontinuity as well). You have to allow for that, and it affects the statistics profoundly.

There’s already a well-established statistical test for change which accounts for that: the Chow test. So I ran the Chow test on the given data, testing for a “structural change” at the given time, and it returns a p-value of 0.18. That fails statistical significance; it’s not even close.
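For readers who want to see the mechanics, here is a minimal sketch of the Chow test in Python. The data are synthetic: a single unbroken trend plus noise standing in for the NASA annual series, with the trend and noise levels chosen as illustrative assumptions, not the actual GISS values.

```python
import numpy as np
from scipy import stats

def rss(x, y):
    """Residual sum of squares from a straight-line OLS fit."""
    slope, intercept, *_ = stats.linregress(x, y)
    return np.sum((y - (intercept + slope * x)) ** 2)

def chow_test(t, y, t_break):
    """Chow test for a structural break at t_break.

    k = 2 parameters (intercept and slope) per regression; the split
    model fits separate lines before and after the break.
    """
    k = 2
    s1, s2 = t < t_break, t >= t_break
    rss_pooled = rss(t, y)
    rss_split = rss(t[s1], y[s1]) + rss(t[s2], y[s2])
    n = t.size
    F = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
    return F, stats.f.sf(F, k, n - 2 * k)

# Illustrative stand-in for the annual series: a steady 0.018 C/yr
# trend plus noise (both values assumed, not fitted to real data).
rng = np.random.default_rng(0)
t = np.arange(1975, 2014, dtype=float)
y = 0.018 * (t - 1975) + rng.normal(0.0, 0.09, t.size)

F, p = chow_test(t, y, 2002)
```

Because the simulated series has no real break, the test should typically return an unremarkable p-value, which is the situation described above for the actual data (p = 0.18).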

If you prohibit the non-physical jump discontinuity but still allow a trend change, it doesn’t come close to statistical significance either. I know I’m repeating myself; I’ve discussed this whole issue before, here. But it’s important to get the truth out there.

The second problem has been called the “multiple testing problem.” It’s due to the fact that the choice of start and end times (2002 and 2013) is only one of many many possibilities. When you allow yourself many many possibilities to choose from, you increase the odds of getting an apparently significant result just by accident. Dramatically. After all, if you buy lots and lots of lottery tickets, you increase your chances of winning something just by accident. Dramatically.
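The lottery-ticket effect is easy to demonstrate with a simulation. Here is a hedged sketch: the data are pure trendless noise (by construction there is nothing to find), and the menu of 10-to-15-year windows is an arbitrary illustrative choice, not anyone's actual procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
years = np.arange(1975, 2014)
noise = rng.normal(0.0, 0.1, years.size)  # pure noise: no trend at all

# Naively test every window of 10-15 years for a "significant" trend,
# treating each per-window p-value as if it were the only test run.
pvals = []
for length in range(10, 16):
    for start in range(years.size - length + 1):
        seg = slice(start, start + length)
        res = stats.linregress(years[seg], noise[seg])
        pvals.append(res.pvalue)

hits = sum(p < 0.05 for p in pvals)
# With 165 overlapping windows to choose from, "significant" trends
# can appear even though the data are trendless by construction.
```

Any window picked *after* scanning the lot inherits this selection effect, which is why the nominal per-window p-value no longer means what it appears to mean.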

I’ve dealt with this issue before, too. Those interested in a more quantitative treatment can find it here.

Perhaps the most telling result is that even if we don’t take the multiple testing problem into account (as we should), proper tests (e.g. Chow test) still don’t support the claim of a trend change. By this time it should be crystal clear: real evidence for a “pause” or “hiatus” or “slowdown” just isn’t there.

Let’s not “pile on” Sheldon Walker for making these mistakes. As I’ve said, others do too, including professionals. In fact, they’re at the heart of the mistaken beliefs expressed in Fyfe et al.

OK, Sheldon, I’ve told you why you’re mistaken in terms I think are clear. Are you prepared to revise your belief?

This blog is made possible by readers like you; join others by donating at Peaseblossom’s Closet.

### 73 responses to “Testing for Change”

1. jgnfld

While I disagree that this is happening with Fyfe et al., at least the possibility exists in physics to move some variance from the error term to some modeled term which describes a “pause”. With my own limited physics knowledge, I don’t see it in this case. But in principle the possibility exists.

Without piling on too much, there simply is no honest, purely statistical way of doing so (i.e. demonstrating a “pause”) using just the trends in the time series data points that will satisfy anyone with a real knowledge of statistics, SW. Sorry, but there it is.

2. barry

I get a different result from SW using this app to derive trend + uncertainty.

http://www.ysbl.york.ac.uk/~cowtan/applets/trend/trend.html

GISS global trend 2002 – 2013 (incl) = 0.032 (+/- 0.177), giving

-0.145 to 0.209

This overlaps the mean trend for 1975 to 2002 (incl) and any other period. The uncertainty is rather large!

The upper limit of the 95% confidence interval for the slowdown is 0.0132

How is this value derived?

3. skeptictmac57

It’s important to give intellectually honest arguments a fair hearing, and I like the way that you deal with them.
There are two (of many) items from the SkepticalScience blog that really anchor my thinking on AGW. The first is The Escalator gif that this post reminds me of:
http://www.skepticalscience.com/graphics.php?g=47
The second is this: http://www.skepticalscience.com/scientific-explanation-climate-change-contrarians.html

“If contrarians want to argue their case, it is not sufficient to be dismissive of climate science, any more than it is appropriate to pitch opinion against theory. There is no material explanation for climate change proposed by deniers, except the magical thinking of ‘natural variation’. The only valid way to improve science is through better science, and better science is not achieved by taking a sledgehammer to the existing canon, any more than it could be improved by burning books.”

You need to provide a scientific reason for why AGW is slowing if that’s what you believe. I would welcome a slowing or leveling of warming if it were true, but I would need to have a valid reason for why that is. So far as I can see, all of the evidence points the other direction.

4. Martin Smith

I have always assumed that the “pause” believers mean there must be some unknown forcing that has caused the change in trend. They never say that, of course, because they have no viable candidate for an as yet unknown forcing, but they want to create doubt without having to back it up. But a “jump discontinuity” could happen if there was such a forcing, yes?

• elspi

No, the physics says no jump discontinuity. Temperature doesn’t work that way.

• Martin Smith

That’s not what I meant. Suppose we knew that the solar system passed into an interstellar dust cloud in 2002, and the dust reduced the amount of light reaching earth. Wouldn’t it make sense to have a discontinuity in the graph?

• DrTskoul

No. It would be a smooth transition with continuous first and second derivatives.

• Paul

It would be a smooth transition but possibly on a timescale too short to show up as such in an annual time series. That would however involve a physical phenomenon of massive scale which would almost certainly manifest itself in ways more obvious than a change in temperature – a nuclear winter perhaps.

• Greg Simpson

To get a bit silly, a large asteroid hitting Earth would probably be better modeled by a jump discontinuity than, say, a Lowess smooth. On a minute by minute basis you could see a bump from the initial fireball followed by, perhaps, a logarithmic decline. On a longer time scale it would look like a jump.

• barry

How about a *sudden* change in global cloud fraction?

• Let’s take this to a silly extreme. Say invisible space gerbils instantaneously (invisible space gerbils are all but omnipotent, so this could totally happen…) put a gigantic solar shade between the earth and the sun, we’d get a step change in the 1st derivative and an undefined 2nd derivative at that point in time. You still wouldn’t get a step change in the temperature. But you would get a kink in the temperature curve.

• Hmm, you could get the step change in temperature if the invisible space gerbils instead injected a massive amount of thermal energy into the earth system at a point in time. I suppose the asteroid strike would approximate that, if we fudge the definition of a point in time.

But the general point holds… nothing we contemplate in the context of what’s happening comes close to creating a step change in temperature.
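The continuity argument running through this sub-thread can be made concrete with a zero-dimensional energy-balance sketch. The heat capacity and feedback values below are rough illustrative assumptions, not tuned to the real climate system.

```python
import numpy as np

# Minimal energy-balance model: C * dT/dt = F(t) - lam * T.
# Even a perfectly instantaneous step in the forcing F produces a
# kink in T(t), not a jump: temperature itself stays continuous.
C = 8.0     # effective heat capacity, W yr m^-2 K^-1 (illustrative)
lam = 1.2   # climate feedback parameter, W m^-2 K^-1 (illustrative)
dt = 0.01   # time step, years
t = np.arange(0.0, 100.0, dt)
F = np.where(t < 50.0, 0.0, -4.0)   # forcing steps down at year 50

T = np.zeros_like(t)                 # start at equilibrium (T = 0)
for i in range(1, t.size):
    T[i] = T[i - 1] + dt * (F[i - 1] - lam * T[i - 1]) / C

max_step = np.max(np.abs(np.diff(T)))
# The forcing jumps by 4 W/m^2 in a single step, yet the largest
# one-step temperature change is only about |F|/C * dt = 0.005 K.
```

The temperature then relaxes smoothly toward its new equilibrium (F/lam, about -3.3 K here) on a timescale of C/lam, a few years; the response to a step forcing is a kink in the curve, exactly as described above.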

• Martin Smith

Isn’t the asteroid-hitting-the-earth just a more extreme change in the system compared to my dust cloud example? Isn’t the statistical analysis that shows no “pause” based on the assumption that the physical climate system is fundamentally understood and did not change in 2002? “natural variation” requires that the basic system doesn’t change, but if earth passes through a dense dust cloud, that’s an extraordinary variation, which isn’t natural because it isn’t in the model.

• barry

“That would however involve a physical phenomenon of massive scale which would almost certainly manifest itself in ways more obvious than a change in temperature – a nuclear winter perhaps.”

I’m seeing global anomalies with annual ‘jumps’ of 0.2 and 0.3C in the graph above. The proposed ‘jump’, or step-up at 2002 is about 0.1C.

I’ll admit my statistical naivete, but I’m not seeing the physics interfere with annual swings larger than the ‘step-up’ in 2002. So we’re talking about a sustained jump which is unphysical or something else?

• Martin Smith

I think the point of claiming there was a pause was to imply there is something major that we don’t know about the physics, without having to propose a possible cause. It just creates doubt. I always thought that the pause should be refuted in two ways, first by doing the statistical analysis as Tamino has done, and second by pointing out that if a pause really did begin in 2002, then some big cause started then that we don’t know about, and you have to propose what that cause is. The pause must have a cause.

5. Tamino: “OK, Sheldon, I’ve told you why you’re mistaken in terms I think are clear. Are you prepared to revise your belief?”

[Crickets…]

No, really, there were crickets chirping at the 40th Parallel N on the night of Nov. 7!

• Philippe Chantreau

Crickets are being heard farther and farther North as time goes by it seems…

• :-)

6. Allan

barry, remember that the data here is average surface temperature but that variable doesn’t by itself come close to fully describing the physical state of the climate system. Heat moves between ocean and atmosphere, and between different levels in each, in complicated ways. Some of this leads to what we call “weather” and to essentially random fluctuations in measured average surface temperature over time, around a mean that IS dictated by the underlying state of the climate system. A step change in that underlying state would have to have different physical drivers from the internal variability (weather) that leads to changes in the surface temperature average.

Also note that that “small” 0.1 degree shift in the underlying mean in the broken trend model amounts to something like 6 years’ worth of warming in that mean (over the 1975 – 2002 period) happening in less than one year – it’s very hard to imagine a physical driver for such a change.
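The arithmetic behind that “six years’ worth of warming” figure, assuming a 1975–2002 rate of roughly 0.017 °C per year (an approximate value for illustration, not an exact fit):

```python
trend = 0.017        # approx. 1975-2002 warming rate, deg C per year (assumed)
jump = 0.1           # size of the proposed step-up at 2002, deg C
years_equiv = jump / trend
# about 5.9 years of steady warming compressed into a single step
```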

For an analogy, think of a dog running randomly back and forth on an invisible lead tethered to an invisible stake in the ground. If you took time lapse photos of the dog you’d see it located over different times at different points as far apart as twice the length of the lead, but for the AVERAGE position of those points to shift, even by a much smaller distance than the length of the lead, the location of the invisible stake in the ground would have to move somehow, which is a distinctly different physical phenomenon from the random running of the dog.

7. Sheldon Walker

Hi Tamino,

Thanks for your explanation of why you think that I am wrong. Here is my response to your explanation.

I am not an expert on broken trends. However, I would try to apply some common sense rules.

Rule 1. If the theory supports a broken trend then a broken trend is ok.

[Response: Were you not paying attention? I *allowed* a broken trend, ignored the arguments that it’s unphysical, and it *still* fails statistical tests.]

Rule 2. If the theory does NOT support a broken trend, but the data strongly supports a broken trend, then you need to do some hard thinking.

[Response: The data does *not* support a broken trend. That’s kind of the point. Were you not paying attention?]

Rule 3. If the theory does NOT support a broken trend, and you have a broken trend with a big jump, then the broken trend should probably be avoided.

Rule 4. If the theory does NOT support a broken trend, and you have a broken trend with a small jump, then the broken trend is probably ok.

[Response: What part of “fails” do you not understand?]

Each case needs to be judged on its merits. As an example, I think that the NASA GISS broken trend example that you put in your explanation, is an acceptable broken trend. The reason is that there are plenty of year to year jumps in the data which are larger than the broken trend jump. The broken trend jump looks to be about 0.1, and there are year to year data jumps of 0.29, 0.26, 0.26, 0.24, 0.23, 0.22, etc.

[Response: There’s a tried-and-true statistical test for broken trends. It’s called the “Chow test.” It fails. Were you not paying attention?]

I agree with you that you need to allow for the extra degrees of freedom in the test. I don’t expect to get the broken trend for free.

[Response: And when you do, it fails.]

The multiple testing problem is interesting. I agree with you about increasing the odds of getting an apparently significant result by looking at many possibilities. That needs to be taken into account.

[Response: I didn’t invent it or “make it up.” It’s another long-standing result from statistics. Your “agreement” is hardly necessary.

And to top it all off, in spite of its correctness I didn’t even have to *use* it to show how your idea fails.]

When I analyse the temperature data, I normally look at around 97,000 to 300,000 trends. Am I cheating? I don’t think so, because I am not using hypothesis testing. I am not evaluating the trends in terms of being statistically significant.

The one slowdown trend that I sent you was pulled out of probably 97,000 trends. Does that mean that the trend is not valid? If I had plucked that one trend out of thin air, and sent it to you, would it be valid then? Does your decision on the validity of the trend depend upon my honesty about how I got the trend?

Suppose I sent you that trend and told you that I had selected it out of 97,000 trends, and then somebody else sent you the exact same trend but said that they had only looked at one trend to find it. Would the trend be valid or not? Presumably it would be valid and invalid at the same time! (a bit like Schrödinger’s cat)

[Response: Odd that you posted previously about the *statistical significance* of your claim, but now you are “not evaluating the trends in terms of being statistically significant.” It’s clear that if statistics supports your belief you’ll say so, but if it contradicts your belief you stubbornly refuse to accept it. Very revealing.]

You said, “By this time it should be crystal clear: real evidence for a “pause” or “hiatus” or “slowdown” just isn’t there.”

I have to disagree with you strongly on this point. My method of analysing temperature data is NOT directly based on hypothesis testing or statistical significance. My method gives a logical and consistent picture of the warming rate history, and it clearly shows a slowdown. You can even have a complete Pause if you are happy with a short one. My method shows it all.

[Response: All you’ve got to fall back on is “looks like.” History is littered with the carcasses of theories that “look like” but turn out to be wrong. That’s *why* we use statistics.]

I will finish with something for you to think about. There is a pattern that I can see, which indicates a slowdown. It is centred on about 2007. There is a similar pattern centred on 1972. There is another similar pattern centred on 1982. There is another similar pattern centred on 1991. There is another similar pattern centred on 1999, but this one is a lot smaller.

[Response: You can “see”. Because it “looks like.”]

What do you think it could be? I would guess that it is a natural ocean cycle, like the PDO or AMO. It would explain why there was a slowdown, and why I can see the pattern repeated about every 10 years. What do you think?

[Response: I think the fact that you *still* refuse to accept the truth, in spite of admitting you don’t have the statistical chops, and talking of not doing statistical testing as though it were a point of pride, is the reason we call you a denier.

Your response is pathetic. You asked me previously whether there was any evidence I would ever accept for a slowdown. I answered you plainly. Now it looks like you’re the one who won’t accept evidence, no matter what, that your slowdown idea is wrong. Thank you for making it so obvious.

You’re not fooling anybody except yourself.]

• jgnfld

Re.: “When I analyse the temperature data, I normally look at around 97,000 to 300,000 trends. Am I cheating? I don’t think so, because I am not using hypothesis testing. I am not evaluating the trends in terms of being statistically significant.

The one slowdown trend that I sent you was pulled out of probably 97,000 trends. Does that mean that the trend is not valid? If I had plucked that one trend out of thin air, and sent it to you, would it be valid then? Does your decision on the validity of the trend depend upon my honesty about how I got the trend?”

If you examine 97,000 trends looking for just about any pattern, you are..well…er…more likely to find what you are looking for. Therefore no, your inference is not valid.

Let’s look at 970,000 sets of 20 coin tosses in R (as doing this on a spreadsheet would probably take a while to compute, and the results would be hard to see):

> # set seed so this particular example can be replicated
> set.seed(100)
> # create a 20 x 970000 matrix of Heads ("H") and Tails ("T")
> flips <- matrix(sample(c("H", "T"), 970000*20, TRUE), 20)
> # search for any columns which are all heads
> result <- apply(flips, 2, function(z) length(unique(z)) == 1 && z[1] == "H")
> which(result)
[1] 653750

Wow!!! Look at column #653750–about a 1 in a million shot!!!!

According to your stated “logic” that no hypothesis testing is necessary to “see something special”, coin toss set #653750 is somehow very, very special, as every value is the same!!! What exactly makes coin toss set #653750 so special? NOTHING. Nothing at all. Nil. Nada. Zilch. It is one permutation of 20 coin tosses that overwhelmingly likely shows up merely because I happened to look at 970,000 different permutations and darned if one didn’t show up. There could be a genuine cause: a problem with the random number generator in R, say, or a malicious programmer could have programmed the sample() and apply() functions to produce this result in this particular case. But these are really not very likely.

Are you “cheating” to impute any causality whatever to coin toss set #653750 just because you supposedly aren’t doing any hypothesis testing? Yes. Blatantly. Openly. Naively/ignorantly/maliciously (depending on the reason behind your mistake).

• In fact, this very problem is a constant issue in particle physics, as I understand it:

Five sigma…

“Does your decision on the validity of the trend depend upon my honesty about how I got the trend?”

Perhaps. But that isn’t a matter of the actual validity of the trend; it’s a matter of the vulnerability of the ‘you’ addressed to intentional deception.

Funny how that came up in this context.

• jgnfld

WordPress formatting made code weird. Here is R code only w/ no comments…

set.seed(100)

flips <- matrix(sample(c("H", "T"), 970000*20, TRUE),20)

which(apply(flips,2,function(z) (length(unique(z))== 1 && z[1]=="H")))

[1] 653750

• Sheldon Walker

@jgnfld

You said, “If you examine 97,000 trends looking for just about any pattern, you are..well…er…more likely to find what you are looking for. Therefore no, your inference is not valid.”

I can only find a pattern if it really exists.

[Response: I’ll stop you right there. You *already* “found” a pattern that isn’t there. You fed your confirmation bias by “confirming” it with a naive analysis which ignores crucial issues — that I’ve explained to you — pertinent to statistics, which you *admit* you don’t understand. Yet you still refuse to admit you got it wrong.

That’s why you are a denier.

You do technical work, you’re probably very good at software engineering, but that doesn’t make you a statistician. You don’t have the skills needed to determine trend changes in time series, but you refuse to accept the results of those who do.

That’s why you are a denier.]

I am not making up patterns, like your coin tosses. I am effectively studying history (temperature history). I am dealing with facts, not fiction. So my inferences MAY be valid.

If you think that your coin tossing experiment proves anything, then why don’t you solve global warming by tossing a few more heads.

You said, “Are you “cheating” to impute any causality whatever to coin toss set #653750 just because you supposedly aren’t doing any hypothesis testing? Yes. Blatantly. Openly. Naively/ignorantly/maliciously (depending on the reason behind your mistake).”

You have deliberately designed a coin tossing experiment which randomly chooses a head or a tail. There is no causality by design. Is this how you think that temperature works?

If I find a slowdown trend, then guess what, that trend is actually in the temperature data. You can check my calculation to prove that I am not lying. We can argue over the interpretation of that trend, but you shouldn’t be a denier about the trend.

[Response: Nobody here is being a denier about the trend. We’re embracing the *evidence*. We’ve shown, rigorously, that the argument you presented is invalidated by statistical pitfalls that create the results you got when the data are nothing but random noise. Random noise is not a trend.

You haven’t yet presented even a smidgen of evidence to suggest, let alone demonstrate, that this is anything but true, but you stubbornly cling to your mistake. That’s why you are a denier.]

• jgnfld

Wow! Tamino has replied to most points. Let me just add…

“I can only find a pattern if it really exists.”

NO one in the history of statistics has ever “found” a pattern. They have _inferred_ a pattern to some level of uncertainty based on a clear chain of reasoning. You have inferred one based on a clear chain of nonreasoning. There is an important difference here.

“I am not making up patterns, like your coin tosses. I am effectively studying history (temperature history). I am dealing with facts, not fiction. So my inferences MAY be valid.”

Coin tosses whether virtual or physical are not “made up”. They are an historical record of a virtual or physical process and factual in every sense of the word as you are using it. In fact, unlike the data going into the annual global temp series, tosses are measured with little or no observational error. So at minimum my inferences are every bit as valid as your own, possibly more so.

“[In the given example…] There is no causality by design. Is this how you think that temperature works?”

Over the short term in a noisy system? HELL yes. Of course I do. And you would think that too if you understood the slightest thing about noise and trend. A different kind of thermal noise is in fact used by banks to generate physically-based random numbers to protect your banking security under SSL protocols. You’d better hope thermal noise is actually random.

I found a coin that clearly exhibits a pattern. Coin #653750. It’s, as you say, actually there in the record. I only found it “because it exists”. And, that pattern is, as Tamino points out, precisely meaningless.

Question for you: Which Lotto ticket is more likely to win the jackpot in a pick 7 from 49 lotto system:

21-22-23-24-25-26-27 or
2-13-21-24-37-40-43?
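For what it’s worth, the answer can be checked directly: every specific combination in a pick-7-from-49 lotto is equally likely, however “patterned” one of them looks.

```python
import math

# Every specific 7-number combination in a pick-7-from-49 lotto has
# exactly the same probability, patterned-looking or not.
total = math.comb(49, 7)
p_ticket = 1 / total
# total == 85_900_584, so each ticket wins with probability ~1.16e-8
```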

• Chris O'Neill

I am not making up patterns, like your coin tosses.

Yet another logical fallacy from Sheldon, that coin tosses have patterns.

This is one of the gambler’s fallacies: that there are patterns in random data. A common example is that gamblers think they can see patterns in poker machine outputs.

I wonder if Sheldon likes playing poker machines?

• jgnfld

Chris: Interestingly, there ARE “patterns” (to the eye) in all random data. That, for example, is why people are generally bad at generating random numbers themselves: They tend to self-censor any patterns. This makes it so easy for an expert to discriminate human generated from physically/virtually generated random numbers that it is a well known classroom demonstration that many of us here have probably performed.

• Regarding patterns in random data, here again is one of my favorite examples.
Consider the following sequence of ordered pairs:
1,2
2,7
3,1
4,8
5,2
6,8
7,1
8,8
9,2
10,8

What is the 11th pair in the series?

The answer is 11,4, since the “y” values are simply the digits of the transcendental number e, and the “x” values are their ordinal positions. Our brain is the most efficient system ever developed for spotting patterns–it sees them whether they are there or not.
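The sequence is easy to verify, at least to the precision of a double:

```python
import math

# The decimal digits of e = 2.718281828459045..., paired with their
# ordinal positions, reproduce the "mystery" sequence above.
digits = str(math.e).replace(".", "")   # '2718281828459045'
pairs = [(i + 1, int(d)) for i, d in enumerate(digits[:11])]
# pairs[:10] matches the list in the comment; the 11th pair is (11, 4)
```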

• Sheldon,
Again, part of the problem might be that we are shouting past each other. Define “slowdown”: what, specifically, do you mean by this term? If all you mean is that if you search the climate record you are able to find subintervals where you can draw line segments with different slopes, then that is true – utterly trivial, but true. In fact, all you have done is rediscover the Skeptical Science Escalator:
http://www.skepticalscience.com/graphics.php?g=47
All it means is that you have a noisy dataset in which there are different reservoirs for the system energy, and you just happen to be measuring the noisiest (e.g. the atmosphere). It does not indicate a change in the dominant physics that explains the dominant warming trend. However, how the different energy reservoirs in the climate partition energy–that is interesting, and it is an object of study. It just has relatively little to do with climate change to first order.

8. RW

I am not an expert on X. However, I would try to apply some common sense rules.

I can’t think of any value of X for which this would lead me to believe that what comes next is credible. At least the first two “common sense rules” make some kind of logical sense, though they jump the gun by presuming the very thing you’re trying to prove. The third and fourth “common sense rules” don’t even make sense.

Do you have any scientific or mathematical training, Mr Walker?

• Sheldon Walker

@RW

You said, “Do you have any scientific or mathematical training, Mr Walker?”

Yes, science and maths have always been my favourite subjects. I specialised in science and mathematics from my second year at high school. The scholarship that I was awarded in order to go to university involved exams in chemistry, physics, biology, maths, and English.

I have 2 university degrees, but science and maths were not the major subjects.

My primary career is in information technology. I develop software, test software, and do data analysis.

My favourite software is Excel. I use it nearly every day. I use Excel when I analyse 300,000 temperature trends, and produce a graph.

I enjoy science and maths so much that they are also my hobby. For example, I have never been very happy with the calculation of the temperature of the earth with no atmosphere. You know the one, where they get an average temperature of about -18 degrees Celsius. It seemed to me that there were too many assumptions in the calculation. It assumed a non-rotating planet. The incoming energy was absorbed over the profile of the planet, and radiated evenly from the 3D surface. It should have been hotter at the equator, etc.

So I simulated a rotating planet in Excel, and guess what. I got an average temperature of about -21 degrees Celsius. So the original calculation wasn’t that bad. The assumptions must have balanced out.

My simulation could tell me the temperature at different latitudes. It turned out that at the equator it was above zero degrees Celsius for about 6 hours per day. So it wasn’t an ice cube earth, it was a slush-ball earth. This gives the possibility of liquid water and life, even without greenhouse gases.

Next I decided that I had better simulate the greenhouse effect. I was interested in what proportion of outgoing energy would have to be returned to the surface of the earth, in order to bring the temperature up to an average of 14 degrees Celsius. Would I need to return 10%, 50% or 90% of the outgoing energy.

It turned out that returning 30% of the outgoing energy would raise the average temperature of the earth to about zero degrees Celsius.

Returning 41.2% of the outgoing energy would raise the average temperature of the earth to about 14 degrees Celsius.

Returning 42% of the outgoing energy would raise the average temperature of the earth to about 15 degrees Celsius.

So there you have it. The approximately 1 degree Celsius increase in temperature over the last 100 years or so is caused by an increase of just 0.8% in the amount of outgoing energy returned to the surface of the earth. This is presumably caused by the 120 ppm increase in the CO2 level, although I am sure that other things like methane are contributing.
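For comparison, the standard zero-dimensional energy balance (a textbook-style analogue, not Sheldon’s rotating-planet Excel model) can be written down directly. The solar constant and albedo below are standard assumed values, and this simpler model gives numbers in the same ballpark as his, though not identical:

```python
# Zero-dimensional "returned fraction" energy balance:
#     sigma * T^4 * (1 - f) = S * (1 - albedo) / 4
SIGMA = 5.67e-8    # Stefan-Boltzmann constant, W m^-2 K^-4
S = 1361.0         # solar constant, W m^-2 (standard value)
ALBEDO = 0.3       # planetary albedo (standard assumption)

def surface_temp(f):
    """Equilibrium surface temperature (K) when a fraction f of the
    outgoing longwave energy is returned to the surface."""
    absorbed = S * (1.0 - ALBEDO) / 4.0
    return (absorbed / (SIGMA * (1.0 - f))) ** 0.25

t_bare = surface_temp(0.0) - 273.15    # the familiar ~ -18/-19 C figure
t_mod = surface_temp(0.412) - 273.15   # well above freezing
```

In this version f = 0.412 comes out a few degrees warmer than the 14 °C quoted from the spreadsheet simulation, a reminder of how sensitive such figures are to the assumptions baked in.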

• Martin Smith

Now you claim you have discovered a change in the trend, a pause. How does your model account for that pause?

• Jesus Wept

Sheldon, did it ever occur to you that someone who has spent a decade studying a subject to get a PhD and then, say, thirty or forty years delving ever deeper into the subject, producing new and original insights in the field, arguing with his fellow experts and convincing them, might, just might, have a better understanding of the subject than someone who liked science in high school and thinks Excel is a scientific computing platform?

I have another scientific exercise for you: look up the work of psychologists David Dunning and Justin Kruger.

9. Sheldon Walker

You have not expressed any desire to see my evidence.

I believe that you do not want there to be a slowdown, and so you refuse to look at any evidence which might support a slowdown.

That is your decision. Good luck with that.

[Response: I looked at your evidence, in great detail. I proved it was wrong, and why. You refuse to listen. That’s why you’re not a skeptic, you’re a denier.

Worse yet, you lamely accuse me of doing exactly what it is you’ve made your modus operandi.]

10. Philippe Chantreau

I wonder how that “slowdown” turns out once the years after 2013 are included in the analysis.

11. Lauri

Climate scientists, too, are sometimes misled into seeing changes. A timely example is in http://www.nature.com/articles/ncomms13428/figures/1.
They use a 13-year trend and claim that there is a “structural change” in global atmospheric CO2 growth rates: a slowing in the last 13 years of their data, which ends in 2014.

It looks so pathetic when you place the 13-year period in context. The 13-year trends per year (of ppm values) for years _ending_ in 2006 through 2015 are:
2006  0.0294
2007  0.0294
2008  0.0262
2009 -0.0114
2010  0.0049
2011  0.0299
2012  0.0253
2013  0.0168
2014  0.0046
2015  0.0460

And now Betts et al. have chosen 2014 as their ending year, with a trend close to 0. By choosing some other ending year, one gets completely different results. I can’t understand what the point of this research is. And now the newspapers globally report that CO2 growth has stalled!

• Lauri

In the first row of the last paragraph, it should read ‘Keenan et al.’. Sorry for this misunderstanding.

12. Sheldon Walker

Tamino,

Do you think that calling me a denier will make me accept what you say? It actually has the opposite effect.

[Response: I’m no longer trying to persuade you. I tried that, but you won’t listen to reason. Now I’m just calling you what you are: a denier.]

I know that you will find this hard to believe, but there are methods of analysing data which are not purely statistical.

[Response: I know you find this hard to believe, but statistics is sound, when done right (you didn’t). When it shows you wrong, you should believe it.]

I am not very familiar with the Chow test, but I understand what it is trying to achieve. The p-value is the probability of obtaining a broken trend at least as extreme as the model being tested, assuming that the null hypothesis is true. The null hypothesis in this case is that the 2 trends have the same slope. The p-value of 0.18 means that there is an 18% chance that a broken trend at least as extreme as the model being tested will be found purely by chance. A common statistical significance level is 0.05, and using this level the null hypothesis would be rejected if the p-value were less than or equal to 0.05.

In this case the p-value is greater than 0.05, and the null hypothesis would not be rejected. The conclusion would be that the 2 trends have the same slope.
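[For readers who want to see the mechanics being discussed here, the Chow test is easy to sketch: fit one line to the whole series, fit separate lines to the two segments, and compare residual sums of squares with an F-test (k = 2 parameters, intercept and slope, per segment). This is an illustrative Python version run on synthetic data, not the actual NASA series, and not necessarily the exact implementation used for the 0.18 result.]

```python
import numpy as np
from scipy import stats

def chow_test(x, y, break_idx, k=2):
    """F-test of one pooled OLS line vs. two separate lines."""
    def rss(xs, ys):
        # residual sum of squares of an ordinary least-squares line
        coeffs = np.polyfit(xs, ys, 1)
        return np.sum((ys - np.polyval(coeffs, xs)) ** 2)

    pooled = rss(x, y)
    split = rss(x[:break_idx], y[:break_idx]) + rss(x[break_idx:], y[break_idx:])
    n = len(x)
    f_stat = ((pooled - split) / k) / (split / (n - 2 * k))
    return f_stat, stats.f.sf(f_stat, k, n - 2 * k)

# synthetic "temperature" series: one steady trend plus noise, no real break
rng = np.random.default_rng(0)
years = np.arange(1975, 2014)
temps = 0.018 * (years - 1975) + rng.normal(0, 0.1, years.size)
f_stat, p = chow_test(years, temps, break_idx=27)  # index 27 is the year 2002
print(f"F = {f_stat:.2f}, p = {p:.3f}")
```

[A large p-value, as with the 0.18 reported for the real series, means the split-line model is not significantly better than a single line.]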

But there is something that I find unsatisfying about that result.

[Response: We know. What you find unsatisfying is that it contradicts your denial narrative.]

The temperature rate data is HIGHLY variable.

[Response: The temperature rate *estimate* is highly variable. That’s because the noise level is high. You might consider that to be a “clue.”

I’m no longer saying these things to try to persuade you, you simply won’t listen. I’m saying them so that other readers, not savvy about statistics, will actually *get* a clue.]

Normally when people talk about the global warming rate, they are talking about 1 or 2 degrees Celsius per century. 3 degrees Celsius per century would be rapid warming. But on a month-to-month basis, the warming rate varies from -660 to +576 degrees Celsius per century (calculated from the GISTEMP monthly temperature series).

[Response: No it doesn’t. That’s ridiculous. The “global warming rate” refers to the trend, not the noise. You continue to attach yourself to rates influenced by noise, and treat them as though they were real. You are hopeless.]

In the short term it is almost impossible to identify a trend of 1 to 3 degrees Celsius per century among variability of -660 to +576 degrees Celsius per century. You need to look at a long trend before you can positively identify the trend of 1 to 3 degrees Celsius per century. That is why warming trends can easily be detected: the warming has been going on since about 1975 (over 40 years). However, my slowdown trend is only 11 years long. Guess how hard that is to detect. It is highly unlikely to be statistically significant.
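[The arithmetic behind month-to-month "rates" of that size is easy to reproduce in outline: a difference of even 0.1 degrees C between consecutive months, annualised (x12) and expressed per century (x100), becomes 120 degrees C/century of pure noise. A minimal Python sketch with synthetic monthly anomalies, with assumed noise parameters rather than the actual GISTEMP series:]

```python
import numpy as np

rng = np.random.default_rng(1)
months = np.arange(480)                      # 40 years of monthly data
trend = (1.7 / 100 / 12) * months            # underlying 1.7 degrees C/century
noise = rng.normal(0.0, 0.12, months.size)   # assumed monthly anomaly scatter
anoms = trend + noise

# month-to-month differences, scaled from degrees C/month to degrees C/century
rates = np.diff(anoms) * 12 * 100
print("underlying trend: 1.7 degrees C per century")
print(f"apparent monthly 'rates' span {rates.min():.0f} to {rates.max():.0f}")
```

[The huge spread comes entirely from the noise term; the 1.7 degrees C/century trend is invisible at monthly resolution, which is the point about trend versus noise in the response above.]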

So the moral of the story is, warming trends are easy to detect, slowdowns, pauses, and cooling are hard to detect. Warming trends will often be statistically significant. Slowdowns, pauses, and cooling will usually NOT be statistically significant.

Note: NOT statistically significant does NOT mean doesn’t exist. It means couldn’t be detected statistically. It might not exist, but it might.

[Response: That’s the first thing we agree on.]

Does that mean that the slowdown didn’t happen? No. It means that the standard statistical tests can’t say for sure whether it happened or not. The result will usually come out as statistically insignificant, which global warming believers will interpret as meaning that the slowdown didn’t happen. But it really means that it either didn’t happen, OR THAT IT DID HAPPEN BUT COULDN’T BE DETECTED BECAUSE OF THE HUGE VARIABILITY. The same applies to the Chow test: the broken trend is lost in the huge variability, and the test appears to deliver the result that the 2 trends are the same.

[Response: What it really means, which you refuse to accept, is that there’s no *evidence* for a slowdown. Yet you insist it happened.]

Luckily, there are other ways of looking at the data. My method of analysing the temperature data is not limited to finding a single trend. I analyse ALL possible trends. That means that when I look at a trend, I also look at all possible sub-trends. And all possible sub-trends of the sub-trends, etc, etc.

So I end up knowing how ‘consistent’ a trend is. For example, I can tell if a flat trend is made up of a downward-sloping first half and an upward-sloping second half, or if it is consistently flat across the whole trend. The slowdown trend that I sent Tamino is fairly consistently flat.

That is enough for now. I will give Tamino the chance to reply.

[Response: I have replied and replied, you remain intransigent. We’re done with you.]

• skeptictmac57

“Do you think that calling me a denier will make me accept what you say? It actually has the opposite effect.”

There! Right there! You have betrayed a lack of critical thinking about your own bias. It should matter not one whit what someone calls you as to whether or not a proposition is true.

• Chris O'Neill

I know that you will find this hard to believe, but there are methods of analysing data which are not purely statistical.

Sheldon, I know that you will find this hard to believe but your point is a straw man argument.

Note: NOT statistically significant does NOT mean doesn’t exist.

So now you’ve moved the goalposts from claiming a statistically significant pause to arguing for the importance of a non-statistically significant pause.

Sheldon, the above are just 2 of the logical fallacies you have used. Try reading this (mildly amusing) book to see what others you have used. (Start with Straw Man. You can google “moving the goalposts” to learn about that.) It’s remotely possible you might learn something about yourself:

Regardless of your skills with Excel etc, by far your greatest skill is the use of logical fallacies.

• Jim Eager

No Chris, Sheldon’s greatest skill is self-delusion. The use of logical fallacies is one of the lesser skills he employs.

• Chris O'Neill

It (self-delusion) is not exactly a skill that works on anyone else.

• Sheldon Walker

@Chris O’Neill

Hi Chris, you said, “So now you’ve moved the goalposts from claiming a statistically significant pause to arguing for the importance of a non-statistically significant pause.”

I have to be careful what I say Chris, because otherwise Tamino will give me an ear-bashing.

Words are important, I used the term slowdown when talking about my trend, I never said “Pause”. If you start throwing the “Pause” word around, then you will definitely be called a “Denier”, and I am so frightened of that word.

If you look carefully at what I have said, I have NEVER claimed that the slowdown trend was statistically significant. It is NOT statistically significant, it has a p-value of about 0.495607015

What I claimed was that the slowdown trend was statistically significantly different from a number of warming trends that I listed. That is a totally different thing from what you claim that I said.

All I ever wanted Chris, was for people to admit that there had been a little bit of an insignificant slowdown. But REAL Alarmists are not allowed to admit that. Just like REAL men are not allowed to cry, or like flowers.

The world is a very sad place, but at least we don’t have to worry about being cold.

• Chris O'Neill

I have NEVER claimed that the slowdown trend was statistically significant.

Sheldon Walker 2016/11/7:

There is a statistically significant slowdown in the GISTEMP yearly temperature series. It begins in 2002, ends in 2013, and has a length of 11 years.

Not content with logical fallacies, you now resort to outright lying.

Absolutely appalling behaviour Sheldon.

• Sheldon Walker,
You appear to dislike being called a ‘denier’ but in your case I feel the term ‘deluded’ would be more appropriate.
You tell us you do not understand the statistics that would be used to identify changes in trend so, rather than attempt to uderstand it and apply it, from your position of ignornce, you decide to invent your own statistical test.
We do perhaps at this point have something more positive to say about you. It is true that a lack of statistical significance does not mean a phenomenon is absent; it may be present but hidden within the very variance that limits the accuracy of the statistical analysis. But this does not give licence to then ignore statistics. MLR, for instance, shows that the phenomena that cause the variance (Sol, Vol & ENSO) also cause the dip you brand as “slowdown”. Consider the analogy: you insist there are fairies at the bottom of the garden, but that they look exactly like the bricks in the back garden wall. MLR shows that if they are fairies, they also come from the brick factory.

As for your “slowdown,” it is not entirely illegitimate to plot trends over short intervals and see how moving the interval through time changes those trends. Ignoring the statistical significance of the trends is not something that can be entirely dodged, but let us make that dodge here.
You are saying that with an 11-year interval your “slowdown” has significance because “The slowdown trend that I sent Tamino is fairly consistently flat.” I am not a party to what you sent Tamino, but if I plot 11-year trends I do see a period 2002-13 with a consistently low trend for a dozen months or so.
How significant is this period of low trend?
If the varying level of trend is examined from 1975 to 2000 (and I see an argument based on the physical situation to support using such periods), a mean and sd can be calculated and used to see how significant this alleged “slowdown” actually is within the trend data. It sits 2.18 sd below the mean. But before trumpets blare and flags wave, there is another deviation of trend that cannot be ignored. The period 1992-2002 shows an “anti-slowdown” that sits 2.56 sd above the mean. And this “anti-slowdown” was used to calculate the mean and sd, so the 2.56 is a gross underestimate. With both “slowdown” and “anti-slowdown” present, a better estimate would be to use 1975 to 1990 for the mean/sd calculations. This would put the “slowdown” at 2.39 sd below the mean and the “anti-slowdown” at 4.24 sd above.

So Sheldon Walker, if you continue to insist that there is a significant “slowdown” (statistically detectable or not), why would anyone not consider you deluded, given that your precious “slowdown” is shown by MLR to be nothing special and is preceded by an “anti-slowdown” of far greater significance, one you fail even to notice?
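[The window-by-window comparison described in this comment (fit a trend to each 11-year window, then ask how many standard deviations a given window sits from the mean of all windows) can be written out briefly. This Python sketch uses synthetic data and assumed parameters, not the actual GISTEMP series or the exact calculation above:]

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1975, 2014)
temps = 0.018 * (years - years[0]) + rng.normal(0, 0.1, years.size)

window = 11
# OLS slope of every 11-year window
trends = np.array([
    np.polyfit(years[i:i + window], temps[i:i + window], 1)[0]
    for i in range(years.size - window + 1)
])

# express each window's trend as standard deviations from the mean trend
z_scores = (trends - trends.mean()) / trends.std(ddof=1)
print(f"{trends.size} windows, z-scores from {z_scores.min():.2f} to {z_scores.max():.2f}")
```

[Even with a perfectly steady underlying trend, some windows can land a couple of standard deviations from the mean, which is why an isolated excursion like the 2.18-sd “slowdown” is weak evidence on its own.]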

• Sheldon Walker

@Chris O’Neill

You are correct, and I did not remember correctly what I had written.

Believe it or not, I am human. I find some statistical terminology awkward and cumbersome. “Statistically significant” and “statistically significantly different” are easy to confuse.

But I like to admit my mistakes. I was wrong.

Before you send the lynch mob around to my house, perhaps they should visit jgnfld’s house. He just admitted that he put an extra zero in his ‘R’ code.

Also, YOU accused me of using the word “Pause” when I used “slowdown”. “Pause” is a very emotionally loaded word. Were you trying to get me in trouble?

The lynch mob is going to be very busy tonight!

• jgnfld

Sorry Sheldon: I did not “admit” it, I stated it. And I also showed it changed nothing.

You never answered. Which lotto ticket is more likely to win a 7-49 game:

21-22-23-24-25-26-27-28 or 2-13-21-24-37-40-43? If you understand the correct answer here you will also understand why your claims about “patterns” are totally bogus.

• Sheldon Walker

@jgnfld

You said, “Sorry Sheldon: I did not “admit” it, I stated it. And I also showed it changed nothing.”

My dictionary defines “admit” to mean “Declare to be true or admit the existence or reality or truth of”.

If you didn’t admit it, then are you trying to tell me that you are lying about it?

In your lotto question, the 2 tickets have an equal chance of winning. If all of the balls have an equal chance of being drawn, then it doesn’t matter what symbol is on the ball. Patterns like 21-22-23-24-25-26-27-28 are often “meaningful” to humans, but the natural world often does not share the same view. Many humans do not understand “randomness” very well.
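[Sheldon’s answer on the tickets is correct, and it is one line of arithmetic: in a pick-7-of-49 lottery, every specific 7-number combination has the same probability, 1/C(49, 7). A quick check in Python:]

```python
from math import comb

total = comb(49, 7)   # number of possible 7-of-49 tickets
print(total)          # 85900584
print(1 / total)      # probability of any one specific ticket
```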

This is a story that I like. A company asked a manufacturer to build a random number generator, to generate random numbers between zero and 100. When the company got the machine, they turned it on and the first random number generated was zero. They sent it back to the manufacturer and complained that it wasn’t random.

When they got it back, they tested it by generating 1000 random numbers. They sent it back to the manufacturer with the complaint that it still wasn’t random: it didn’t generate any zeros in the 1000 test numbers.

Sometimes you just can’t win.

• Chris O'Neill

the first random number generated was zero. They sent it back to the manufacturer and complained that it wasn’t random.

This is like Sheldon complaining about patterns in Tamino’s coin toss because it first came up tails.

• Chris O'Neill

visit jgnfld’s house. He just admitted that he put an extra zero in his ‘R’ code

OK. So now you use the Tu Quoque (you too) logical fallacy (also known as Appeal to Hypocrisy).

At least we can all agree that when it comes to matters of statistical significance you are incompetent and what you say on the subject is rubbish.

Sadly, you use a series of logical fallacies in a vain attempt to cover up your incompetence.

• jgnfld

Sheldon: You have now “admitted” that you understand why you are wrong at some level.

• JCH

I still do not get his end game. Among the large number of climate scientists who have used “slowdown/hiatus/pause”, I can’t see any change at all in their conclusions about global warming. If anything, the growing subset who think natural variation masked a great deal of greenhouse warming between ~1985 and ~2013 are speculating that we are on the cusp of an acceleration in warming: what may have been referred to here as an antipause.

• Sheldon said:

“If you didn’t admit it, then are you trying to tell me that you are lying about it?”

Just sayin’, but that’s classic trolling in a nutshell.

• JCH

dictionary:

1. confess to be true or to be the case, typically with reluctance.

Before you send the lynch mob around to my house, perhaps they should visit jgnfld’s house. He just admitted that he put an extra zero in his ‘R’ code. – SW

No reluctance, and no confession. He found an error and immediately published an explanation and a correction.

What is your end game? That warming has not resumed? That it is all El Niño?

• Bernard J.

…I used the term slowdown…

..an insignificant slowdown…

Oxymoron.

So many people speak of an “insignificant trend” and don’t understand that they’re effectively talking about optical illusions rather than anything that constitutes an actual “trend”. A trend in the current context is effectively a phenomenon that emerges from randomness. What you’re asserting is that you can detect non-random randomness. Hello…

This is an important point, Sheldon Walker. Your excursions here (and elsewhere) could very well have been the inspiration for Miguel de Cervantes Saavedra’s windmill-jousting protagonist in El ingenioso hidalgo don Quijote de la Mancha.

OP:

Let’s not “pile on” Sheldon Walker for making these mistakes. As I’ve said, others do too, including professionals.

Well, SW has demonstrated that one need not be a professional statistician to make these mistakes. Are we allowed to pile on yet?

SW:

The result will usually come out as statistically insignificant, which global warming believers will interpret as meaning that the slowdown didn’t happen.

SW seems to have little grasp of basic climate physics, and what he calls “statistics” more closely resembles numerology. He confuses climate with weather, and he can’t distinguish between trend and noise. All he knows is what the temperature record “looks like” to him, and he thinks “global warming” means every year should be warmer than the previous one. He doesn’t understand that CO2 is the biggest control knob for GMST over periods of 30 years or more, or that while other factors also affect GMST over shorter periods, they eventually cancel each other out.

He appears intelligent enough that if he was willing to put the necessary time in, he’d realize that acceptance of AGW is a matter of evidence rather than “belief”. Unfortunately, as with so many AGW-deniers, the very idea seems to make him so uncomfortable he’d be unable to accept the scientific case for it even if he could understand it 8^(.

• Sheldon,
We probably won’t be hearing from you again, but your performance here has been so clueless that it borders on cute. There is something precious about someone who knows nothing about data, statistics (which is just the quantitative analysis of data) or climate science asserting that they’re smarter than all the experts. It almost makes me want to give you a big hug and tell you it will be all right. Unfortunately, after the recent US election, I could only do so if I were willing to lie.

• P.S. Sheldon, let’s play poker some time.

13. JCH

…The result will usually come out as statistically insignificant, which global warming believers will interpret as meaning that the slowdown didn’t happen. – SW

From what I can see, the vast majority of the scientists who have researched and published on the possible causes of the “slowdown” in warming likely have no substantive disagreement with Tamino about the scientific reality of AGW.

There is no belief involved.

Examples: Trenberth; England; Xie; Zhou; Loeb; Karl et al; Fyfe; Hawkins: etc.

I don’t see any point to your effort at all.

14. Hyperactive Hydrologist

Sheldon,

Climate is usually defined as average weather over a 30-year period. Try limiting your trend analysis to periods of 30 years or greater and see whether you can detect any slowdown.

If you measure trends for shorter time periods it is highly likely that all you are measuring are trends in natural variability.

15. Sheldon Walker

For all of the people here who believe that you can’t judge things (like temperature trends) by how they look.

If it looks like a duck, and it walks like a duck, and it quacks like a duck, then it probably IS a duck. (unless it is a statistically insignificant duck)

• Graeme Hird

• Jim Eager

“If it looks like a duck, and it walks like a duck, and it quacks like a duck, then it probably IS a witch” more accurately reflects Sheldon’s thought process.

SW:

If it looks like a duck, and it walks like a duck, and it quacks like a duck, then it probably IS a duck.

That only applies to ducks. We’re talking about a statistical phenomenon.

• Chris O'Neill

unless it is a statistically insignificant duck

So it’s not a duck then.

• jgnfld

What Sheldon just cannot understand is that finding a duck really isn’t “statistically significant” if there are 97,000 of them scattered about the pond. What would _actually_ be statistically significant would be NOT to find a duck.

I truly pity poor Sheldon’s stats teacher. Not that there’s much evidence he ever had one.

16. Chris O'Neill

If it looks like a “strawman argument”, and it walks like a “strawman argument”, and it quacks like a “strawman argument”, then it probably IS a strawman argument.

17. jgnfld

In the interests of truth and fairness to SW, I just noticed that I added an extra zero to my R code and was sampling 970,000 coin flip sets rather than 97,000. The longest coin toss set of all heads in that case is a set of 19 heads for set #38154 rather than the 20 for the larger sample.

2^19 is about one in half a million which is still highly “statistically significant”!

Corrected script:

set.seed(100)  # fix the RNG seed so the result is reproducible

# 97,000 sets of 19 fair coin flips, one set per matrix column
flips <- matrix(sample(c("H", "T"), 97000*19, TRUE), 19)

# report which columns (sets) came up all heads
which(apply(flips, 2, function(z) (length(unique(z)) == 1 && z[1] == "H")))

[1] 38154
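[A quick companion calculation to the script above (an editorial sketch, not part of jgnfld’s R code): any one specific set of 19 flips comes up all heads with probability 2^-19, about 1 in half a million, but across 97,000 independent sets the chance of seeing at least one such run is far from negligible, which is the multiplicity point about the 97,000 ducks.]

```python
p_one_set = 0.5 ** 19                          # one specific set all heads
p_at_least_one = 1 - (1 - p_one_set) ** 97000  # at least one among 97,000 sets
print(f"{p_at_least_one:.3f}")                 # roughly 0.17
```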

18. Would somebody please post a graphic of the temperature record since the 19th century that doesn’t stop before 2016 so Sheldon can look at his “duck”?
https://www.sciencedaily.com/releases/2016/11/161114113539.htm

• Susan Anderson,
I’m not sure that even if ‘Sheldon can look at his “duck”’ he would be able to appreciate what it is he is looking at, or hearing. Quack quack. Here is what Sheldon Walker wrote as his concluding remarks to his Wattsupian thesis back in February.

“A final word about the future. The Pause has been weakened by the 2015 El Nino. That does not mean that it never existed. Anybody gloating over the Pause becoming weaker, should bear in mind that El Nino’s do not last forever. Once the El Nino’s temperature increase has gone, the Pause will probably strengthen. A La Nina may also give the Pause a boost. Do not underestimate the Pause, it may surprise you yet”

I did actually dash off a graphic to illustrate Sheldon Walker’s stupidity, but interruptions have delayed its posting here. It shows, uncontroversially, the GISTEMP data, which is nothing but linear since 1970 (so far); how it does have humps and bumps that allow Walker’s ‘slowdown/pause’ to be cherry-picked; and how the change in trend is small and quite insignificant. The graphic is here (usually 2 clicks to ‘download your attachment’). The trace of 48-month trends in the bottom chart shows the “anti-slowdown” for the years 1992-2004, which relies on the same high temperature data points that Walker’s precious 2002-13 “slowdown” uses. Such as it is, the “anti-slowdown” is the more significant feature.

19. This should do it, with a little luck:

Or: