Best Case Scenario

Chip Knappenberger has posted about some papers, including Santer et al. (2011, Separating Signal and Noise in Atmospheric Temperature Changes: The Importance of Timescale, Journal of Geophysical Research, doi:10.1029/2011JD016263), and the post proves one thing: that Knappenberger doesn’t get it. In more than one way.


He doesn’t understand the Santer et al. paper, he doesn’t understand the implications of Foster & Rahmstorf (2011), he fails to comprehend the value and validity of computer models, and he is clueless about the danger posed by global warming — even in a “best case” scenario.

The reasoning he offers to support his arguments is this:


What makes the Foster and Rahmstorf work particularly encouraging for lukewarmers is that the authors find that for periods of 30 years or so, the removal of natural variability makes little difference on the magnitude of the observed trend in the lower atmosphere.

However, thinking back upon the results from Santer et al., the same is probably not entirely true for all of the climate model runs for the 1979-2010 time period. Almost certainly, the combination of random variability has added some amount of noise to the trend distribution even at time frames of 30 years or so.

What this means, is that if the modeled temperatures were also stripped of their natural variability, then the 95% range of uncertainty (the yellow area depicted in Fig. 2) would contract inwards towards the model mean (green line). The net effect of which would be to make the observed trends (red and blue lines in Fig. 2) over the past 30 years or so lie even closer to (if not completely outside of) the lower bound of the 95% confidence range from the model simulations. Such a result further weakens our confidence in the models and further strengthens our confidence that future warming may well proceed at a modest rate, somewhat similar to that characteristic of the last three decades.

This is some of the most ludicrous nonsense ever written. What Knappenberger is really saying is that since natural variability contributes to uncertainty, we can “imagine it away” — even if we don’t know what it is! This is nothing more or less than shrinking the confidence interval based on wishful thinking.

Sure, if you account for natural variation you can shrink the error range, but the mean itself will also change, and you don’t know where the confidence interval will end up unless you actually know the natural variation. Claiming that F&R2011 demonstrates that the mean will not change, and that we can safely conclude which way the confidence interval will change, is nothing more than wishful thinking. It’s utter folly to extrapolate from “we could account for natural variation if we knew what it was” to “we can therefore shrink the error range even though we don’t know what the natural variation is.” Such a claim calls to mind the classic phrase “not even wrong.”
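
To see why, here’s a toy illustration (a minimal Python sketch with invented numbers, not an analysis of any real data or model output): if a trend estimate still contains an unknown contribution from natural variability, you can’t keep the central value where it is and just shrink the error bars, because adjusting for that contribution moves the central value by an amount you don’t know.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy numbers, invented purely for illustration (deg C per decade).
    raw_trend = 0.20   # a trend estimate that still contains an unknown natural contribution
    nat_sigma = 0.06   # plausible size of that natural contribution

    # The "wish it away" move keeps the central value and shrinks the error bars.
    # But adjusting for natural variability you DON'T know moves the central value:
    possible_natural = rng.normal(0.0, nat_sigma, size=10000)
    possible_adjusted = raw_trend - possible_natural

    lo, hi = np.percentile(possible_adjusted, [2.5, 97.5])
    print(f"the adjusted trend could plausibly lie anywhere in ({lo:.2f}, {hi:.2f})")
    # A genuinely narrower interval, in a known place, only comes from actually
    # estimating the natural contribution (as Foster & Rahmstorf did for the
    # observations), not from wishing it away.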

And by the way — Knappenberger is also flat-out wrong about the yellow area in the graph from Santer et al. being the “95% range of uncertainty.” It’s the range of model results from the 5th to the 95th percentile, which leaves 5% off at both the high and low ends. That makes it a 90% range, not a 95% range — the actual 95% range would extend from the 2.5th to the 97.5th percentile of the model results.
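
The percentile arithmetic is easy to check; a quick sketch (the numbers are made up, any distribution will do):

    import numpy as np

    # A made-up stand-in for a distribution of model trends.
    model_trends = np.random.default_rng(1).normal(0.25, 0.06, size=1000)

    p5, p95 = np.percentile(model_trends, [5, 95])          # what the yellow band shows
    p2_5, p97_5 = np.percentile(model_trends, [2.5, 97.5])  # an actual 95% range

    print(f"5th to 95th percentile (a 90% range):   ({p5:.3f}, {p95:.3f})")
    print(f"2.5th to 97.5th percentile (95% range): ({p2_5:.3f}, {p97_5:.3f})")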

The point of Knappenberger’s nonsense logic is to claim this:


But what’s worse is that a model/observation disparity could indicate that the climate models are not faithfully reproducing reality, which would mean that they are not particularly valuable as predictive tools.

My conclusion (which is different from that of the authors) based upon the research presented by Santer et al.—that the models are on the verge of failing—is further strengthened by the results of another paper published in 2011 by Foster and Rahmstorf.

The failure of Knappenberger’s logic boggles the mind. Suppose we were talking about weather models rather than climate models. There are certainly model/observation disparities, all the time, and we could easily identify something the models don’t do particularly well, focus attention on that to the exclusion of all else, and by Knappenberger’s logic conclude that “they are not particularly valuable as predictive tools.” But every weather forecaster knows that in spite of their imperfections (which are legion), computer models are by far the best predictive tools we’ve got. My guess is that even Anthony Watts wouldn’t deny that.

Climate models, despite their imperfections (which are legion), are also the best predictive tools we’ve got. And even though they don’t give especially good answers to some questions, they do give especially good answers to others. Calling them “on the verge of failing” tells us nothing about reality, but quite a lot about Chip Knappenberger’s preconception.

Here, for instance, is a comparison of surface temperature (not tropospheric temperature) from AR4 model runs simulating the 20th century, to GISS temperature data:

Not only is the GISS temperature well within the envelope of model results, it’s quite close to the multi-model mean — which needn’t be the case because reality is only one “realization” of the climate system. In fact GISS temperature is stunningly close to the multi-model mean, as is shown by the difference between them:

The only visually notable discrepancy is from 1937 to 1945, the period during which a change in the way sea-surface temperatures were measured may have contaminated the observed temperature record. We all look forward to new estimates of sea surface temperature which are designed to account for this data discrepancy. If the revised 1937-1945 data are in even better accord with model results (which I expect), it would be a spectacular endorsement of climate models — and yet another case in which the reason for model/observation disparity was that the models were right, the observed data were faulty. But I suspect even that won’t make Chip Knappenberger budge from his belief that the models are “on the verge of failing.”

The real heart of Knappenberger’s post, and perhaps the most foolish failure of his reasoning, is this:


So what I have documented is a collection of observations and analyses that together is telling a story of relatively modest climate changes to come. Not that temperatures won’t rise at all over the course of this century, but rather than our climate becoming extremely toasty, it looks like we’ll have to settle (thankfully) for it becoming only lukewarm.

We’ve already warmed (at the surface) by about 0.9 deg.C since 1900. Earth is currently warming at 1.7 deg.C/century. Over the next century it’s extremely likely that we’ll warm even faster. But even if we continue to warm at the present rate, that will add another 1.7 deg.C to global average surface temperature, making a total of 2.6 deg.C. I doubt that will be the case; in fact I consider the probability to be extremely low — less than 5% — but it represents the best case we can realistically hope for. The idea that this is “lukewarm,” and that it won’t spell major disaster for humanity, is ludicrous. The global temperature change from full-glacial to full-interglacial conditions is about 5 deg.C. If you really believe that heating up the planet by half the temperature difference of a full glacial cycle would be “relatively modest climate changes,” then you’ve got no damn business influencing climate policy.
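
The arithmetic is simple enough to check (a quick sketch; the 0.9 deg.C, 1.7 deg.C/century, and 5 deg.C figures are the ones quoted above):

    # Figures quoted above (deg C).
    warming_so_far = 0.9            # surface warming since 1900
    current_rate = 1.7              # current warming rate, deg C per century
    glacial_to_interglacial = 5.0   # full-glacial to full-interglacial change

    # One more century at today's rate, on top of what we've already had.
    best_case_total = warming_so_far + current_rate * 1.0
    print(f"best-case total warming: {best_case_total:.1f} deg C")
    print(f"fraction of a glacial-to-interglacial change: "
          f"{best_case_total / glacial_to_interglacial:.0%}")
    # 2.6 out of 5 is roughly half; hardly "lukewarm".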

I think such a change is overwhelmingly likely to bring disastrous changes for human civilization, especially for the availability of FOOD and WATER, and it doesn’t get any more basic than that. And that’s the best we can hope for — it could be far, far worse.

53 responses to “Best Case Scenario”

  1. Tamino, you make some good points about why we should listen to models even though they are imperfect. “All models are wrong, but some are useful”, and there is no method for prediction that’s been more successful than GCMs.

    I just published a post that’s been in the works for a long time, sort of a primer on how climate models work in plain English: http://climatesight.org/2012/01/20/how-do-climate-models-work/

  2. “We all look forward to new estimates of sea surface temperature which are designed to account for this data discrepancy.”
    Thought this was addressed here in July:
    http://www.realclimate.org/index.php/archives/2011/07/revisiting-historical-ocean-surface-temperatures/
    Realclimate post indicates adjustments include reducing sst by up to 0.1C in ’39 and ’41, and increasing sst by up to about 0.2C in ’46 and ’47. Applying these adjustments to your data set (I don’t think GISS uses the new HadSST3) does little to address the notable visual discrepancy.

    [Response: Bullshit. It reduces the discrepancy, just as I expected.

    Even if it didn’t, the current GISS temperature estimate is well within the model envelope and damn close to the multi-model mean — which needn’t be the case since reality is only one “realization” of the climate system. Your claim that models don’t reproduce early 20th-century warming was wrong, still is wrong, and yet you stubbornly refuse to stop.]

    • Or you could read deeper into the Real Climate post in question, and find this analysis:

      “There will also be improvements in model/data comparisons through the 1940s, where the divergence from the multi-model mean is quite large. There will still be a divergence, but the magnitude will be less, and the observations will be more within the envelope of internal variability in the models.”

      A word on B’s last couple of comments–a word with Horatio in mind:

      “rebunkmates.”

      (H/t to Hank R., too, of course, for the original coinage.)

  3. Gavin's Pussycat

    Chip plays once again, for the umpteenth time, the popular fallacy that is also discussed extensively by Edward Jaynes:

    Click to access cc05e.pdf

    who writes

    “Thus, verification of the Leverrier–Adams prediction might elevate the Newtonian theory to certainty, or it might have no effect at all on its plausibility. It depends entirely on this: Against which specific alternatives are we testing Newton’s theory?”

    and

    “In view of this, working scientists will note with dismay that statisticians have developed ad hoc criteria for accepting or rejecting theories (Chi–squared test, etc.) which make no reference to any alternatives. A practical difficulty of this was pointed out by Jeffreys (1939); there is not the slightest use in rejecting any hypothesis H_0 unless we can do it in favor of some definite alternative H_1 which better fits the facts.”

    This is obvious! You test an alternative hypothesis against a null hypothesis, not against “nothing”. You must have a minimally credible null, not a so-called “silly null” — what is it, Chip?

  4. Tamino, your presumption of this being a case of misunderstanding on Chip’s part is kind of you, but also misplaced.

    Chip knows E X A C T L Y what he is saying…and the ramifications of it.

    Let spades be called spades.

  5. Tamino,

    My guess from Foster&Rahmstorf’s Table 3 is that the model mean surface warming would not be impacted very much if you removed the “natural” variability over the period 1979-2010, as the influence of the volcanic events seems to be largely offset by the TSI—the timing of both being prescribed in many of the model runs. And I’ll assume that over a large collection of model runs, ENSO would pretty much offset itself over ~30-yr periods. As for the LT, F&R Table 3 indicates that the model mean may drop a bit if natural variability were removed, as the warming induced by the AOD is about twice as large as the TSI cooling. However, absent AOD and TSI influences, the model mean of the A1B runs is 0.25C/dec during the first 2 decades of the 21st century (from work that I have been involved with)—pretty close to what Santer et al. reported for 1979-2010, so I doubt the reduction of the mean LT trend would be very much.

    I don’t know how much of the distribution of model trends about the mean is caused by natural variability in the models and how much is caused by model differences—although I would imagine that the former dominates over shorter intervals (where the crossover point is, I am not sure–more or less than 30 years?). In any case, it is hard for me to imagine a situation in which, if you were to apply the F&R technique for removing natural variability to each individual model run and then plot the distribution of the collection of model trends for different lengths, the distribution would not be tighter than that produced from the raw runs (like in Santer et al.). Obviously, the spread of the distribution would shrink more over shorter intervals, but I still imagine–as I supposed in my MasterResource post—that the influence would be noticeable on 30-yr trends. But it seems that you think I am wrong. I’d be interested in evidence.

    -Chip Knappenberger

    [Response: Are you as dishonest as you seem, or have you just lost your mind?

    You’re the one who claimed that you can shrink the error bars by removing natural variability even though you don’t know what it is or how much of the model variability is due to natural variation vs model differences — you LITERALLY just decided to “wish away” the uncertainty. You made this claim with absolutely no evidential support whatsoever, and you have the goddamn gall to ask ME for evidence?

    Some might interpret such brazen conduct as great big balls of steel. Others might call it cowardice, since you were caught in an obvious, undeniable, idiotic rookie mistake but you’re just not man enough to admit it.]

  6. Tamino,

    OK, I’ve gone and made an initial investigation of my general idea. What I find is that, based on the collection of A1B model runs for the first 20 years of the 21st century (which I had handy), the size of the “natural” variability influence on the standard deviation of the model trends is about the same as the size of the intermodel differences for a trend length of about 14 years. But a big caveat with the A1B runs is that there are no volcanoes and, for the vast majority of models, no solar variability—so the influence of natural variability is underplayed. I don’t have enough data on hand at the moment to investigate the proportion of the individual contributions to the standard deviation of trends greater than 15 years. However, as it is somewhat standard to assume the contributions to the overall standard deviation of the trends from intermodel differences and natural variability would add in quadrature, it seems safe to say that at least some contribution to widening the distribution is being made by natural variability at 30-year time spans—which was my contention at MasterResource. Whether it is large enough to be of practical significance, I guess, remains outstanding, although I am still of the opinion that it is.

    -Chip Knappenberger

    [Response: You *still* insist that even though you don’t know what the natural variability is, you can wish it away to make the confidence interval exclude what you don’t like.]

  7. Knappenberger also thinks it’s perfectly okay to delete the data he doesn’t like from figures in published papers.
    http://www.skepticalscience.com/patrick-michaels-serial-deleter-of-inconvenient-data.html#72011

    Thus when I read this article of his, it didn’t surprise me, but I had a feeling the authors of the referenced papers weren’t going to be happy about their research being misrepresented as supporting ‘lukewarmerism’.

  8. The $64 question is: “How much more AGW will it take to produce a serious disruption of agriculture or aquatic ecosystems?”

    The models may have us thinking in terms of degrees Kelvin of global warming, when in fact, tiny fractions of a degree are enough to disrupt ecological systems (that we depend on). And, the way the models are structured may cause us to forget to allow for significant follow-on feedback effects.

    In short, you want us to give a passing grade to models that do not answer the important questions. We are basing public policy on models that do not include carbon feedback and ice dynamics. That is like planning a household budget that does not include rent or utilities. That is OK for a fifth grader learning to budget, but it is not acceptable for an adult living in the real world.

    This is the real world, and Mother Nature has teeth.

    [Response:

    Of course the models are flawed — anyone who claims I (or anyone else) said otherwise is just setting up a strawman. As for a “passing grade” — don’t put words in my mouth, or sneak in the implication that they’re a failure. Computer models are the best predictive tool we’ve got, nothing more nor less. Ignoring them is far more foolish than relying on them too much.

    The unknowns you mention make the situation more dangerous, not less so. But those — like Chip Knappenberger — who argue for model failure do so to argue that the situation is less dangerous, not more so.]

    • We’re making public policy on the basis of models? That’s news to me. I had the impression that in the US we make public policy based–I was going to say “on what Rush Limbaugh had for breakfast,” but I think it may actually be “on what the Koch brothers and their management team decided.”

  9. Tamino,

    Further evidence for my supposition can be found in Santer et al. (2011) Figure 6, panels A and B. In panel 6A, the yellow range (as you correctly point out) is the 5-95% range based on the forced runs. Panel 6B depicts the standard deviation of the model distribution of trends from control runs—presumably this represents only natural variability and not intermodel differences in anthropogenic forcing and the response to it. To my eyes, the standard deviation value for 32-yr trends is about 0.03°C in Figure 6B. By the same measure, it looks to me like the 5-95% range (which is 1.65 standard deviations) in Panel 6A is about +/- 0.1°C. So, the standard deviation of the 32-yr forced trends is 0.061°C. Assuming that this is the combination in quadrature of the natural variability (0.03°C) and the intermodel variability, the intermodel standard deviation works out to about 0.053°C. Or, roughly 25% of the width of the 5-95% range for the 32-yr model trend variability in Figure 6A is contributed by natural variability. Thus, if the natural variability were removed from the model runs giving rise to the yellow area in Figure 6A, the observed trends would lie very near or even below the lower bound—again, in direct accordance with my supposition at MasterResource. I guess I didn’t need you to provide me with any evidence after all.

    -Chip Knappenberger

    [Response: You *still* insist that even though you don’t know what the natural variability is, you can wish it away to make the confidence interval exclude what you don’t like.

    You haven’t got a leg to stand on. You remind me of the “black knight” from “Monty Python and the Holy Grail.” Come back here! I’ll bite your confidence interval off!

    Apparently you don’t even realize what an idiot you’re making of yourself. Everybody else does.]
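
    For reference, the quadrature arithmetic being argued over above, spelled out (a sketch only; the 0.03°C and +/- 0.1°C values are the eyeball estimates quoted from Santer et al. Figure 6, not numbers recomputed from the paper):

        import math

        # Eyeball estimates quoted in the comment above (32-yr trends, deg C per decade).
        natural_sd = 0.03           # from Santer et al. Fig. 6B (control runs)
        half_range_5_95 = 0.10      # half-width of the 5-95% band in Fig. 6A
        total_sd = half_range_5_95 / 1.645   # the 5-95% range of a normal is roughly +/- 1.645 sd

        # If the two contributions add in quadrature, the inter-model part is:
        intermodel_sd = math.sqrt(total_sd**2 - natural_sd**2)
        print(f"total sd ~ {total_sd:.3f}, inter-model sd ~ {intermodel_sd:.3f}")
        print(f"natural variability's share of the variance ~ {natural_sd**2 / total_sd**2:.0%}")
        # Note the width of the 5-95% band itself would shrink only by the factor
        # intermodel_sd / total_sd, i.e. by about 13%, and none of this says what the
        # natural contribution to any particular run (or to the observations) actually was.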

  10. “The only visually notable discrepancy is from 1937 to 1945, the period during which a change in the way sea-surface temperatures were measured may have contaminated the observed temperature record.”

    In this graph:
    http://www.woodfortrees.org/plot/hadsst2gl/mean:120/plot/crutem3vgl/mean:120/plot/crutem3vnh/mean:120

    it is evident that the anomalous 1937-1945 warming isn’t just an oceanic warming episode, but also a significant LAND warming episode.

    “We all look forward to new estimates of sea surface temperature which are designed to account for this data discrepancy. If the revised 1937-1945 data are in even better accord with model results (which I expect), it would be a spectacular endorsement of climate models”

    New oceanic data can remove the 1937-1945 warming anomaly from the data only if the true temperatures were cooling, to balance the land warming.

    This would need a major change in the data that is unlikely.

    I suspect that what is wrong is the modeled forcings.

    In particular, the period 1937-1945 is roughly contemporaneous with World War II. If huge industrial areas are destroyed (like in WWII), the cooling sulfate emissions (from burning coal) will be reduced, while the warming black carbon emissions (from burning cities during the war) will increase.

    What do you think?

    [Response: I think the graph you point to does not show that the *anomalous* (i.e., departure from multi-model mean) warmth is still present in land-surface temperature. In fact it shows that during the 1937-1945 period, the sea surface temperature record (HadSST2) showed greater warming than the land record (CRUTEM3). If the sea surface warming is reduced, so will be the global temperature estimate, as will the difference between estimated temperature and model results.

    I do agree there’s uncertainty about forcings during the entire 20th century, and the WWII period may be especially vulnerable to that uncertainty.]

  11. Tamino,

    Why do you keep saying I don’t know what natural variability is? In Santer et al. (2011) they are looking at signal to noise ratios and they use the “control runs from the CMIP-3 multi model archive for our estimates of climate noise.” So the influence of the climate noise is pretty well known/shown in their Figure 6B and Figure 4 (particularly Fig. 4C—the 30-yr trends). The variability in their Figure 6A is a combination of climate noise and inter-model differences. As Santer et al. Figure 6B shows, the climate noise in the models is not negligible, and I made a pretty decent estimate of it in my previous post. So I think I have a pretty good handle on the random part of it (assuming that Santer et al. do). As for that part of the noise which is generated from the prescribed timing of natural events, like volcanoes and TSI, are you suggesting that if I were to include those influences in my estimates of natural variability in the models, it would lower the influence of natural variability on the LT trends over the 1979-2010 time period from the level in the control runs? If that is not what you are suggesting, then I am at a loss as to why the variability of the trends in the control runs is not a decent measure of the influence of natural variability on the trends in the models.

    -Chip Knappenberger

    [Response: We all know what natural variability is. But you don’t know what THE natural variability is for the model runs.

    Why don’t you just admit that you can’t “imagine away” natural variation without knowing what it is? My theory: obstinate folly.]

  12. Tamino,

    Santer et al. 2011 use the control runs to estimate the magnitude of natural variability on the distribution of model trends. I am using their numbers (or eye-ball estimates thereof). What am I missing?

    -Chip

    [Response: A clue.]

    • Dikran Marsupial

      Chip, imagine that the true (but unknown) forced trend is 1, but the effects of unforced variability mean that the observed climate can be regarded as a sample from a normal distribution with (unknown) variance 1, i.e. X ~ N(mu = 1, sigma^2 = 1).

      O.K., so we can’t measure either mu, nor sigma, so we use our understanding of the physics to build a model of the climate, which gives us an estimate mu’ and sigma’ (so that a model run Y ~ N(mu’, sigma’^2)).

      We then have an observation x (neglecting its measurement uncertainty) which is a sample from N(mu, sigma^2).

      We can test if this observation is consistent with the models by testing whether x is plausibly a sample from N(mu’, sigma’), which we can do by simply seeing if x lies within the spread of the model runs (e.g. it lies within the 5th and 95th centiles). We do this and find that it is.

      Now you want to see if you can perform the test after having removed the effects of natural variability. However we don’t actually know mu or sigma, we only know mu’ and sigma’ (which are estimated from our model). Now you can assume that sigma = sigma’, however in that case you are performing a test of whether the model is correct that is based on an assumption that is only valid IFF the model actually is correct.

      The test that the climatologists actually perform doesn’t require this assumption, which is why it is the correct choice of test.

    • Dikran Marsupial

      Put more straightforwardly, we have one observation, x, from say N(mu, sigma^2), and we have a set of model runs that are drawn from a distribution N(mu’, sigma’^2). How do you test for a statistically significant difference between mu and mu’, given that you only know mu’ and sigma’ and the only thing you know about mu and sigma is what you can infer from the single observation x (i.e. nothing much)?
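
      In code, the test described above might look something like this (a rough sketch; model_trends and obs_trend are invented placeholders, not values from Santer et al.):

          import numpy as np

          # Placeholder values, invented for illustration only (deg C per decade).
          model_trends = np.random.default_rng(2).normal(0.26, 0.06, size=50)  # draws from N(mu', sigma'^2)
          obs_trend = 0.19                                                     # the single observation x

          p5, p95 = np.percentile(model_trends, [5, 95])
          verdict = "consistent" if p5 <= obs_trend <= p95 else "inconsistent"
          print(f"model 5th-95th percentile: ({p5:.2f}, {p95:.2f}); observation {obs_trend:.2f} is {verdict}")
          # The spread used is that of the model runs themselves (sigma'), because sigma
          # for the real climate cannot be estimated from a single realization.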

  13. As a rookie, I had come up with exactly the same understanding as Chip prior to reading this post; it took me about ten minutes of contemplation to realise just how stupid that was. Thanks for confirming.

  14. This quote by Krugman has been doing the rounds, and it also explains what is going on here (Ear tip to Eli):

    “Paul Krugman
    Let me instead go meta; this is an example of why policy debate is so frustrating, and why I’m not polite. The key thing about how the conservative movement handles debate is that it never gives up an argument, no matter how often and how thoroughly it has been refuted. Oh, there will be more sophisticated arguments made too; but the zombie lies will be rolled out again and again, with little or no pushback from the “respectable” wing of the movement.

    In comments and elsewhere I fairly often encounter the pearl-clutchers, who want to know why I can’t politely disagree, since we’re all arguing in good faith, right? Wrong.”

    Chip is not discussing science, he is discussing advocacy, and he is not discussing his advocacy in good faith. He and Michaels will concede nothing, and they will never give up an argument “no matter how often and how thoroughly it has been refuted”. It is what they are paid to do.

  15. ‘Natural variability’ is about as useful a term as being in a boat, hitting a reef, and saying it’s okay because it’s a ‘natural’ reef and boats ‘naturally’ sink when hitting reefs. Your boat still sinks.

  16. Glenn Tamblyn

    From Peru

    This tool on the GISS site gives some interesting insights into where things were warming and how much and when. http://data.giss.nasa.gov/gistemp/time_series.html

    It is a plot of temperature for latitude vs year. And you can select to plot their composite land/ocean series, just land or just ocean. The wartime spike shows up clearly in the ocean only series and it is at a range of latitudes. In contrast, the land series shows a general warming period over the 20’s-40’s that isn’t as clearly defined and doesn’t start/stop as sharply. But when you look at the latitudes involved, it is largely the far north. It looks like there was a warm period in the Arctic during that time rather than a general global warming.

    Another important factor is shown by the grey areas on the GISS plot: these are the latitudes where there is inadequate station coverage. We only seem to have had planet-wide coverage since the 1950’s, when the Antarctic began to be monitored, and Arctic coverage only appeared between 1910 and 1920. During the period in between, the coverage was in transition. As a result there could well be a bias in the record. Add in that the 30’s/40’s warming just happened to be in the Arctic, where the new stations were added, and I don’t think we can say how much of the land temperature record for that period is really that reliable.

    • “. . . could well be. . . I don’t think. . .”

      You know, these questions have been examined quite extensively, and with much more rigor than “I don’t think.”

  17. It’s curious that the GISS model deviates on the high side from the multi-model mean around 1940, because around 1940 real observed temps were in fact higher than the multi-model mean:

    (observations in black, multi-model mean in red).

    So probably the GISS model is doing a great job in reproducing the observed temperature.

  18. The only significant contribution of Santer et al. 2011 is that you need long time series (>17 yrs) for the S/N ratio to be large enough to say anything about trends. Everything else is duck soup. Chip is trying to read behind the lines. Earth to Chip: there is nothing behind the lines!! As Santer et al. describe, natural variability cannot explain the 30-yr trends!! Therefore the signal is due to an external forcing rather than internal variability. The only known forcing that is “external” to the natural system, and that can explain the trend, is anthro-GHG. That is what Foster et al. showed!!

  19. I’d just like to point out the irony that Chip claims a lowered confidence in models and then he and Michaels highlight the Gillet et al paper, which is based on a single climate model. So models are bad, except when they give the results they like.

  20. > rebunked

    originated at Muddying the peer-reviewed literature in the comment by Rob Dekker at 53; approbation by others follows.

  21. Guys,

    You would think, considering the experience in the field of climate of the participants of this blog, that you all would have an opinion about the question at hand.

    Foster&Rahmstorf recently identified the signal of natural variability in the observed temperature record from 1979-2010, removed that signal, and arrived at an estimate of the underlying anthropogenic signal.

    I see no reason that the same technique could not be applied to individual climate model runs over the same period.

    My opinion is that if that were to be done, the resulting distribution of model trends over that period would be tighter than the distribution resulting from using the raw model projections (that include natural variability). It is also my opinion that the difference between the multimodel mean trend and the observed trend would be quite similar in the two situations. In my comments above (and in my MasterResource article), I provide some of the reasoning behind my opinion.

    Can anyone else give an opinion as to what would result if F&R removed the natural signal from the model runs and then compared the resulting trends with the adjusted observations? Would the level of consistency be better, worse, or unchanged from that reported by Santer et al.? Or is the question ill-posed?

    -Chip

    • Oh, we have an idea alright. We thought it was evident to you, but since it is not …

      F&R showed that several well-known causes of natural variability correlate well with a lot of the short-term variation but not with the long-term trend, which is “cleaned” if you use the data processing method they propose. These causes are perfectly known and quite well quantifiable using known physics (and computers) – they are not unknowns. So why try to remove something well understood and working in a model to better match a reality *including* this variability? If you want to remove variability from a model, you edit your source code, et voilà!

      Can you get it at last now?

      If you want to suggest something meaningful, it would be to translate the causes of natural variability into temperature variation using physics and then remove this T variation from the *data* to see if these causes of variability are well modeled. But sorry to burst your bubble: it is more than likely that this has been done before …
      And sorry to burst your bubble twice: I’m a seismologist and not a climate expert, but I’m quite astonished that you didn’t seem to get this obvious point …

    • Dikran Marsupial

      Chip, you say that in your opinion “the difference between the multimodel mean trend and the observed trend would be quite similar in the two situations.” In that case, what would be the point of performing the test a different way, one that would introduce a number of additional assumptions, if you didn’t expect the result to be different? What would we learn from it?

    • CK: Foster&Rahmstorf recently identified the signal of natural variability in the observed temperature record from 1979-2010, removed that signal, and arrived at an estimate of the underlying anthropogenic signal.

      I see no reason that the same technique could not be applied to individual climate model runs over the same period.

      BPL: Because you have no reason to assume the natural variability is the same for a given model run as it is for the natural world. Do you even understand what “variability” means in this context, or why one model run might have a different “natural variability” than another by the same model?

  22. Chip:

    Or is the question ill=posed?

    History would suggest that it’s not asked in good faith …

  23. Chip,
    I think you are overselling what F&R2011 did. They showed that three sources of variability could explain the majority of the variability seen over the past 30-35 years. They most certainly did NOT model natural variability. They did not address other sources of variability. They did not address sources of variability that might have been important during other periods in the historical record. They did not address whether “noise” might itself be trending due to greenhouse forcing (which is critical for projection into the distant future or into the past). Perhaps most important, they did not produce a time series that can be used to estimate Charney sensitivity, since the climate most definitely is not in a state of radiative equilibrium.
    So, basically, yes, your question is ill-posed. More to the point, though, your entire approach is not scientific. Scientific models and scientific theories are comparative rather than absolute. They may do some things well and some things poorly, and there may be different models that are appropriate for different aspects/ranges of the system. However, you always have to have a model, or you ain’t doing science. You may contend that a particular model sucks at a particular task. Then the scientific approach is to propose modifications to the model(s) or, if the model is sufficiently flawed, propose a new model that does better at all of the tasks the old model did.

    Such proposals have been notably lacking from the “skeptical” scientists, as “anything but CO2” is not a sufficiently detailed proposal for building a model. So we are stuck with the consensus model we have, and it is a model that has had no small success in helping us understand climate. We will do much better relying on that model and modifying it as this becomes necessary and feasible than we will by flying blind. There is every possibility–even a likelihood–that when the dust settles (literally) and the planet reaches equilibrium, the models will have been too conservative.

    Science is not simply empiricism. It is not simply conjecture and explanation. It is a complex interplay between empiricism guided by theory and theory falsified by empirical fact. It works and we should let it.

    Now when it comes to policy–that should be guided by engineering and risk management, and those are necessarily more conservative than science.

    • Ray, I think the reason why the climate “skeptics” go after the science rather than the policy, saying “the science is too uncertain” rather than “… the science has a great deal of justification, but we shouldn’t do anything anyway…”, may be a backhanded insult of sorts to the people they are deluding.

      The “skeptics” know that if the public actually knew how well-justified the science is, people would generally have a great deal of difficulty admitting to themselves just how much we are likely screwing over future generations, but they want their Hummer anyway. In fact, I would be willing to bet that, short of a complete sociopath, you would be hard-pressed to find anyone who could live with themselves admitting as much.

      Some people have an inkling but would find understanding inconvenient, and to this extent climate “skeptics” likely have willing accomplices in the people that they are deluding, and the climate “skeptics” are likely counting on this. Regardless, if someone is in the dark as to the consequences for as of yet unborn future generations, they are going to have a fair amount of difficulty seeing just how bad the consequences will be for their grandchildren, children or themselves.

  24. Gavin's Pussycat

    Chip,

    I see several problems with the way you draw conclusions from the difference between model mean and satellite data trends. First of all, you assume (correctly I think) that natural variability and inter-model variability are independent of each other, allowing you to use addition/subtraction in quadrature. However, you should not assume that inter-model variability would be representative of model error; this is almost certainly wrong, as Santer et al. also write:

    “The implicit assumption in all of our p-value calculations is that results from individual models are independent. This assumption is almost certainly unjustified [Masson and Knutti, 2011]”.

    What this means is that there are common-mode errors common to many or all models, making the model mean not a good measure for the “truth”, and variation about the model mean, for the “error”. James Annan has written about this, and if I understand him correctly (big if!) you should be looking whether the observation data could be a member of the distribution of model outputs. I would say that it could — as an edge case, even if you would manage to talk the yellow band a little narrower :-)

    Now one could ask what are those common mode errors. Before answering that question, consider that on the “test bench” we don’t just have here the models, but also the forcing time series that drive the models (and indeed, also the satellite data!). Of these forcings, especially the aerosols are iffy. There are some remarks on this in the Santer paper which you have no doubt read. Quoth Santer et al.:

    “Here, it is sufficient to note that many of the 20CEN/A1B simulations neglect negative forcings arising from stratospheric ozone depletion, volcanic dust, and indirect aerosol effects on clouds. Even CMIP-3 simulations which include these factors were performed roughly 7-10 years ago, and thus do not include solar irradiance changes over the last 11-year solar cycle [Wigley, 2010; Kaufmann et al., 2011], decreases in stratospheric water vapor concentrations over 2000 to 2009 [Solomon et al., 2010], and increases in volcanic aerosol loadings over the last decade [Vernier et al., 2011; Solomon et al., 2011]. It is likely that omission of these negative forcings contributes to the positive bias in the model average TLT trends in Figure 6F.”

    and, pertinent to my parenthesised remark above, and noting that describing the satellite time series as “observations” is a bit of a simplification:

    “Given the considerable technical challenges involved in adjusting satellite-based estimates of TLT changes for inhomogeneities [Mears et al., 2006, 2011b], a residual cool bias in the observations cannot be ruled out, and may also contribute to the offset between the model and observed average TLT trends.”

    I don’t know what your theory is on why the model trends are high; surely you have a theory. The way you propose to re-scale the 21st C projections by the ratio between model trends and satellite data trends suggests that you’re perhaps thinking that the models get the sensitivity wrong; in that case such a re-scaling could be appropriate. But what if (as I suspect) the problem is with the input forcing time series, like aerosols? Then the proposed re-scaling would be completely wrong: contrary to greenhouse gases, which accumulate in the atmosphere, aerosols rain out. Don’t you think you should first understand the nature of the discrepancies, before proposing a way of projecting them into an era where the greenhouse forcings — which, contrary to aerosols and friends, are relatively well understood and easy to model physically — will be quite a bit larger than today, and dominant?

    • Gavin’s Pussycat,

      You write:

      “James Annan has written about this, and if I understand him correctly (big if!) you should be looking whether the observation data could be a member of the distribution of model outputs. I would say that it could — as an edge case, even if you would manage to talk the yellow band a little narrower :-)”

      In fact, I am involved with a project with James doing this very thing with the results pretty much as you describe—and recognizing many of the same potential reasons as those aptly described in the Santer et al. paper.

      My point in all of this, is that F&R take a lot of the natural variability out of the LT temperature records and the result is very little change in the 32-yr trend. My (educated) guess is that if they did the same thing to the model runs, that the yellow confidence bounds in Santer et al. would probably shrink a bit—with the net result being that the observations would lie a bit further out in the lower tail of the distribution (and some of the reasons as to why they do so could be eliminated, i.e. the discrepancy results from natural variations). The host of this website seems not to agree with me about this.

      -Chip

      • “…with the net result being that the observations would lie a bit further out in the lower tail of the distribution (and some of the reasons as to why they do so could be eliminated, i.e. the discrepancy results from natural variations).”

        But on your website, you don’t point to this as a “path towards improving or better understanding” model/measurement discrepancies…I appreciate that there could be some scientific merit, but I must say that your intent seems more along the lines of “manufacturing doubt about models” than honest inquiry. It doesn’t look like you are interested in improving our understanding of global climate change, as much as grasping at any piece of information that might support your point of view.

        In reality, there are a lot of very interesting lines of inquiry that can lead to better understandings of our climate. Trying to massage confidence intervals so that one model/data “not-even-almost-discrepancy” is a tiny bit more “almost-please-even-closer-discrepant” is not one of those interesting lines of inquiry…

      • Gavin's Pussycat

        My point in all of this, is that F&R take a lot of the natural variability out of the LT temperature records and the result is very little change in the 32-yr trend.

        Yep, but note
        1) taking out solar and aerosols should actually not be done, as they remain in the model runs, so you create an apples vs. oranges comparison
        2) the removal of ENSO in fact only removes part of the unforced natural variability, the short-periodic (several years) part. It makes the temperature curve look much nicer, but you don’t know what longer period natural variation (decades) remains to infect the trend. And remember you have only one realization.

    • Gavin's Pussycat

      The way you propose to re-scale the 21st C projections by the ratio between model trends and satellite data trends

      Chip, if you’re still reading (and to follow up on my own remark above): there is a simple way to do this in a more correct fashion. From the model meta-ensemble, select those runs that lie close to the data, and see how they continue into the future. You could do this more formally by Bayesian updating of the model distribution using the satellite data (what James has been playing with), and look at the posterior for the rest of the century. Much better than making ad-hoc guesses.

  25. Chip,
    You would need to know the natural component in each of the model runs. Plus you’d have to consider the different inputs that different models have (some exclude solar, missing aerosols etc). Taking some of the actually measured variables of natural variability and using them to isolate the anthropogenic signal would only work if the natural variability in the models corresponded to those observed, which they do not.

    • Robert,

      You are correct. You couldn’t use the observed “natural variability” alone to adjust the model output, instead, you’d have to develop the “natural variability” signals for each model. For AOD and TSI, perhaps you could use them as prescribed, but for each model run a unique ‘ENSO’ signal would have to be derived.

      I am not suggesting that what I proposed was trivial to do. But I was giving my opinion as to what would happen if you did it.

      -Chip

      [Response: Quit the innocent act. You used your opinion to suggest the models were “on the verge of failing,” ignoring (of course) every aspect of the models except troposphere temperature trends. Your “opinion” was based on no facts, no analysis, just speculation — but when I called you on it you had the gall to ask me for *evidence*. The hypocrisy is astounding.

      FACT: you shrank the error range, without changing the estimated mean, for NO REASON other than wishful thinking. You never ran any numbers or did any computations — but you still concluded that the models were a failure and that warming in the next century is going to be “on the low end” of IPCC projections.

      You also utterly fail to comprehend just how disastrous the “low end” can be. You portrayed a total warming of half the difference between full-glacial and interglacial conditions as “lukewarm.” That’s every bit as disconnected from reality as your statistical hand-waving.]

      • Gavin's Pussycat

        Tamino, but you miss the elephant in the room — the whole idea of ‘falsifying’ the models in this way is flawed.
        To do that, you would rather have to show that the error budget does not close — and there is no a priori reason why the error budget couldn’t include systematic or ‘common-mode’ errors.
        Let me illustrate. Construct an artificial test case where
        1) natural variability is zero
        2) inter-model variability is zero
        3) there is a non-zero but very small common-mode error in all model outputs, cause unknown (or for all I care, known).
        Now, the model output and the data will have a very small separation, but the data will lie well outside the yellow zone, which in this imaginary example has a width of zero.
        Does this now mean that all those models are ‘worthless’? Of course not!

      • Gavin's Pussycat

        To elaborate, if you want to judge how “good” a model is, it is correct to look at the difference between a model quantity and the corresponding observed quantity — perhaps divided by the magnitude of the quantity to obtain a measure of relative “goodness” in percent. And, to state the obvious, just noting that this number is large (for certain values of “large”), is not enough to decide that the model is falsified.

        Note that the range of natural variability and/or the range of inter-model variation doesn’t come into the issue at all — they are completely irrelevant. If they were not, improving all the models so as to reduce their inter-model variation would suddenly make those models “poorer”, a silly result.

        And I want to point again to my earlier comment that you cannot test models in a vacuum: every legitimate*) such test is a comparison, you have to test against something. Chip and all those others who want so badly to falsify the models coyly never mention any fall-back model that would replace those falsified. There is a reason for this: doing so would expose how empty their argument is. General circulation models are far from perfect — but for what they are doing they are way better than the competition!

        (Competition? What competition?)

        *) I don’t think hypothesis testing is the proper frame at all for this, as the fallback hypothesis would be something like “mainstream atmospheric physics is invalid”, a silly null if ever there were one.

  26. It’s also worth remembering that the data in question is RSS & UAH TLT. We already know that these data exhibit the lowest trends of the ‘big 5’ data sets.

    Not that I’m ‘wishing them away’ or anything–but they only tell part of the story.

  27. Chip claims,

    “My (educated) guess…”
    Now stop right there: you are not a climate scientist, never were, never will be. You are a wannabe. The fact remains that you are a paid misinformer who works for a serial liar and deleter of inconvenient data called Pat Michaels.

    Further, in your heart of hearts you know that you are clearly incapable of refuting AGW or the best estimate of equilibrium climate sensitivity of near +3 K, so instead you have to twist and fiddle scientists’ findings (even deleting data if necessary or omitting troublesome text) to contrive the answer you so dearly want. You are getting quite the rap sheet for falsifying scientists’ data and research.

    Others here and elsewhere have shown that you are ill-equipped to tackle this problem and have to ride on the coattails of others who do know what they are doing.

    So don’t insult us by coming here and claiming that you are interested in advancing the science, or asking “innocent” questions. You are paid to advance a political agenda and dream up ever more creative (and desperate) ways to try and support your belief system.

  28. Can I point out the logical fallacy of Chip using models that are, as he believes, “….on the verge of failing” to show that climate sensitivity is low?

    So the same models that are, in Chip’s alternate universe, on the verge of failing can be used to demonstrate that climate sensitivity is low.

    Will wonders ever cease? So “skeptics” claim that the models suck, except when they are used to suggest a lower climate sensitivity, and then also with complete disregard of the caveats and limitations of the papers in question. Quite the double standard.

  29. Horatio Algeranon

    …the models are on the verge of failing

    Apparently the models are making progress! — are now passing (with a D-?)

    It seems that only yesterday*, we were being told that the (IPCC) model projections were failing — had been “FALSIFIED” and given a big fat F! (by the mathturbaticians, at least)

    *Yesterday, “Short-Term Trends” was such an easy game to play.
    Now I need a place to hide them away, Oh I believe in yesterday…

    • re: Apparently the models are making progress …

      I follow economics debates on Brad DeLong’s blog. And a similar thing is happening among freshwater economists. “Of course, a certain amount of fiscal stimulus is needed … etc etc. Anyone knows that.” And DeLong then quotes the same economist dismissing fiscal stimulus a scant 3 years ago.

      The dismissal of GCMs by denialists and the dismissal of Keynes by right wing economists are the equivalent of a teenager’s naughty pictures posted to Facebook. They’re out there and they’ll haunt you the rest of your life.

      Unfortunately the reality of both spheres (climate and economics) is so dire that any “conversion” is a bit welcome.

  30. Rob Honeycutt

    I’m not even sure why anyone here is engaging Chip in a conversation. He’s not doing science. He’s doing FF industry funded propaganda masquerading as science. No matter how you slice it, you can’t add 2.7 W/m^2 (over 5X the natural variation of solar output) of radiative forcing to the planet and expect that it’s not going to warm. Honestly, people like Knappenberger are suffering from a fully funded form of mass insanity.