Exogenous Redux

A reader recently requested that I revisit the issue of adjusting temperature data to allow for known factors, the ones that don't affect the trend but do cause fluctuations, in order to better isolate the trend changes, which are mainly due to greenhouse gases. We can call those known influences exogenous factors. I've dealt with this many times, and published research about it, so I won't say much about the methodology.

The factors which will be estimated are the el Niño southern oscillation (estimated by MEI, the Multivariate ENSO Index), solar variation, and reflective aerosols (one of the consequences of volcanic eruptions). Then their estimated influence will be removed to show what global temperature might have looked like without these fluctuating influences.

We’ll start with data from NASA’s Goddard Institute for Space Studies. Here are annual averages (2015 is included although it’s not yet complete) of the basic data, without adjustment:


To estimate the influence of exogenous factors we create a mathematical model using monthly rather than annual averages. If we compare the model (shown as a red line) to the raw data (black line), it looks like this:
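To make the procedure concrete, here is a minimal sketch of this kind of multiple regression, using synthetic stand-ins for the real MEI, solar, and aerosol series (the actual analysis is more elaborate, including lagged versions of each factor):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the real exogenous series (MEI, TSI, volcanic AOD)
n = 540                                    # 45 years of monthly data
t = np.arange(n) / 12.0                    # time in years
mei = rng.normal(size=n)                   # ENSO index stand-in
tsi = np.sin(2 * np.pi * t / 11)           # crude 11-year solar-cycle proxy
aod = np.zeros(n)
aod[250:280] = np.linspace(0.15, 0.0, 30)  # a decaying volcanic aerosol pulse

# Synthetic temperature: trend + exogenous influences + noise
temp = 0.017 * t + 0.08 * mei + 0.05 * tsi - 2.0 * aod + rng.normal(0, 0.1, n)

# Multiple regression: intercept, linear trend, and the three factors
X = np.column_stack([np.ones(n), t, mei, tsi, aod])
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)

# "Adjusted" data: subtract only the estimated exogenous influence,
# keeping the trend and the remaining noise
adjusted = temp - X[:, 2:] @ coef[2:]
print("fitted trend: %.4f deg C/yr" % coef[1])
print("std before: %.3f   after: %.3f" % (temp.std(), adjusted.std()))
```

The fitted trend comes out near the value built into the synthetic data, and the adjusted series has visibly smaller scatter, which is the whole point of the exercise.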


It’s actually impressive how well the model fits; clearly some of the big ups and downs in temperature fluctuations turn out to be due to these exogenous factors. For example, the exceptional heat in 1998, well above the overall trend, is because of the very strong el Niño of that year, while the temperature dip in the early-to-mid 1990s is due to reflective aerosols from the explosion of the Mt. Pinatubo volcano.

When we remove the estimated influence of the exogenous factors, then once again compute annual averages, we end up with this:


If you compare this with the first graph, you'll see that the temperature trend is pretty much the same: steady increase from 1970 up to the present. However, the year-to-year fluctuations are reduced, which makes the steady trend even clearer than it was before. It also makes any talk of a so-called "pause" in global warming look ridiculous. But then, such talk is ridiculous.

There are, of course, still plenty of fluctuations. One reason is that not all fluctuations are caused by el Niño, solar variation, or reflective aerosols. Another is that our model is a pretty good, but not perfect, estimate of their influence. Fluctuation still remains in our “adjusted” data — but it has been notably reduced.

NASA's data are an estimate of surface temperature. Let's also take a look at temperature in the atmosphere, specifically in the troposphere (the lower layer of earth's atmosphere). I'll do so with two data sets. One is based on measurements from thermometers carried aloft by balloons, which have been used to compute a global average called RATPAC (Radiosonde Atmospheric Temperature Products for Assessing Climate). We'll use the RATPAC data for the lower troposphere (from atmospheric pressure 850 mb to 300 mb), so it leaves out the stratosphere.

The data are seasonal averages, the seasons being DJF (winter), MAM (spring), JJA (summer), and SON (fall). I used the seasonal value for each month of the season in order to create a pseudo-monthly data set. And here's the data averaged over each calendar year:
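The seasonal-to-pseudo-monthly step is simple; a sketch with hypothetical anomaly values (note that in the real data DJF straddles calendar years, which this toy version ignores):

```python
import numpy as np

# RATPAC-style seasonal anomalies (hypothetical values, deg C),
# four seasons per year for two years: DJF, MAM, JJA, SON
seasonal = np.array([0.12, 0.20, 0.15, 0.18,
                     0.25, 0.22, 0.30, 0.28])

# Assign each season's value to each of its three months
monthly = np.repeat(seasonal, 3)

# Annual averages over each block of 12 pseudo-months
annual = monthly.reshape(-1, 12).mean(axis=1)
print(annual)
```

Since each seasonal value is simply repeated three times, the annual average of the pseudo-monthly series equals the plain average of the four seasonal values.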


Here’s how the model based on exogenous factors (red line) compares to the data:


Once again the model is pretty good, although not perfect. Once again we can remove the estimated influence of exogenous factors, then re-compute annual averages:


Again, the steady rise is evident.

Balloon-borne thermometers aren't the only data source for tropospheric temperature; there are also estimates from satellites. But they don't actually measure temperature; they measure microwave brightness, from which we try to infer temperature at different levels within the atmosphere. Different channels (different microwave frequencies) respond with different strength to different levels of the atmosphere. But none of them responds to a thin layer at a well-defined altitude; they all respond broadly to pretty much the entire atmosphere, although each is more sensitive to certain levels. This makes it a tricky problem to disentangle the information from different channels in order to estimate temperature in a well-defined atmospheric layer. It's especially problematic because all channels respond, at least in part, to the stratosphere, which is known to be cooling because of increased CO2.

There is also serious difficulty splicing together data from all the different satellites that have been used (more than a dozen). And there's the issue of different responses depending on the angle at which the satellite views the atmosphere, and some dispute about how the satellites' orbits have altered over the years. All in all, it makes piecing together an accurate satellite temperature estimate quite difficult, which is one of the reasons satellite data have been revised and updated so often. The upshot is that all those claims deniers constantly make, about how the satellite data are somehow "better" than surface temperature data or balloon-borne thermometer data, are wrong. There's very good reason to doubt them.

Here’s the data from RSS, the satellite data that deniers seem to like best because it shows the least warming:


Here’s the model (in red) based on el Niño, solar variation, and cooling aerosols:


The model definitely captures the real impact of these exogenous factors, but not quite as well as it does for surface temperature from thermometers. It is especially noteworthy that it doesn't capture how extreme the warming from the 1998 el Niño was. Yet the exogenous-compensated data still show consistent warming:


The warming, however, gives a visual impression of not being as steady as it was in thermometer data, although the idea of a so-called “pause” is still just as ridiculous.

We can see how different the satellite data are from the thermometer data for the troposphere by plotting the difference between the two (annual averages):


This suggests that something happened around 2000 to cause these data sets to diverge. Thermometers didn’t change how they measure temperature, nor balloons how they rise through the atmosphere. But satellite instruments have gone through many changes, satellite orbits have altered, and the satellites themselves change over time. I strongly suspect that there’s a serious problem with the satellite data after about the year 2000, as indicated by their divergence from thermometer data.
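A divergence like this is easy to detect by differencing the two series once they're on a common set of years; a toy sketch with made-up numbers mimicking a post-2000 flattening in one series:

```python
import numpy as np

# Hypothetical annual anomalies on a common set of years (deg C):
# both series share a trend until 2000, then one nearly flattens
years = np.arange(1979, 2016)
ratpac = 0.0174 * (years - 1979)
rss = np.where(years < 2000,
               0.0174 * (years - 1979),               # tracks the radiosondes...
               0.0174 * 21 + 0.005 * (years - 2000))  # ...then nearly flattens

# The difference series turns a subtle divergence into an obvious kink
diff = rss - ratpac
print("1999 difference:", diff[20], "  2005 difference:", diff[26])
```

Before 2000 the difference is flat at zero; after 2000 it trends steadily downward, which is the signature of a change-point in one of the two records.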

We can also see the divergence in the data compensated for el Niño, solar, and aerosols, for which I’ll compare the monthly data:


Another sign is how the overall warming rates compare between sources. For surface temperature data from NASA, the warming rate is about 0.0166 +/- 0.0028 deg.C/yr. It’s slightly higher (but not significantly so) for RATPAC data, at 0.0174 +/- 0.0036 deg.C/yr. But the odd man out is the data from RSS, warming at only 0.0127 +/- 0.0048 deg.C/yr.
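For reference, warming rates like these come from an ordinary least-squares fit; here is a minimal sketch on synthetic annual data (the quoted uncertainties also account for autocorrelation, which this simple white-noise version does not, so real error bars would be wider):

```python
import numpy as np

def trend_with_se(t, y):
    """OLS slope and its white-noise standard error (no autocorrelation
    correction, so real-world error bars would be wider than this)."""
    t = np.asarray(t, float)
    y = np.asarray(y, float)
    X = np.column_stack([np.ones(len(t)), t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (len(t) - 2)              # residual variance
    se = np.sqrt(s2 / np.sum((t - t.mean()) ** 2))
    return coef[1], se

# Demo on synthetic annual anomalies with a known 0.017 deg C/yr trend
rng = np.random.default_rng(1)
years = np.arange(1979, 2016)
y = 0.017 * (years - 1979) + rng.normal(0, 0.1, len(years))
slope, se = trend_with_se(years, y)
print("trend = %.4f +/- %.4f deg C/yr" % (slope, 2 * se))
```

With 37 years of data and modest noise, the fitted slope lands close to the built-in value, and the standard error shows why trends over short windows are so uncertain.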

In my opinion, it’s high time to take a much closer look at the satellite data, how it’s processed, and how it compares to other sources.

36 responses to “Exogenous Redux”

  1. In my opinion, it’s high time to take a much closer look at the satellite data, how it’s processed, and how it compares to other sources.

    Is there a reason why the discussion of satellite-derived atmospheric temperatures seems to be dominated by time series from just two research groups (RSS and UAH)?

    • There have been more people studying microwave satellite tropospheric temperature trends. I guess they never published their data and do not keep on extending the dataset with new observations. It is somewhat outside my expertise, but as far as I know UAH and RSS are thus the only two operational datasets.

There are not many users who need to know tropospheric temperatures. Humans and the ecosystems they depend on live near the ground. The microwave satellite series are still rather short and likely unreliable with respect to the trend, for all the reasons mentioned in the post. Thus they do not provide much additional information to understand the climate system, and only a few scientists work on it once in a while.

There are also temperature datasets from infra-red satellites, which is a much more direct measurement and might thus be more reliable. They have their own problems with clouds, though, share the same problem of short satellite life spans, and these series are even shorter.

In this case it is somewhat unfortunate that science sets its priorities based on what will likely bring the most scientific progress. The satellite temperatures are not high on that list. That it is a hot topic in the “climate debate” does not make it a research priority.

It might make a nice casting show. Scientists present their research proposals, a panel asks obnoxious questions, and the public can televote on which proposal should be funded. For a small part of the science funding (less than 1%) this could be a fun idea; it automatically includes outreach and makes it clear how hard it is to judge science projects in advance.

  2. Is there a reason why the discussion of satellite-derived atmospheric temperatures seems to be dominated by time series from just two research groups (RSS and UAH)?

    I expect because they are far and away the best-known and have the easiest data availability. I know where to go to download data from either RSS or UAH (well, RSS might take a little sleuthing, but I’m pretty sure I’ve done it once or twice in the past.) By contrast, I couldn’t even name another satellite data product, just off the top of my head.

The other continually updated analysis of tropospheric temperature is from NOAA STAR, which uses a novel “simultaneous nadir overpass” method to apply corrections. They do not have a TLT product, however, although they had planned one in the past, and perhaps still do. Their TMT series (which includes influence from the stratosphere) shows warming of 0.103 degrees per decade. This compares to 0.078 for RSS and 0.070 for UAH (v6 beta3).

The NOAA STAR analysis is documented here:

It’s easy to construct your own TLT product from STAR channel data.
A UAH v6 equivalent can be constructed this way: LT = 1.538*MT - 0.548*TP + 0.01*LS
An RSS TTT product equivalent is made with the formula: TTT = 1.1*TMT - 0.1*TLS
(what UAH and RSS do with every single measurement can be done with monthly global averages as well)
Both alternatives above with STAR data give trends of 0.14 C/decade over the full available period.

        I would also like to see a global satellite dataset with the Po-Chedley method

        Click to access jtech.pochedley.2015.pdf

I guess that it would give global TLT trends of at least 0.15-0.16 (cf. Table 4)
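A sketch of the channel combinations quoted above, applied to hypothetical monthly global-mean anomalies (the anomaly values below are made up purely for illustration):

```python
# The published linear channel combinations, applied to hypothetical
# monthly global-mean anomalies (deg C) for each channel
def uah_v6_lt(mt, tp, ls):
    # UAH v6-style lower-troposphere blend of MT, TP, and LS channels
    return 1.538 * mt - 0.548 * tp + 0.01 * ls

def rss_ttt(tmt, tls):
    # RSS TTT-style blend that subtracts most of the stratospheric signal
    return 1.1 * tmt - 0.1 * tls

# Made-up anomaly values: warm troposphere, cooling stratosphere
print(uah_v6_lt(0.20, -0.10, -0.30))
print(rss_ttt(0.20, -0.30))
```

Note how both blends give the stratospheric channel a negative or near-zero weight, so that stratospheric cooling doesn't leak into the tropospheric estimate.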

3. What about UAH v5.6 TLT data? I’ve plotted RSS minus UAH and got a similar picture to RSS minus RATPAC (I didn’t do the analysis, but the graph does look similar). I guess RSS is the real outlier here (and so might be UAH 6.0).

  4. Harry Twinotter

    “This suggests that something happened around 2000 to cause these data sets to diverge.”

Thanks for saying that; it matches a conclusion I came to, albeit just by looking at the charts without analysing them in any rigorous fashion. I wondered if it could be related to the 1997/98 El Nino, the PDO phase, or just a measurement error. A climate website did speculate that the lapse rate may have changed due to changes in the humidity of the troposphere.

    Anyway it will be interesting to see how the satellite measurements respond to the development of the current large El Nino.

5. The changes in trend in the difference between RSS and the radiosondes (RATPAC: Radiosonde Atmospheric Temperature Products for Assessing Climate) could also be due to the radiosondes. Radiosondes are not inherently better than satellites.

Radiosondes are used only once, and measuring temperature over such a huge range of temperatures is very difficult. In the tropics the temperature can change by 100°C (180°F) with height. The air is thin, which makes ventilation hard, while the sun in the tropics is harsh. The balloons drift with the wind, so the relative wind is low. The sondes get wet during rain and in clouds, and when that water evaporates it cools them.

    The main reason to expect that the problem is the satellites is the comparison with surface stations, which are also not perfect, but quite good and have a decent sized community behind them working on their quality.

    UAH and RSS have two people, I think, who work on this data once in a while.

    • Radiosonde data is also geographically biased. For obvious reasons they tend to be launched from land, and there is uneven sampling in key regions for variability such as the Pacific. If you create a masked RSS average using RATPAC station sampling you’ll find a reduction in the discrepancy.

A number of RATPAC stations drop out over the 2000s, which is potentially a problem since there are only 85 to begin with, though I haven’t checked to see if there’s a definitive bias due to that.

      I would suggest much of the RATPAC-RSS difference is due to geographical sampling bias being strongly activated by changes in Pacific variability since the late 1990s.

      Having said that, individual RATPAC station to RSS/UAH6.0-grid comparisons for the most recent two or three years do look consistently off.

Paul S, I don’t think it’s possible to make a RATPAC-A mask. This index is not a simple average of all stations, but a regionally and globally balanced index. I believe, though, that it would be possible to pick all the single stations from RATPAC-B and make a simple average to compare with a RATPAC-masked RSS.

However, applying a RATPAC mask would likely not make the trend break at 2000 go away. If you look at http://images.remss.com/msu/msu_amsu_radiosonde_validation.html
you can compare masked RSS with other radiosonde datasets. By eye, it looks like RSS diverges from all of them, starting at about 1998-2000. I actually digitized the HadAT one, took the differences, and found that the trend break was of similar magnitude to the present RATPAC comparison. (Actually, the trend break was slightly less pronounced when I used raw global RSS vs. downloaded HadAT data.)

      • Olof,

        The overall trends in those radiosonde-satellite comparisons are basically indistinguishable. There does seem to be some temporal structure, such as satellites being relatively warm around late 90s/2000, but overall errors seem to balance out.

        Yes, I’m talking about masking to stations listed in RATPAC-B. I did an analysis a few weeks ago comparing to UAHv6.0 and it took a while so I’m not keen on repeating for RSS, but it looks like the results would be similar.

        Firstly, a plot of global UAH6.0 against global RATPAC-A, for direct comparison with Tamino’s graph above, along with global UAH6.0 minus a simple unweighted average of RATPAC-B station anomalies. Both show a very similar effect to that depicted against RSS: Figure 1.

        Now, to show the effect of sampling, a plot of global UAH6.0 against a simple unweighted average of UAH6.0 grid cells containing RATPAC-B stations: Figure 2.

        It can be seen that a substantial cooling is created simply through sampling bias. However, the variability is not quite right. To examine that, a plot showing the RATPAC-B station average against the UAH masked average, which should reveal time-dependent radiosonde-MSU biases: Figure 3.

There’s no real trend over the full period, supporting the RSS comparisons you linked. However, there does seem to be some structure, in particular the period around 2000 when satellites report quite a bit more warmth. Having made some regional comparisons this seems to be fairly consistent, so it may point towards some kind of discontinuity in the satellite data. Of course, a warm excursion at this time, even though consistent with no systematic bias through the full record, suggests some bias in trends drawn from the late 90s.
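The sampling-bias effect described here can be illustrated with a toy zonal trend field (all numbers hypothetical), comparing a cosine-weighted global mean against an unweighted average over a northern-hemisphere-biased subset of latitude bands:

```python
import numpy as np

# Toy zonal trend field (deg C/decade): uniform warming of 0.12,
# plus extra warming poleward of 40N (hypothetical numbers)
lats = np.linspace(-87.5, 87.5, 36)                   # 5-degree bands
trend = 0.12 + 0.10 * np.clip((lats - 40) / 50, 0, 1)

# A proper global mean weights each band by cos(latitude)
w = np.cos(np.radians(lats))
global_mean = np.average(trend, weights=w)

# An unweighted average over NH bands only, mimicking a land/NH-biased
# station network: the extra Arctic warming is over-represented
sampled_mean = trend[18:34].mean()                    # 2.5N to 77.5N
print("global: %.4f   NH-sampled: %.4f" % (global_mean, sampled_mean))
```

The unevenly sampled average comes out warmer than the true global mean, purely because of where the "stations" sit, with no instrument error at all.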

  6. Thanks, Victor. Helpful comments that fill in some gaps in my understanding.

    Some of your points also provide an explanation for why the ‘skeptics’ focus so much attention on RSS & UAH MSU and radiosonde temperature data.

7. Victor, the trend break is similar if reanalysis data for the 850-300 hPa layer (e.g. from NCEP/NCAR or ECMWF) is subtracted from the satellite data. Thus, the satellite TLTs are the outliers, not RATPAC or any other radiosonde dataset.

Anyway, the trend break in the TLT trends at the turn of the century is puzzling. I have suspected the MSU-AMSU transition for that. The first AMSU, NOAA-15, was launched in May 1998 and is still in service. Thus, the AMSU channel 5 (TMT) on this very satellite is the backbone of all TLT measurements since 1998. Has there been some as-yet-unknown drift or error in this specific instrument?
I have found an article by the NOAA in-house expert Tsan Mo on the performance of the AMSUs onboard NOAA-15.
I am certainly not an expert on this stuff, but it strikes me that there is a big difference between the temperature trends of channel 5 and the nearby channel 4 (in Table 1). The measured channel 5 frequency is also out of specification (Table A1).
However, this is a task for the experts in the field. My gut feeling is that the satellite datasets are still very much “beta”, and they will be so until they can be validated against radiosonde or reanalysis troposphere data (or surface data).

    • Interesting.

      I was only arguing that the comparison of satellite and radiosonde in isolation was not real evidence that it has to be the satellite that is the wrong one. Together with surface temperatures and reanalysis it becomes a better case.

      Something to study, if only science were interested in satellite temperatures. It only becomes convincing when we understand the reasons. That is hard for satellites; we do not have access to the instruments to see how they fail.

    • Yes.

Also there are significant issues with the SSU series (stratosphere). The AMSU introduction issue has been swept under the rug, but there is a history of subtle instrumental problems affecting long-term series, such as total solar irradiance, where the baffling was not quite what it was thought to be.

  8. I went to a seminar on climate quality reanalyses, and they showed a few examples of discontinuities in reanalysis timeseries caused by the introduction of new satellite data.

The obvious explanation is that it is due to a new instrument on a new satellite, but since more than two satellites have contributed to the time series, I am surprised to see only one discontinuity.

  9. Stupid question: where would one get unadjusted temp data for the major datasets?

    [Response: Look here.]

10. Are you applying the model from the earlier paper but with new values for volcanoes, solar etc., or have you recalculated the model with updated data but the same methodology? If the latter, then it would be interesting to see how well the earlier model performed (effectively using data prior to the paper’s publication as a training set, and then estimating temperature since then).

[Response: The model is re-calculated. NASA data change as new reports arrive from meteorological stations, they’re now using the updated sea-surface temperature data set ERSSTv3 rather than the old ERSSTv2, and previously I didn’t even do RATPAC data.

Perhaps I’ll train the model using only the data up to 2000, then apply it to the last 15 years for validation.]
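That train-then-validate idea might be sketched like this, again with synthetic data in place of the real temperature and exogenous series:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic monthly temperature with a trend and an ENSO-like influence
n = 540
t = np.arange(n) / 12.0
mei = rng.normal(size=n)
temp = 0.017 * t + 0.08 * mei + rng.normal(0, 0.1, n)

# Fit the model only on the first 30 "years"...
train = t < 30
X = np.column_stack([np.ones(n), t, mei])
coef, *_ = np.linalg.lstsq(X[train], temp[train], rcond=None)

# ...then see how well it predicts the held-out final 15 years
pred = X[~train] @ coef
rmse = np.sqrt(np.mean((temp[~train] - pred) ** 2))
print("out-of-sample RMSE: %.3f deg C" % rmse)
```

If the out-of-sample errors stay comparable to the in-sample residual scatter, the model generalizes; a blow-up after the split point would suggest overfitting or a change in the system.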

11. I posted a link to this on Euan Mearns’s site. He doesn’t seem to think much of it. Here is one of his comments, but he also wrote, in an earlier comment, that there was so much wrong with your post he didn’t know where to start. As this post seems quite reasonable to me, I wonder why he sees something so different? He thinks that the residual in your third chart can be explained by the increasing trend of the AMO, which you don’t mention.

    • The AMO isn’t going to stop greenhouse gases from trapping more of the sun’s heat in the atmosphere and oceans. That ‘cyclomania’ or ‘stadium wave’ stuff is pure crank material.

      • You have to remember, Rob, that these people don’t accept the science of greenhouse gases so they will look at all sorts of things to explain what we’re experiencing. Euan has found that, for him, the AMO explains most of the warming once other natural factors are removed. To me, his argument is weak but I haven’t yet worked out the details. I was hoping someone with more science skills might spot obvious flaws. But he’s also fairly insulting about this post of Tamino’s and needs to be put right.

      • Not sure if this got through the first time, so apologies for any repetition.

        Yeah, Rob, I think his argument is weak but he seems to think that the AMO accounts for all of the warming after removing other natural factors. I haven’t quite pinpointed his error yet and was hoping someone here might be able to home in on obvious flaws quickly.

        You have to remember that Euan and some of his readers don’t accept the science of greenhouse gases, which is why he’s desperately searching for other causes of the warming.

Ask him if the AMO explains SLR, because according to my understanding of physics, the only things that explain it are 1) warming water, 2) melting land ice, and 3) a fair amount of pumped groundwater. 66- or 33-year Atlantic cycles don’t explain 100+ years of essentially continuous sea level rise.

      • 60-year cycle

        Personally, I think the AMO is pretty much just along for the ride and not really a cause of anything.

        [Response: Evidence for 60-year cycle is flimsy at best. But there have certainly been fluctuations.]

      • You might be able to argue that the ocean is exchanging energy with the atmosphere, which is causing air temperature to rise, but you can’t argue that ocean cycles are causing the ocean to warm. That energy has to come from somewhere. That is, unless there is a Rube Goldberg style set of feedbacks (clouds, whatever) that change the energy balance. It would be laughable to suggest something like this, while simultaneously discounting the effect of increased GHG concentrations.

      • “It would be laughable to suggest something like this, while simultaneously discounting the effect of increased GHG concentrations.”

        Um, ‘would be?’ ;-)

      • Good point, all. I don’t think Euan is listening, though. Not surprisingly.

  12. Tamino – have you considered including a nonlinear ENSO (MEI) dependence? The fact that it seems not to be quite capturing peaks and troughs suggests perhaps adding a quadratic term to the linear one in the multivariate fit would help…

    [Response: Yes, I have indeed. I strongly suspect that, as you say, the effect is nonlinear. One might also include a cubic term, since an extreme el Nino might behave differently than an extreme la Nina.

    It is work, and takes time away from other things … but sounds like time well spent.]
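Adding nonlinear ENSO terms is just a matter of extra columns in the design matrix; a sketch with synthetic data that has a mildly quadratic ENSO response built in:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic temperature with a mildly nonlinear (quadratic) ENSO response
n = 540
t = np.arange(n) / 12.0
mei = rng.normal(size=n)
temp = 0.017 * t + 0.08 * mei + 0.02 * mei**2 + rng.normal(0, 0.1, n)

X_lin = np.column_stack([np.ones(n), t, mei])    # linear MEI only
X_nl = np.column_stack([X_lin, mei**2, mei**3])  # + quadratic and cubic terms

def resid_std(X):
    # Residual scatter after regressing temperature on the given columns
    coef, *_ = np.linalg.lstsq(X, temp, rcond=None)
    return (temp - X @ coef).std()

print("linear MEI: %.4f   nonlinear MEI: %.4f"
      % (resid_std(X_lin), resid_std(X_nl)))
```

The nonlinear fit's residual scatter can never be larger than the linear fit's (the models are nested); whether the improvement is statistically meaningful is the part that takes real work.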

13. On Climate Etc. I’ve been trashing the satellites for a few years. Their surface anomaly stuff is junk. My personal theory is that a convenient balance between El Nino and La Nina throughout much of the satellite era has given them the appearance of being reasonably accurate… accidental accuracy. With the prolonged period of ENSO-neutral and La Nina dominance, they’ve gone completely haywire.

    I’m just guessing as a lay person, so it will be very interesting to learn the actual reason for the divergence.

  14. Excellent post overall Tamino, with some interesting comments, but am I the only old fogey here? I’m no Latin scholar, but “data” is a plural word (well, except for the android Commander Data on Star Trek: The Next Generation, played by Brent Spiner)! “Datum” is singular, “data” plural. The satellite data (& perhaps the radiosonde data) ARE unreliable, especially after 2000.

    Roy Spencer (UAH) released his version 6 data on 2015 Apr 28, with the stated intention of having “open” peer review. I’m shocked, SHOCKED (given Spencer’s history), that they show less warming recently than the previous version 5.6. Perhaps someone with more expertise than I can figure out what happened in 2000, & clear up the seemingly even worse v6 from UAH.

15. Just poking around in the data, and LOL, I found my best-ever cherry pick: compute the trend since 2011.
It’s statistically “significant” (>2 sigma above 0).
(Well, it is for small cheaty values of “significant”.)
(Not only did I cherry-pick the trend, I cherry-picked a period of apparently low noise!)
It’s also been warming really fast: 0.622 deg.C/decade (+/- 0.560).

  16. Nothing to add, except many thanks for your answer !