In light of Anthony Watts’ latest idiocy comparing GISS and UAH temperature data without bothering to put them on the same scale, I thought it might be interesting to compare different temperature records … but let’s do it right, eh?
There are 5 major sources of global temperature data which are most often referred to. Three of them are estimates of surface temperature, from NASA GISS (Goddard Institute for Space Studies), HadCRU (Hadley Centre/Climatic Research Unit in the U.K.), and NCDC (National Climatic Data Center). The other two are estimates of lower-troposphere temperature, from RSS (Remote Sensing Systems) and UAH (Univ. of Alabama in Huntsville). All are anomaly data, i.e., the difference between the temperature at a given time and that during a baseline period. They tend not to use the same baseline: for GISS the baseline is 1951 to 1980, for HadCRU it's 1961 to 1990, for NCDC it's the 20th century, and for the satellite data the baseline is 1979 to 1999. Since they use different baselines, they're on different scales, i.e., each has its own zero point for temperature. To compare them, we need to use the same zero point for all.
They also don’t cover the same time span. HadCRU starts first, beginning in 1850. GISS and NCDC both start in 1880. And the satellite data don’t start until December 1978 (for UAH) or January 1979 (for RSS). You can download the data yourself; links to data sources are found here. Since the satellite data don’t extend back far enough, RSS and UAH cannot be put on the baseline used by any of the surface-temperature data sets. Of course we can only compare the data sets over the times when they all have data, and to put them all on the same scale, we’ll have to use a baseline period which is covered by all.
All 5 data sets cover the period 1979 to the present, although HadCRU hasn’t yet published their results for November 2010, so the period of common coverage is January 1979 to October 2010. Here’s the raw data (each with its own baseline period):
We can smooth the month-to-month fluctuations by using a 12-month moving average filter, giving this:
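The smoothing step can be sketched in a few lines of Python. This is a minimal illustration with made-up data standing in for the actual anomaly series; the variable names are mine, not from any of the data sets:

```python
import numpy as np

# A minimal sketch of 12-month smoothing, with synthetic data standing in
# for the actual anomaly series (names here are illustrative only).
rng = np.random.default_rng(0)
months = np.arange(120)                                      # ten years, monthly
anoms = 0.002 * months + rng.normal(0.0, 0.1, months.size)   # trend + noise

window = 12
kernel = np.ones(window) / window
# mode="valid" keeps only averages built from 12 real months, so the
# smoothed series is 11 points shorter than the input.
smoothed = np.convolve(anoms, kernel, mode="valid")
```

Each point of the smoothed series is simply the average of the 12 months ending there, which suppresses the annual-scale noise without distorting the longer-term behavior much.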
Now we can plainly see that they all tell much the same story, in terms of the temperature changes over time. Which is what anomalies are meant to reveal.
But we can also see the result of using different baselines. GISS and NCDC are nearly the same, because the average for the GISS baseline period (1951-1980) is nearly the same as that for the NCDC baseline (20th century). HadCRUT3v is lower because its baseline period (1961-1990) is warmer (so it’s compared to a warmer reference). Finally, the satellite data sets are lowest because their baseline period is warmest.
For proper comparison we should choose a common baseline for all five data sets. I chose the period 1980.0 to 2000.0, which gives this for the monthly data:
and this for the 12-month running means:
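The re-baselining itself is a one-step operation: subtract from each series its own mean over the common period. Here's a sketch with two synthetic stand-in series (the numbers are invented, not the actual GISS or UAH values):

```python
import numpy as np

# A sketch of moving two anomaly series onto a common 1980.0-2000.0
# baseline; `times`, `giss`, and `uah` are synthetic stand-ins.
times = 1979.0 + np.arange(384) / 12.0       # monthly, 1979 through 2010
giss = 0.017 * (times - 1979.0) + 0.10       # "GISS-like" series, invented
uah = 0.014 * (times - 1979.0) - 0.30        # "UAH-like" series, invented

common = (times >= 1980.0) & (times < 2000.0)
# Subtracting each series' mean over the common period moves its zero
# point there, making the series directly comparable.
giss_rb = giss - giss[common].mean()
uah_rb = uah - uah[common].mean()
```

After this step both series average zero over 1980.0–2000.0, so any remaining offset between them is a real difference, not a baseline artifact.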
Note that now the different data sets are in much closer numerical agreement. They all show warming during the coverage period, and they all show fluctuations superimposed on the warming trend. But the satellite data sets show greater fluctuations, especially during el Niño events (e.g., 1998) and la Niña events (e.g., 2008), and during the coolings associated with the volcanic eruptions of El Chichón in the early 1980s and Mt. Pinatubo in the early 1990s.
Therefore the most prominent pattern in the data appears to be that which is shared by all: an overall warming trend, with warming in response to el Niño and cooling in response to la Niña and volcanic eruptions. The 2nd-most prominent pattern appears to be the difference between the satellite data sets (RSS and UAH) and the surface-temperature data sets (GISS, HadCRUT3v, and NCDC).
We can test that idea by performing a principal components analysis of these data sets. The 1st principal component accounts for 90% of the variance of the data, so it dominates the fluctuations. It turns out to be nearly equal to the average of all five data sets, and the signal associated with it (the 1st empirical orthogonal function or EOF) is, just as we expected, the warming-with-fluctuations which is common to all (I’ve scaled it so that it’s on a “temperature” scale):
All 5 data sets agree: the globe is warming.
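The PCA step can be sketched as follows, assuming the five re-baselined series are columns of a months-by-5 array. The data here are synthetic (a shared warming-plus-fluctuation signal with independent noise per series), built only to show the mechanics:

```python
import numpy as np

# A sketch of principal components analysis on co-registered anomaly
# series; the data are synthetic stand-ins for the five real series.
rng = np.random.default_rng(1)
n = 384
shared = np.linspace(0.0, 0.6, n) + 0.2 * np.sin(np.arange(n) / 6.0)
data = np.column_stack([shared + rng.normal(0.0, 0.05, n) for _ in range(5)])

centered = data - data.mean(axis=0)
# SVD of the centered data matrix gives the principal components directly.
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
var_frac = s**2 / np.sum(s**2)   # fraction of variance per PC
loadings_pc1 = Vt[0]             # PC#1 loadings (one coefficient per series)
pc1 = U[:, 0] * s[0]             # PC#1 time series (the EOF amplitude)
```

Because the shared signal dominates, PC#1 explains the bulk of the variance and its loadings are nearly equal across series, which is why it comes out close to the simple average of all five.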
The 2nd principal component accounts for 7% of the total variance, which is most of the remainder after accounting for the 1st principal component, and confirms our intuition that the 2nd-most prominent pattern is the difference between satellite and surface-temperature data. Here’s the actual 2nd principal component vector (the “loadings”):
Note that the satellite data sets have positive coefficients while the surface-temperature data sets have negative coefficients. Hence the EOF associated with this PC is very similar to the difference between the satellite average and the surface-temperature average, and looks like this:
We can compare that to what results from subtracting the average of surface temperature estimates from the average of satellite measurements:
The biggest difference between the satellite-minus-surface data and PC#2 is that PC#2 shows an additional downward trend. This is mainly because one of the satellite data sets (UAH) shows an overall trend which is decidedly less than that of the other data sets.
We can plainly see the highs during the 1998 and 2010 el Niño events, and the lows during the 2008 la Niña as well as the volcanic coolings in the early 1980s and early 1990s. This indicates that the satellite data (i.e., the lower-troposphere temperature) respond more strongly than the surface temperature to the influence of el Niño/la Niña and to volcanic eruptions.
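The satellite-minus-surface comparison can be sketched like this, with synthetic stand-ins in which the "satellite" columns get amplified fluctuations, mimicking the stronger troposphere response:

```python
import numpy as np

# A sketch of differencing the group averages; all series are invented.
rng = np.random.default_rng(2)
n = 384
fluct = 0.2 * np.sin(np.arange(n) / 8.0)     # stand-in for ENSO wiggles
trend = np.linspace(0.0, 0.5, n)
sat = np.column_stack([trend + 1.5 * fluct + rng.normal(0, 0.05, n)
                       for _ in range(2)])   # amplified fluctuations
sfc = np.column_stack([trend + fluct + rng.normal(0, 0.03, n)
                       for _ in range(3)])

# The common trend cancels in the difference; what remains is mostly the
# extra fluctuation amplitude in the satellite group.
diff = sat.mean(axis=1) - sfc.mean(axis=1)
```

In this toy setup the difference series is dominated by the fluctuation term, just as the real satellite-minus-surface series is dominated by the el Niño/la Niña and volcanic episodes.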
An interesting result is that for PC#5:
Although it accounts for the least total variance of the data (a mere 0.3%), it shows fluctuations which suggest an annual cycle. Its presence is confirmed by a Fourier analysis of PC#5:
We see a peak at frequency 1 cycle/yr (period 1 yr) together with its harmonics at 2, 3, and 4 cycles/yr. So not only is there an annual cycle in PC#5, its form is not simply sinusoidal. We can see the cycle’s shape by making a folded plot (a.k.a. “phase diagram”), graphing temperature not as a function of time but as a function of phase, i.e., time of year (as is customary, I’ve plotted two full cycles of phase):
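Both steps, the Fourier check and the folding, can be sketched as follows. The series here is synthetic, built with a non-sinusoidal annual cycle (a fundamental plus one harmonic) and noise:

```python
import numpy as np

# A sketch of the annual-cycle check: an FFT to find the dominant
# frequency, then folding on a 1-year period. Synthetic data only.
rng = np.random.default_rng(3)
n = 384                                     # 32 years of monthly data
t = np.arange(n) / 12.0                     # time in years
series = (0.10 * np.sin(2.0 * np.pi * t)
          + 0.04 * np.sin(4.0 * np.pi * t)  # 2 cycles/yr harmonic
          + rng.normal(0.0, 0.02, n))

freqs = np.fft.rfftfreq(n, d=1.0 / 12.0)    # frequencies in cycles per year
amps = np.abs(np.fft.rfft(series)) / n
peak_freq = freqs[1:][np.argmax(amps[1:])]  # skip the zero-frequency term

# Fold on the annual period: average all Januaries, all Februaries, etc.
folded = series.reshape(-1, 12).mean(axis=0)
```

The FFT peak lands at 1 cycle/yr, and the harmonic shows up as a secondary peak at 2 cycles/yr; the 12-point folded series is the average cycle shape.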
Here is the actual principal components vector (the “loadings”):
All but 2 of the coefficients are very small, so PC#5 turns out to be mainly the difference between NCDC and HadCRUT3v. Hence we see that their difference shows an annual cycle, because during this time span NCDC is warmer in winter and cooler in summer than HadCRUT3v, although there’s also a “dip” in January-February compared to December and March.
This illustrates that although the choice of baseline period makes no difference when computing the trend (i.e., the rate of global warming), it does make a difference when estimating the annual (seasonal) cycle. Computing anomalies not only sets the “zero point” of temperature to the baseline average, it also removes the annual cycle from the data. But what it removes is the average annual cycle during the baseline period. If the annual cycle changes, then the difference between the “present” and “baseline” annual cycles remains: a “residual” annual cycle. PC#5 shows that the residual annual cycles in NCDC and HadCRUT3v are different, which is why a difference in annual-cycle “remnants” shows up in PC#5.
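A small made-up example makes the residual-cycle effect concrete: the seasonal cycle below grows after year 10, but the anomalies subtract each calendar month's average over a years-1-to-10 baseline, so only the baseline-era cycle is removed:

```python
import numpy as np

# A sketch of a "residual" annual cycle with synthetic data: the cycle's
# amplitude changes partway through, the baseline climatology doesn't.
n_years = 20
months = np.arange(n_years * 12)
amp = np.where(months < 120, 1.0, 1.2)           # cycle grows after year 10
temps = amp * np.sin(2.0 * np.pi * months / 12.0)

baseline = temps[:120].reshape(-1, 12).mean(axis=0)  # per-month climatology
anoms = temps - np.tile(baseline, n_years)

# Baseline era: cycle fully removed. Later era: a residual annual cycle
# remains, here with amplitude 1.2 - 1.0 = 0.2.
resid = anoms[120:].reshape(-1, 12).mean(axis=0)
```

The anomalies are exactly zero during the baseline era, but a cycle of amplitude 0.2 survives afterward; two data sets with different residual cycles will show an annual cycle in their difference, which is what PC#5 picked up.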
A point of much interest is the trend, i.e., the warming rate, shown by each series. We can compute them for each data series separately, and also compute uncertainty levels for those estimates (corrected for the influence of autocorrelation; confidence intervals are 2-sigma):
They’re all close, all within each other’s confidence intervals, and they’re all definitely positive (warming). However, the UAH trend estimate is visibly lower than that of the others — if any of the series should be called the “odd man out,” it’s the UAH data.
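A trend estimate with an autocorrelation-corrected 2-sigma interval can be sketched as follows. The AR(1) effective-sample-size correction used here is one standard approach; the post doesn't spell out its exact method, and the data are synthetic:

```python
import numpy as np

# A sketch of a least-squares trend with a 2-sigma confidence interval
# inflated for autocorrelation via an AR(1) effective-sample-size factor.
rng = np.random.default_rng(4)
n = 384
t = np.arange(n) / 12.0                      # time in years
y = 0.017 * t + rng.normal(0.0, 0.1, n)      # synthetic anomalies (deg C)

slope, intercept = np.polyfit(t, y, 1)
resid = y - (slope * t + intercept)
rho = np.corrcoef(resid[:-1], resid[1:])[0, 1]   # lag-1 autocorrelation

# Ordinary least-squares standard error of the slope ...
se_ols = np.sqrt(resid.var(ddof=2) / np.sum((t - t.mean()) ** 2))
# ... inflated because autocorrelated residuals carry fewer independent
# data points: n_eff = n * (1 - rho) / (1 + rho).
se_corr = se_ols * np.sqrt((1.0 + rho) / (1.0 - rho))

ci = (slope - 2.0 * se_corr, slope + 2.0 * se_corr)  # 2-sigma interval
```

With strongly autocorrelated residuals (as monthly temperature data have), the corrected interval can be several times wider than the naive one, which is why the correction matters when comparing trends.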
For some reason “the Blackboard” has an obsession with trends over the most recent 10-year period. Here they are (plotted in blue), compared to the trend over the entire time span common to all data sets (plotted in red):
None of the 10-year trends is “statistically significant,” but that’s only because the uncertainties are so large: 10 years isn’t long enough to determine the warming trend with sufficient precision. Note that for each data set, the full-sample (about 30 years) trend is within the confidence interval of the 10-year trend, so there’s no evidence, from any of the data sets, that the trend over the last decade is different from the modern global warming trend.
When one compares the different global temperature data sets correctly, one result emerges more strongly than any other: that they agree. This puts the lie (yes, lie) to claims of “fraud” by climate scientists to rig the surface temperature data.
And what do all the data sets agree on? Mainly this: global warming.
Here are the data, for their period of overlap, as an Excel file: