A recent post on RealClimate discusses how easy it is to get fooled when analyzing data. It focuses on two papers that critique other papers which go astray in assessing the relationship between solar activity and climate. I’d like to discuss one of those critiques, Legras et al. 2010, A critical look at solar-climate relationships from long temperature series (Climate of the Past, 6, 745-758, doi:10.5194/cp-6-745-2010).
The authors respond to Le Mouël et al. (2010, A solar pattern in the longest temperature series from three stations in Europe, J. Atmos. Solar-Terr. Phys., 72, 62–76, doi:10.1016/j.jastp.2009.10.009; referred to as LMKC in the quotes that follow) and its companion paper by Kossobokov et al. (2010, A statistically significant signature of multi-decadal solar activity changes in atmospheric temperatures at three European stations, J. Atmos. Solar-Terr. Phys., 72, 595–606, doi:10.1016/j.jastp.2010.02.016; referred to as KLMC). Those works report a strong correlation between solar activity and temperature in three very long European temperature records, from Praha (Czech Republic), Bologna (Italy), and Uccle (Belgium). The source data for these studies come from the European Climate Assessment & Dataset Project (ECA&D).
Legras et al. show convincingly that the correlations are spurious, for three reasons. First, the correlations fail to take into account other, known factors which influence temperature. These include volcanic eruptions and man-made greenhouse gases. Especially when the other factors themselves correlate with indices of solar activity, a naive analysis which omits one factor can confuse the effect of competing influences. Unfortunately such a mistake is all too common; as Legras et al. state:
It has been demonstrated that a correlation analysis which takes only one cause into account can lead to a spurious attribution (e.g., Scafetta and West, 2006a,b, 2007 criticized by Benestad and Schmidt, 2009).
They further note:
Multiple causes should also be considered when studying other individual forcings. Indeed, several authors showed that a correct evaluation of the climatic impact of the 1991 Pinatubo eruption should account for the global temperature modulation by ENSO (Soden et al., 2002; Robock, 2003; Hansen et al., 2005).
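The point about omitted factors can be illustrated with a small simulation. This is only a sketch on synthetic numbers, not an analysis of any real record: the variable names, the 0.7 coupling between the "solar" index and the "other" forcing, and the noise levels are all assumptions chosen for illustration. Temperature here is built to depend only on the other forcing, yet the naive solar-temperature correlation comes out large; controlling for the other forcing (a first-order partial correlation) removes it.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

random.seed(0)
n = 5000

# Hypothetical "other" forcing (think volcanic or greenhouse), standardized.
other = [random.gauss(0, 1) for _ in range(n)]

# Hypothetical solar index that happens to correlate with the other forcing.
solar = [0.7 * g + math.sqrt(1 - 0.7 ** 2) * random.gauss(0, 1) for g in other]

# Temperature responds ONLY to the other forcing, plus noise. No solar term.
temp = [1.0 * g + 0.5 * random.gauss(0, 1) for g in other]

naive = pearson(solar, temp)
controlled = partial_corr(solar, temp, other)

print(f"naive solar-temperature correlation:  {naive:.2f}")
print(f"after controlling for other forcing:  {controlled:.2f}")
```

A single-cause analysis would read the first number as a solar signal, even though none was put into the data.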
I’ve seen this sort of problem before. It’s routine for those who deny the reality of man-made global warming to claim that the “Dalton minimum” of sunspot counts from 1790 to 1830 caused a dramatic drop in temperature. This led to what “nexus 6” titled “The worst climate science paper ever of all time anywhere,” referring to Archibald 2006, Solar Cycles 24 and 25 and Predicted Climate Response, Energy and Environment, 17, 29-38 (are you surprised it appears in E&E?).
In fact it’s easily shown that any decline is limited to a very brief period, and is clearly due to volcanic activity, including the Tambora eruption, an explosion so violent it was heard from over a thousand miles away. The issue of the temperature effect of the Dalton minimum (DM) was studied in Wagner & Zorita 2005 (Climate Dynamics, DOI 10.1007/s00382-005-0029-0), which concluded that “… solar and the CO2 variability have not contributed in an important way to the formation of a thermal DM, and that volcanic forcing was largely responsible for reduced temperatures in the DM.”
The second major problem is that the data used are inhomogeneous. Unfortunately, it appears that Le Mouël et al. failed to check the metadata for the stations they used. As Legras et al. point out:
Praha, Uccle and Bologna are among the “suspect” stations (see Table 1) extracted from ECA&D website and this contradicts LMKC who mention those series as having the highest quality code in ECA&D, for both TN and TX temperatures … The lack of homogeneity over the 20th century is, of course, a serious warning about the quality of data over the 18th and 19th centuries.
A particularly glaring example is the daily high temperature (TX) from Bologna, which shows an artifact that is obvious from visual inspection of the data:
and is even clearer when one smooths the data on a 1-year time scale:
Legras et al. investigated the metadata for this station and discovered:
After checking Bologna metadata, we found that in 1867 the “Grindel” thermometer, in Reaumur scale, read four times a day at 9 a.m., 12 p.m., 3 p.m. and 9 p.m., was changed to a “Milano” min-max thermometer in Celsius scale. In 1881, the thermometers were relocated to a different place (Michele Brunetti, CNR-ISAC, personal communication, quoting Capra, 1939). The 1867 change is listed in the ECA&D metadata (http://eca.knmi.nl/utils/stationdetail.php?stationid=169). LMKC state that:
It is a general observation that one must trust the way ancient observers did the maximum they thought possible to obtain the best data
Of course, the observers did the best they could, but this does not ensure that the data are reliable.
Legras et al. conclude that:
As a conclusion, Le Mouel et al. (2008, 2009) and LMKC results are all based on raw inhomogeneous data, contrary to their claims. This is quite striking, since information about data quality is easily available from ECA&D website.
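To see why smoothing on a 1-year time scale makes an instrument change stand out, here is a minimal sketch on synthetic daily data. It is not the Bologna record: the 20-year length, the break date, the 1.5-degree offset, the seasonal amplitude, and the noise level are all hypothetical values picked for illustration. A 365-day running mean averages out the annual cycle, so an artificial jump that is hard to see in the raw daily values shows up as a step.

```python
import math
import random

def running_mean(values, window):
    """Simple moving average; returns one value per full window."""
    total = sum(values[:window])
    out = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        out.append(total / window)
    return out

random.seed(1)
n_days = 365 * 20
break_day = 365 * 10   # hypothetical date of an instrument change
offset = 1.5           # hypothetical artificial jump, in degrees

daily = []
for d in range(n_days):
    seasonal = 10.0 * math.sin(2 * math.pi * d / 365)  # annual cycle
    noise = random.gauss(0, 3)                         # day-to-day weather
    step = offset if d >= break_day else 0.0           # the inhomogeneity
    daily.append(15.0 + seasonal + noise + step)

smooth = running_mean(daily, 365)

# smooth[i] averages days i .. i+364, so these slices avoid windows
# that straddle the break.
before = smooth[: break_day - 365]
after = smooth[break_day:]

mean_before = sum(before) / len(before)
mean_after = sum(after) / len(after)
print(f"smoothed mean before break: {mean_before:.2f}")
print(f"smoothed mean after break:  {mean_after:.2f}")
print(f"estimated jump:             {mean_after - mean_before:.2f}")
```

A jump like this has nothing to do with climate, but any correlation analysis run on the raw series will treat it as if it did.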
The third major problem is that the statistical analysis wasn’t done properly. There are a number of small problems, but the major difficulty is that Le Mouël et al. failed to account for autocorrelation properly. As Legras et al. state, “The first step in LMKC is to calculate a 21-day moving average of the temperatures over the whole dataset.” It’s then necessary to compute the variance of those quantities, but as Legras et al. say, “… LMKC assume that the daily temperature fluctuations are independent.” But that is not the case. Here, for instance, is the autocorrelation function of temperature anomaly from Praha (which is typical of all three stations):
Legras et al. estimate that the actual variance is nine times as great as estimated by Le Mouël et al., meaning that the standard errors in their correlations (which scale as the square root of the variance) should be three times larger than stated.
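To get a feel for how much autocorrelation can inflate the variance of a 21-day average, here is a rough sketch. It assumes the daily anomalies follow an AR(1) process with lag-1 autocorrelation phi, which is my simplification for illustration and not the model used by Legras et al.; the phi values are likewise illustrative, and the sketch makes no attempt to reproduce their factor of nine. For a mean of n points with autocorrelation rho_k at lag k, the variance relative to the independent case is 1 + 2 * sum over k of (1 - k/n) * rho_k, and for AR(1) rho_k = phi^k.

```python
import math

def variance_inflation(window, phi):
    """Variance of a `window`-point mean of an AR(1) series, divided by the
    variance the same mean would have if the points were independent."""
    return 1.0 + 2.0 * sum((1.0 - k / window) * phi ** k
                           for k in range(1, window))

window = 21  # the 21-day moving average described above
for phi in (0.0, 0.5, 0.8, 0.9):  # illustrative lag-1 autocorrelations
    factor = variance_inflation(window, phi)
    print(f"phi={phi:.1f}  variance x{factor:.2f}  "
          f"standard error x{math.sqrt(factor):.2f}")
```

With phi = 0 the factor is exactly 1, which is the independence assumption; any positive persistence pushes it up, and the standard error grows as the square root of that factor.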
And in their principal graph comparing correlations to their probable errors, the error ranges plotted by Le Mouël et al. are only 1-sigma. That makes them “68% confidence” error ranges, hardly suitable for evaluating statistical significance. Legras et al. argue for a 90% confidence interval, which would be 1.65 times larger. Personally, I’d want 95% confidence, which would be 1.96 times larger. And when taking into account that the standard errors are three times greater than estimated, the actual confidence interval should be nearly six times bigger (3 × 1.96 ≈ 5.9) than shown by Le Mouël et al.
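The arithmetic behind those multipliers is easy to check. The sketch below assumes normally distributed errors and takes the factor of nine in variance from Legras et al. as given; the function name is mine.

```python
from statistics import NormalDist

def two_sided_z(confidence):
    """Half-width, in standard errors, of a two-sided normal interval."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

# Variance nine times larger means standard error three times larger.
se_inflation = 9 ** 0.5

for confidence in (0.90, 0.95):
    z = two_sided_z(confidence)
    widening = z * se_inflation
    print(f"{confidence:.0%} interval: {z:.2f} sigma, "
          f"or {widening:.2f} times the plotted 1-sigma bars")
```

The 1-sigma bars themselves correspond to `two_sided_z(0.6827)`, which is very close to 1.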
The bottom line is well stated by Legras et al.:
Our unequivocal conclusion is that the results of LMKC and KLMC, claiming a strong signature of solar influence on local temperature records, with amplitude up to 1 °C, are invalid.
I quite agree.