Open Mind

PCA part 5: Non-Centered PCA, and Multiple Regressions

March 19, 2008

Time for more equations. Yay!!! (or Boo!, depending on your perspective)


We’ve seen that principal components analysis (PCA) identifies the directions in a multi-dimensional data space along which data show the greatest variation. It’s customary for variation to be computed from the origin of this data space, which is chosen as the average value of the data, in which case we replace our original data values x_a(t) (for a=1,2,...,n data series) by their zeroed values

z_a(t) = x_a(t) - (1/N) \sum_{(times~t)} x_a(t).

These zeroed data series have the useful property that their average values over time are all zero

(1/N) \sum_{(times~t)} z_a(t) = 0 for all a.

We may even scale these values so that all the series have the same variance. Within our data space, the directions of maximum variation lie parallel to the eigenvectors of the matrix formed by averaging, over time, the products of data values:

Z_{ab} = (1/N) \sum_{(times~t)} z_a(t) z_b(t).
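For readers who like to see the bookkeeping, here is a minimal sketch of the centered calculation just described, using NumPy. The data are synthetic random numbers, and the function name centered_pca is an assumption made for illustration; nothing here comes from the MBH98 data or code.

```python
import numpy as np

def centered_pca(x):
    """Centered PCA of x with shape (n_series, N_times).

    Hypothetical helper for illustration only.
    """
    N = x.shape[1]
    z = x - x.mean(axis=1, keepdims=True)   # z_a(t) = x_a(t) - time mean of x_a
    Z = z @ z.T / N                         # Z_ab = (1/N) sum_t z_a(t) z_b(t)
    evals, evecs = np.linalg.eigh(Z)        # eigh handles symmetric matrices, ascending order
    order = np.argsort(evals)[::-1]         # re-order so the strongest comes first
    return z, Z, evals[order], evecs[:, order]

rng = np.random.default_rng(0)
# Synthetic data: 5 series, 200 times, each series with a different nonzero mean.
x = rng.normal(size=(5, 200)) + np.arange(5)[:, None]
z, Z, evals, evecs = centered_pca(x)

print(z.mean(axis=1))   # time averages of the zeroed series
print(evals)            # eigenvalues, strongest first
```

The columns of evecs play the role of the eigenvectors \nu_a, and evals holds the corresponding eigenvalues \lambda.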


So far, we’ve just reiterated what we’ve seen before. Now for something new: what happens if the origin of our multidimensional space is chosen other than the average values of the data series, or equivalently if the average values of the data series are not equal to zero? Within the framework defined by the zeroed series, let’s choose a particular point c_a as our origin. We can think of the quantities c_a as the coordinates of our new origin, or as a vector from the old origin to the new one. In any case, using a different origin is equivalent to replacing the zeroed data series with new series:

y_a(t) = z_a(t) - c_a.

Note that the origin c_a has no dependence on t. We then replace the original correlation/covariance/whatever matrix Z_{ab} with a new one

Y_{ab} = (1/N) \sum_{(times~t)} y_a(t) y_b(t).

We can gain some insight by noting that

Y_{ab} = (1/N) \sum_{(times~t)} [z_a(t)-c_a] [z_b(t)-c_b]
= (1/N) \sum_{(times~t)} [z_a(t) z_b(t) - z_a(t) c_b - c_a z_b(t) + c_a c_b].

Because c_a has no dependence on t, and (1/N) \sum_{(times~t)} z_a(t) = 0, the 2nd and 3rd terms give zero. So we have that

Y_{ab} = (1/N) \sum_{(times~t)} [z_a(t) z_b(t) + c_a c_b] = Z_{ab} + c_a c_b.

Therefore using the point c_a as a new origin is equivalent to adding the matrix c_a c_b to the original matrix Z_{ab} to get our new matrix Y_{ab} (the matrix c_a c_b is the “outer product” of the vector c_a with itself).
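The identity Y_{ab} = Z_{ab} + c_a c_b is easy to check numerically. The sketch below does so on synthetic data with an arbitrary, randomly chosen origin vector c; both are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 4, 500
x = rng.normal(size=(n, N))                 # synthetic data series
z = x - x.mean(axis=1, keepdims=True)       # zeroed series
c = rng.normal(size=n)                      # arbitrary new origin, independent of t

y = z - c[:, None]                          # y_a(t) = z_a(t) - c_a
Z = z @ z.T / N
Y = y @ y.T / N

# The cross terms vanish because the zeroed series average to zero.
max_diff = np.max(np.abs(Y - (Z + np.outer(c, c))))
print(max_diff)                             # rounding error only
```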

Now suppose that we already know the eigenvectors of the original matrix Z_{ab}. Let \nu_a be one of the eigenvectors, with eigenvalue \lambda, and assume that we’ve normalized that vector so that its squared length is 1, i.e., \sum_a \nu_a \nu_a = 1. If we multiply our new matrix into this eigenvector, we get

\sum_{b=1}^n Y_{ab} \nu_b = \sum_b [Z_{ab} + c_a c_b] \nu_b = \lambda \nu_a + c_a \sum_b c_b \nu_b.

Now let’s consider two interesting cases. First, suppose that the “old” eigenvector \nu_a is perpendicular to the new-origin vector c_a. In that case, \sum_b c_b \nu_b = 0, and we have

\sum_{b=1}^n Y_{ab} \nu_b = \lambda \nu_a.

Hence the “old” eigenvector is also a “new” eigenvector, i.e., it’s an eigenvector of the new matrix Y_{ab} as well as the old matrix Z_{ab}, and the eigenvalue is unchanged.

Now suppose that the old eigenvector \nu_a is parallel to the new-origin vector c_a. In that case, there’s some constant c such that the new-origin vector is that constant, times the eigenvector, i.e.,

c_a = c \nu_a.

In fact c is just plus or minus the length of the vector from old origin to new. Then we have

\sum_{b=1}^n Y_{ab} \nu_b = \lambda \nu_a + c \nu_a \sum_b c \nu_b \nu_b = \lambda \nu_a + c^2 \nu_a = (\lambda + c^2) \nu_a.

Again the old eigenvector is a new eigenvector, but this time the eigenvalue has changed from \lambda to \lambda + c^2.

If the new-origin vector c_a is parallel to any one of the old eigenvectors, then it’s perpendicular to all the other eigenvectors, because for a real symmetric matrix (like Z_{ab} and Y_{ab}) the eigenvectors can always be chosen to be perpendicular to each other. In this case, switching to the new origin leaves the eigenvectors unchanged, and all the eigenvalues are also unchanged except the one corresponding to the eigenvector parallel to our new-origin vector. If we regard the eigenvectors as principal components, then they remain unchanged, and their strengths (as measured by their total variations from the origin, which happen to equal the eigenvalues) also remain unchanged except for the one parallel to our new-origin vector. That eigenvalue can only increase, never decrease, because it changes by +c^2, and c is real, so its square cannot be negative. The net result is that the principal components remain unchanged, but the strength of one of them increases by c^2.
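The exactly-parallel case can be checked with a few lines of NumPy. This is a sketch under stated assumptions: Z is a random symmetric stand-in for a covariance matrix, and the index k and the length c_len are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 5))
Z = A @ A.T / 5                      # symmetric stand-in for the centered matrix
lam, V = np.linalg.eigh(Z)           # columns of V are unit-length eigenvectors

k = 1                                # which eigenvector the new origin is parallel to
c_len = 3.0                          # the constant c (length of the origin shift)
c = c_len * V[:, k]                  # c_a = c * nu_a
Y = Z + np.outer(c, c)

# Rayleigh quotients give the eigenvalue each old eigenvector has under Y.
new_lam = np.array([V[:, j] @ Y @ V[:, j] for j in range(5)])
# Largest residual of Y v = lambda v over all old eigenvectors.
residual = max(np.linalg.norm(Y @ V[:, j] - new_lam[j] * V[:, j]) for j in range(5))

expected = lam.copy()
expected[k] += c_len ** 2            # only this eigenvalue should move, by c^2
print(residual)
print(new_lam - lam)
```

Every old eigenvector is still an eigenvector of the new matrix, and the only eigenvalue that moves is the one with index k, which grows by c^2.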

If the eigenvector \nu_a is neither parallel nor perpendicular to the new-origin vector c_a, then the situation is more complicated. The directions of the eigenvectors will change, as will their eigenvalues. However, if the new-origin vector is nearly parallel to one of the old eigenvectors, then the new eigenvectors will be nearly the same as the old ones, and the eigenvalues will be nearly the same, except for the one belonging to the eigenvector nearly parallel to our new-origin vector, which will increase by nearly c^2.
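The nearly-parallel case can be explored the same way. In the sketch below the eigenvalues, the 5-degree tilt, and the shift length are all hypothetical numbers chosen for illustration; the new-origin vector is tilted slightly away from the 4th eigenvector toward the 1st.

```python
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))   # random orthonormal basis (columns)
lam = np.array([4.0, 3.0, 2.0, 1.0, 0.5])      # hypothetical "centered" eigenvalues
Z = Q @ np.diag(lam) @ Q.T                     # matrix with those eigenvalues

theta = np.deg2rad(5.0)                        # small tilt away from eigenvector 4
c_len = 3.0
c = c_len * (np.cos(theta) * Q[:, 3] + np.sin(theta) * Q[:, 0])
Y = Z + np.outer(c, c)

new_lam, new_vec = np.linalg.eigh(Y)           # ascending order
top_val = new_lam[-1]                          # strongest new eigenvalue
top_vec = new_vec[:, -1]
alignment = abs(top_vec @ Q[:, 3])             # |cosine| between new #1 and old #4

print(top_val, lam[3] + c_len ** 2)
print(alignment)
```

With these numbers the strongest new eigenvalue lands close to \lambda + c^2 for the old 4th eigenvector, and the strongest new eigenvector points almost along that old eigenvector.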

Having looked at the principal components from the dispute about centered and non-centered PCA in Mann, Bradley, and Hughes 1998 (MBH98), I conclude that this is the case for the non-centered PCA applied to the North American ITRDB (International Tree Ring DataBase) data. The new-origin vector is nearly parallel to PC#4 (the eigenvector with the 4th-strongest eigenvalue) from centered analysis. Hence the eigenvectors are nearly unchanged, as are the eigenvalues, except for the eigenvalue associated with the PC formerly known as #4. Its eigenvalue is increased by nearly c^2, where c is the distance from the old origin (the overall data average) to the new one (the average during the calibration interval).

(“New” PC in red, “old” in blue)

This causes what used to be PC#4 to become PC#1. It does not cause old PC#4 \approx new PC#1 to become “hockey-stick” shaped; it already was. It doesn’t make old PC#4 \approx new PC#1 strong enough to be called “significant” in the sense that it’s demonstrably more than just noise; it already was. It certainly doesn’t make old PC#4 \approx new PC#1 correlate with temperature during the calibration and verification intervals; it already did.

It does increase the strength of old PC#4 \approx new PC#1, so that all the other PCs become relatively weaker. Because of this, the selection rules applied by MBH98 end up selecting only 2 PCs instead of 5. By using non-centered PCA, MBH98 reduce the number of PCs included in the analysis, while ensuring that the retained PCs include the one which has an average during the calibration interval which is notably different from its average outside the calibration interval. That was in fact the reason for their choice; to ensure that the analysis included the PC which gives the most information about differences between the calibration interval and the past. It’s important to note that this doesn’t select for hockey sticks; the difference between the calibration interval and the past could equally well have been one which indicates the past was warmer than the present, or that there had been a steady trend since 1400 (the start time for the MBH98 analysis), or indicated a giant excursion (either positive or negative) in temperature at some time in the past. As it turns out, none of those other cases hold; the eigenvalue with the most difference between past and present happens to look like a hockey stick.
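To see why boosting one eigenvalue makes all the others relatively weaker, consider each PC's share of the total. The sketch below uses hypothetical eigenvalues and the idealized exactly-parallel case; it does not implement the MBH98 selection rules, which are not described here.

```python
import numpy as np

lam_centered = np.array([4.0, 3.0, 2.0, 1.0, 0.5])   # hypothetical centered eigenvalues
c2 = 9.0                                             # hypothetical c^2

lam_noncentered = lam_centered.copy()
lam_noncentered[3] += c2                             # only the old PC#4 is boosted

# Fraction of the total carried by each PC, strongest first.
share_before = lam_centered / lam_centered.sum()
share_after = np.sort(lam_noncentered)[::-1] / lam_noncentered.sum()

print(share_before)
print(share_after)
```

The boosted PC moves to the top of the list, and every other PC's share shrinks because the total has grown by c^2. Any selection rule based on relative strength will therefore tend to retain fewer PCs.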

Therefore in the case of MBH98, the real effect of using non-centered PCA is not to impart a hockey-stick shape to one of the PCs (it was already there) or to move that PC up into the “significant” region (it was already there), it’s to reduce the included PCs from 5 to 2, effectively excluding 3 of the PCs which would have been included by the selection rules if centered PCA had been applied.

This choice didn’t create patterns that didn’t already exist. It didn’t make patterns correlate with temperature that didn’t already do so. It does remove possibly relevant information from the analysis. That too is a valid choice; reducing the number of PCs we use will reduce the impact of noise on the regression, but however the choice is made it may remove relevant information. In my opinion, it might have been better to try both choices, using non-centered then centered PCA, including 2 then 5 PCs, and test which gives the best match between reconstruction and observed data during the verification interval. As has been amply demonstrated, both choices give essentially the same result for the temperature reconstruction: a hockey stick.

The two PCs selected by MBH98 then become two of the proxies used to calibrate a relationship with temperature. They’re not the only ones, in fact for the time span 1400 to the present they’re two of twenty-two proxies available. Interestingly, MBH98 doesn’t restrict itself to a single analysis from 1400 to the 20th century. In fact they do 11 separate analyses, based on 11 different “proxy networks.” More proxies are available as time goes on, so that for example for the time span from 1450 to the 20th century there are 25 proxies used, from 1500 onward there are 28 proxies, etc., up to the time span 1820 onward which includes 100. Because there are so many more proxies available for later time spans, there’s enough data to estimate not only hemispheric temperature changes, but the geographic distribution of temperature changes. Hence much of the analysis of MBH98 amounts to what is called a climate field reconstruction, or CFR. But for the longest time span, beginning in 1400, there are only 22 proxies used, and there’s not enough information to reconstruct geographic patterns; such reconstructions fail verification tests. The hemispheric reconstruction starting in 1400 does pass verification.

And frankly, that’s the real test of whether a reconstruction may be valid. If it passes verification, that’s evidence that the relationship between proxies and temperature is a valid one, and that therefore the reconstruction may well reflect reality. If it fails verification, that’s evidence that the reconstruction does not reflect reality. It has the drawback that the data we set aside for verification must be omitted from calibration; with less data, the calibration is less precise. But without verification, we can’t really test whether or not the reconstruction has a good chance of being correct.
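The calibration/verification idea can be sketched in a few lines. Everything below is an assumption for illustration: the "temperature" and "proxies" are synthetic, the 60/40 split is arbitrary, and comparing RMSE against a simple baseline is just one possible skill measure, not necessarily the statistic used by MBH98.

```python
import numpy as np

rng = np.random.default_rng(4)
n_years, n_proxies = 100, 5
# Synthetic "temperature": a trend plus noise.
temp = np.linspace(-1.0, 1.0, n_years) + 0.2 * rng.normal(size=n_years)
w = np.array([1.0, 0.8, -0.5, 0.6, 0.9])             # hypothetical proxy sensitivities
proxies = np.outer(temp, w) + 0.3 * rng.normal(size=(n_years, n_proxies))

verif = slice(0, 40)       # years withheld from the fit
calib = slice(40, 100)     # years used to fit the regression

def design(p):
    """Add an intercept column to a block of proxy values."""
    return np.column_stack([np.ones(len(p)), p])

# Calibrate: least-squares fit of temperature on proxies, calibration years only.
coef, *_ = np.linalg.lstsq(design(proxies[calib]), temp[calib], rcond=None)

# Verify: predict the withheld years and compare with what was observed.
pred = design(proxies[verif]) @ coef
rmse_model = np.sqrt(np.mean((temp[verif] - pred) ** 2))
# Baseline: guess the calibration-period mean for every withheld year.
rmse_baseline = np.sqrt(np.mean((temp[verif] - temp[calib].mean()) ** 2))

print(rmse_model, rmse_baseline)
```

A fit that beats the baseline on years it never saw has passed a real test; a fit that only looks good on the calibration years has not.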

The recent research of Wahl & Ammann has explored many of the claims about the MBH98 hockey stick, and alternatives suggested by MM, paying close attention to verification. Their conclusion is: choices which pass verification show a hockey stick. Those choices that don’t show a hockey stick fail verification.

Having extolled the virtues, even the necessity, of verification, I’ll now report results for some multiple regressions I did using the 22 proxies in the “1400 proxy network” from MBH98, and some subsets of those 22, without bothering to perform verification. I’ll also regress northern-hemisphere temperature from NASA GISS against those proxies, a different choice from that of MBH98. Hence the calibration interval runs from 1880 to 1980 (when the proxies end), and there’s no verification interval. I’ll also utterly ignore issues such as area-weighting the input data to avoid giving too much emphasis to data-dense areas. This is just exploratory, to have a little fun; when exploring data non-rigorously for a blog post rather than a peer-reviewed publication, I can do it however I want to. Besides, these choices will give the folks at CA plenty to complain about.

First, here’s the result of regressing NH temperature from GISS against all 22 proxies using multiple regression:

fit22.jpg

The fit is excellent — but of course, without verification it’s not possible to say whether it’s truly meaningful or not. Anyway, here’s the reconstruction from 1400 to 1980 using this relationship:

recon22.jpg

And there’s the hockey stick, which is no surprise at all.
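The caveat about an excellent fit without verification can be made concrete. In the sketch below both the "temperature" and the 22 "proxies" are pure random noise (an assumption for illustration, unrelated to the real data), yet the in-sample fit still improves as predictors are added. The helper name r_squared is hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of a least-squares fit of y on X, with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(5)
n_years = 101                         # 1880 to 1980 inclusive
y = rng.normal(size=n_years)          # noise standing in for temperature
X = rng.normal(size=(n_years, 22))    # 22 noise series standing in for proxies

r2_5 = r_squared(X[:, :5], y)         # fit with the first 5 predictors
r2_22 = r_squared(X, y)               # fit with all 22

print(r2_5, r2_22)
```

In-sample R^2 can never go down when predictors are added, which is why a good calibration fit alone says little and a verification step is needed.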

Of the 22 proxies in the “1400 proxy network,” two of them are the PCs selected by MBH98 to summarize the North American ITRDB: proxy #16 is PC#1 and proxy #17 is PC#2. Proxy 16 looks decidedly like a hockey stick (shown is the normalized proxy, which has the same shape):

proxy16.jpg

But it’s not the only one. So too does proxy #9, the “treeline11” proxy (again, normalized):

proxy9.jpg

In addition, there are various linear combinations of proxies which have a hockey-stick shape.

If we omit the two proxies corresponding to the North American ITRDB, and regress temperature against the remaining 20 proxies, we get this:

fit20.jpg

recon20.jpg

Even excluding the disputed PCs, we still get a hockey stick. We can also do a principal components analysis on the 22 proxies in the 1400 network, to reduce further the number of inputs to the regression; for this I did a plain vanilla centered and normalized PCA. This gives 6 PCs which are clearly above the noise level, and regressing temperature against those 6 once again indicates a hockey stick.

fitpc6.jpg

reconpc6.jpg
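The mechanics of that last step (centered and normalized PCA on the proxies, then regression on the leading PCs) can be sketched as follows. The data are random stand-ins, so the resulting "reconstruction" is meaningless; the array shapes merely mirror the 22 proxies, the 1400 to 1980 span, and the 6 retained PCs mentioned above.

```python
import numpy as np

rng = np.random.default_rng(6)
n_years, n_proxies, k = 581, 22, 6          # 1400..1980 inclusive; keep k PCs
proxies = rng.normal(size=(n_years, n_proxies))   # synthetic stand-in for the proxies
temp_calib = rng.normal(size=101)                 # synthetic stand-in for 1880..1980
calib = slice(n_years - 101, n_years)             # last 101 years of the record

# Center and normalize each proxy (zero mean, unit variance over the whole span).
zs = (proxies - proxies.mean(axis=0)) / proxies.std(axis=0)

# PCA via SVD: rows of Vt are the eigenvectors of the correlation matrix.
U, s, Vt = np.linalg.svd(zs, full_matrices=False)
eigvals = s ** 2 / n_years                  # eigenvalues, strongest first
scores = zs @ Vt[:k].T                      # time series of the k leading PCs

# Regress temperature on the k PC series over the calibration years only.
A = np.column_stack([np.ones(n_years), scores])
coef, *_ = np.linalg.lstsq(A[calib], temp_calib, rcond=None)

# Apply the fitted relationship to the whole span to get the reconstruction.
recon = A @ coef
print(eigvals[:k])
print(recon.shape)
```

Using the PCs rather than the raw proxies cuts the number of regression inputs from 22 to k, which is the noise-reduction trade-off discussed earlier.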

Frankly, there are a lot of hockey sticks coming out of this analysis. It’s all non-rigorous, it doesn’t area-weight anything, none of it is subjected to verification. But it surely shows that hockey sticks don’t depend on the tree ring data under dispute; there’s more than one way to stick it to a puck.

For a rigorous analysis of a variety of choices, both in the way the analysis is done and the proxies included in the analysis, one should study the work of Wahl & Ammann; it shows, in my opinion convincingly, that the hockey stick is not only a correct result of MBH98, it’s a genuinely robust result.

Categories: Global Warming · climate change · mathematics

147 responses so far ↓

  • TCO // March 19, 2008 at 3:44 am

    Boo.

  • MrPete // March 19, 2008 at 4:06 am

    Tamino, to help visualize what is actually happening here, how about thumbnail-graphing the contribution of each proxy to the result? From what I’ve seen in the past, it would be an enlightening and important study. After all, isn’t the goal to better understand the data and its physical meaning?

  • nanny_govt_sucks // March 19, 2008 at 4:48 am

    By using non-centered PCA, MBH98 reduce the number of PCs included in the analysis, while ensuring that the retained PCs include the one which has an average during the calibration interval which is notably different from its average outside the calibration interval. That was in fact the reason for their choice;

    Where in their description of “conventional PCA” was this reason and this choice explained?

    to ensure that the analysis included the PC which gives the most information about differences between the calibration interval and the past.

    And that PC was asserted to be a temperature PC, but on what basis?

    [Response: The "reason and choice" was stated by Mike Mann in a post on RealClimate; whether or not it was adequately explained in the original publication I can't say.

    And "that PC" was not "asserted" to be temperature -- but it *turned out* to correlate with temperature during the calibration and verification intervals.]

  • fred // March 19, 2008 at 6:45 am

    Confirms one’s view that the only way to get to the bottom of this is by learning R and running it yourself.

    Even when that’s done, however, we are still missing any account that a programmer could use to write code to do decentered PCA. A recipe, an article, some specific case studies of use other than MBH. Some published account that would tell you (1) what the criteria for use of this method are (2) how you pick the “the origin of our multidimensional space” if you are not using the averages of the data series.

    Until someone points me to that, it’s going to be very hard to do decentered PCA, even with R.

    On its legitimacy, I’m prepared to believe that decentered PCA is a legitimate statistical technique. It seems very unlikely, but this is not my field. What would persuade is evidence that other people in other fields use it, and how. If it’s only ever been used by MBH 10 years ago, well, no, not buying.

  • Nexus 6 // March 19, 2008 at 8:25 am

    Good job as usual, HB. I think you can take it as granted that CA and sundry denialists will ignore most of what you have written, but in particular will not even allow your last sentence to transmit from their eyeballs to their brains.

  • MarkR // March 19, 2008 at 10:02 am

    Tamino says: “the eigenvalue with the most difference between past and present happens to look like a hockey stick”

    But surely the eigenvalue with the most difference…will always look like a Hockey Stick?

    [Response: I don't see that at all. Take *any* shape whatever -- check-mark, straight line (with nonzero slope), giant semicircle -- you can find a time at which the "present" average is very different from the "past" average.]

  • Gavin's Pussycat // March 19, 2008 at 12:17 pm

    Tamino, there is one thing that still puzzles me. In the centered computation, the PC we’re interested in is #4, and it is significant. Apparently #1…#3 are even more significant. But where do they go? One of them may have become #2 in the non-centered solution (?). But where did the others go?

    What I mean is, if they were retained in the centered solution, doesn’t that suggest that they represent something real, physical? And if they are discarded in the non-centered solution, doesn’t that suggest that they do not?

    Or is there a broad “grey zone” of PCs that may or may not be significant, and the selection rule a blunt instrument sensitive to details in the processing?

    Doesn’t this mean that the centered and uncentered solutions, while practically equivalent, are not formally identical?

    [Response: The first 3 PCs from centered analysis become (nearly) #2, 3, and 4 in the non-centered analysis. They're still "significant" in the sense that they're demonstrably not just noise, but *all* the PCs (in fact, all the proxies) may well be not just noise. When you omit PCs (or raw proxies) you reduce the impact of noise, but you risk losing relevant information.

    That's why I think it would have been better to try various combinations of PCs, and use the verification step to decide which to include and which to omit. That doesn't make the MBH98 choice invalid, but perhaps it wasn't the best choice. It's well to bear in mind that there are a *lot* of proxies in the raw data, so many that overfitting will be a real problem. And those PCs we promote to the final analysis become proxies themselves, so it's desirable to limit their number strongly. If you feel that keeping 5 for the N.Amer. ITRDB would be a better choice than keeping 2, I might agree, but I wouldn't say keeping only 2 is "wrong." If omitting the other 3 PCs invalidated the analysis, then it wouldn't have passed verification.]

  • P. Lewis // March 19, 2008 at 1:41 pm

    Refereed papers that have used NC-PCA include:

    Noy-Meir 1973
    Feoli 1977
    Carleton and Maycock 1980
    ter Braak 1983
    Dijksterhuis 1993
    Calenge et al 2005
    Burgherr et al. 2002 (though not apparent from the abstract)
    Calenge 2007

    There are some uses within the banking/financial sector literature, too (e.g. Di Bartolomeo and Marchetti 2004), but at least some of this seems to be bank journals, which though of generally high standard (it seems) are possibly not refereed in the “usual science mode” sense (but am willing to be corrected).

  • TCO // March 19, 2008 at 2:02 pm

    Having a lot of series does not make overfitting a problem. Having more terms in the solution gives a danger of overfitting.

  • Patrick Hadley // March 19, 2008 at 3:32 pm

    Nexus 6 tells us that he expects CA to ignore most of what Tamino has written. Wishful thinking perhaps. I suppose he could be right, but I will be surprised if this does not come in for the usual treatment.

    CA has produced a series of detailed criticisms of the previous offering - the majority of which have been studiously ignored by the “Bulldog”.

    As for Wahl and Amman, that has been exhaustively dealt with on CA, which can be found easily from a link under that name on the front page.

  • James // March 19, 2008 at 3:42 pm

    The new-origin vector is nearly parallel to PC#4 (the eigenvector with the 4th-strongest eigenvalue) from centered analysis.

    Can you expand on this point? Is this:
    a) a result of some choice that MBH made?
    b) an inevitable property of this data set?
    c) serendipity?
    d) other?

    [Response: I guess I'd say it's happenstance (rather than serendipity).]

  • sod // March 19, 2008 at 4:07 pm

    [Response: I don’t see that at all. Take *any* shape whatever — check-mark, straight line (with nonzero slope), giant semicircle — you can find a time at which the “present” average is very different from the “past” average.]

    Tamino, i think people need a simple example.

    (i know that if they get one, they will claim that it is different with climate data, but at least it will kill that stupid “always hockey stick form” argument)

  • L Miller // March 19, 2008 at 4:30 pm

    Tamino says: “the eigenvalue with the most difference between past and present happens to look like a hockey stick”
    But surely the eigenvalue with the most difference…will always look like a Hockey Stick?

    My understanding is as follows. The non-centered approach does favor PC’s that show 20th century warming (or cooling). You can call this “selecting for hockey sticks” if you want but it doesn’t change the fact that we know a priori that there is a warming trend in the 20th century. Selecting for PC’s that show this is therefore not a bad thing because any PC that doesn’t meet this would have been filtered out by the multiple regression regardless of where it stood after the PCA.

    To my thinking, however, favoring PC’s showing warming trends in the 20th century is very different than selecting for hockey sticks. What is being pre-selected for is a 20th century rise in temp. What makes a hockey stick pattern is the *steepness* of that rise in comparison to previous changes. AFAICT the steepness of the various rises and falls are unchanged by the centering, they come purely from the source data. While it is possible that a PC with an overly steep 20th century rise will get promoted these will get culled by the multiple regression just as the PC’s that don’t show a 20th century rise would have.

    I.E. if you are saying “selecting for a warming 20th century” = = “selecting for hockey sticks” then selecting for hockey sticks isn’t a bad idea given what we know going in. It seems to me, however, that the definition of “selecting for hockey sticks” is changing mid stream. It’s going from “favoring PC’s with a 20th century rise/fall” to “automatically showing an abnormally steep rise/fall”. The former statement is true but irrelevant for this problem, the latter would be relevant but isn’t true.

  • JimV // March 19, 2008 at 4:38 pm

    Yay!

    Just from the brief description of PCA given here, independent of the previous posts, I could do my first PCA - but then I had a Linear Algebra class in college, and have done many vibration analyses which used eigenvectors.

    Any good math package (MATLAB, Cygwin, Mathematica, LINPACK, IMSL, etc.) will include functions to give the eigenvectors and eigenvalues of a matrix. For small matrices you can calculate the terms of the characteristic polynomial by hand and solve it to get the eigenvalues, then solve some linear equations to get the eigenvectors. For large matrices like the ones here, you need a math package.

  • None // March 19, 2008 at 8:09 pm

    How can you calibrate proxies to an overall trend though, it makes no sense. You end up with the strongest weighting on proxies which best match the overall trend but don’t even match their own local temperature trend.

    How can things which don’t match their own local temperature trends get an INCREASED weighting - it indicates that they are worse proxies, not better proxies.

    Btw, JimV, Cygwin is a (great) unix emulation library (with assorted recompiled unix utils) for windows, not a maths package.

  • dhogaza // March 19, 2008 at 8:21 pm

    You end up with the strongest weighting on proxies which best match the overall trend but don’t even match their own local temperature trend.

    How can things which don’t match their own local temperature trends get an INCREASED weighting - it indicates that they are worse proxies, not better proxies.

    Has it ever dawned on you that the people over at CA might be WRONG when they make that claim about the BCP proxy?

  • Gavin's Pussycat // March 19, 2008 at 8:26 pm

    James:

    The new-origin vector is nearly parallel to PC#4 (the eigenvector with
    the 4th-strongest eigenvalue) from centered analysis.

    Can you expand on this point? Is this: …

    James, I believe the proper explanation of this is, that the period used for calibration also happens to be the period for which anomalous warming is being established.

    The calibration period having systematically (in all the time series used) a higher temperature than the rest of the millennium produces systematically positive elements in the c_a vector. And this same pattern, of the 20th century being systematically warmer than previous centuries, imprints itself (more or less) on most all the proxy time series used, producing systematically positive elements in the interesting eigenvector too. And the two patterns will be rather similar looking: a proxy that “swings out” a lot during the 20th century will produce both a large element in c_a and a large element in the interesting nu_a. Meaning in vector language that the two vectors are nearly “parallel”.

    Consider the simplified case of all the proxy time series being clones of the same hockey stick. Then the covariance matrix Z_ab = (Z_a Z_b)/N will look like a square containing the same constant for every element. You can easily show (matlab/octave/scilab) that such a matrix has only one nonzero eigenvalue lambda, with the corresponding eigenvector being [1 1 1 ...1 1]' / sqrt(N).

    Now also the elements of c_a will be all the same, and thus the product matrix c_a c_b will look like c^2 times a square matrix with all ones. Again having one nonzero leading eigenvalue c^2 with the same eigenvector as above. The single nonzero eigenvalue for the non-centered problem will then be exactly lambda + c^2, as tamino claims above to be roughly valid for the real-life case.

    It is no iron law that the calibration period should be systematically warmer than the rest of the record. Consider the situation where temperature is constant 1400-1900, but oscillates sinusoidally over the calibration period 1900-1980, so that the average over the calibration period is equal to that over the preceding period. In this case, the computed c_a would be a null vector, but no doubt the PCA method would neatly recover a “sine-bladed hockey stick”… in this weird Gedanken-Experiment, the central and non-central methods would be identical.

  • None // March 19, 2008 at 9:23 pm

    dhogaza:
    “Has it ever dawned on you that the people over at CA might be WRONG when they make that claim about the BCP proxy?”

    My point is one of procedure. How can you logically assign weightings to how well a proxy matches an overall trend when none of the proxies are actually measuring the overall trend but individual and localised climate swings.

    The weightings should be made according to how well the proxies match the LOCAL records, not how well they match the overall trend because there can be no possible way that there is a physical mechanism with which a tree in one part of the country can match the trend of a continent. Giving it a heavy weighting just because it does match the overall trend (when it is completely possible that it can be utterly useless as a local temperature proxy) is quite crazy. It makes no sense.

    It seems to me this is a fundamental problem with this way of generating historical temperatures from these proxies.

  • cthulhu // March 19, 2008 at 10:03 pm

    “How can you logically assign weightings to how well a proxy matches an overall trend when none of the proxies are actually measuring the overall trend but individual and localised climate swings.”

    What if the localised climate swings for a particular region largely mirror the overall trend?

  • Nexus 6 // March 19, 2008 at 10:14 pm

    Patrick, CA has gone off on all sorts of tangents without directly confronting what is in this series of posts. Then they use the rather silly tactic of proposing a million different equally lame criticisms and challenging HB to immediately rebut them all. If he doesn’t, it’s taken as proof that he either supports them or is too awestruck by their superior greatness to have any comeback. The third option, that they’re a load pointless stupid not worth bothering about, isn’t ever considered for some reason.

  • dhogaza // March 19, 2008 at 10:15 pm

    My point is one of procedure. How can you logically assign weightings to how well a proxy matches an overall trend when none of the proxies are actually measuring the overall trend but individual and localised climate swings.

    No, that’s not what you said. You said …

    How can things which don’t match their own local temperature trends…

    Which is quite a different statement entirely. If there’s no correlation between the local variation in temperature and the width of a tree’s growth rings, then obviously it would be (as you claimed) useless as a proxy for temperature.

    The weightings should be made according to how well the proxies match the LOCAL records, not how well they match the overall trend because there can be no possible way that there is a physical mechanism with which a tree in one part of the country can match the trend of a continent

    They’re working with sets of trees, not individual trees, I do believe …

  • None // March 19, 2008 at 10:39 pm

    cthulhu:
    “What if the localised climate swings for a particular region largely mirror the overall trend?”

    Under that happy coincidence you would get a good result despite your science being bad. Bad science giving the right result is still bad science.

  • None // March 19, 2008 at 11:02 pm

    dhogaza:
    “No, that’s not what you said. You said …

    How can things which don’t match their own local temperature trends… ”

    Yes, proxies which don’t match their local temperature trends but match the continental trend will get more heavily weighted, despite being bad proxies and conversely if there’s a very good proxy for an area where the area does not have a good match to the national trend then it gets a very poor weighting.

    I can see why you jumped to the conclusion about what I meant because I stressed that you could get bad proxies being heavily weighted though (if you ignore my first sentence which is pretty clear I think).

    “Which is quite a different statement entirely. If there’s no correlation between the local variation in temperature and the width of a tree’s growth rings, then obviously it would be (as you claimed) useless as a proxy for temperature.”

    Yes it would be a poor proxy for temperature, but it would still get a heavy weighting.

    “They’re working with sets of trees, not individual trees, I do believe …”

    I was aware of that, but can’t see how that makes a difference, they are still fairly localised.

  • Patrick Hadley // March 19, 2008 at 11:20 pm

    Nexus, as far as I can see CA has not issued the Bulldog with a million challenges, just one. He is perfectly free to either reject it or rebut it - but to ignore it does not seem very Bulldog-like.

  • luminous beauty // March 20, 2008 at 12:07 am

    None,

    What you are missing is that these trees, while being highly correlated to their individual locales, have also been demonstrated to be highly sensitive to global signals such as ENSO and volcanic particulates. I wouldn’t be surprised, given their altitude, if they show some correlation to sunspots, as well.

  • DocMartyn // March 20, 2008 at 12:29 am

    If I were to give you 18 different proxies, all based on an underlying artificial temperature profile; would you be able to reconstruct it?

    I would be prepared to generate such a dataset.
    Could your PC method match the real data?

  • dhogaza // March 20, 2008 at 12:38 am

    Nexus, as far as I can see CA has not issued the Bulldog with a million challenges, just one. He is perfectly free to either reject it or rebut it - but to ignore it does not seem very Bulldog-like.

    I see no evidence that minds are open to change at CA, personally I prefer that HB just continue with his informative posts rather than wallow in that morass, or allow them to flood the threads here.

  • dhogaza // March 20, 2008 at 12:47 am

    What you are missing is that these trees, while being highly correlated to their individual locales…

    None seems to miss even that basic point:

    Yes, proxies which don’t match their local temperature trends but match the continental trend will get more heavily weighted…

    If there wasn’t evidence of their matching their local temperature trends THEY WOULDN’T BE PROXIES IN THE FIRST PLACE.

    Now, you can attack the work that has caused a particular proxy to be proposed and generally accepted by the scientific community, but …

    It has NOTHING to do with statistical analysis, which is the supposed subject of this post.

    We really don’t need to turn this thread into another “but the proxies are bad” morass, do we, None?

  • L Miller // March 20, 2008 at 2:27 am

    If I were to give you 18 different proxies, all based on an underlying artificial temperature profile; would you be able to reconstruct it?

    I would be prepared to generate such a dataset.
    Could your PC method match the real data?

    Given that it’s relatively easy to construct a data set that isn’t accurately reflected by the proxies for it that would be a rather silly bet. All you would end up showing is that it’s possible for the proxies to be misleading, but that’s already true of every single law of classical physics and it doesn’t change their standing.

    Science is fundamentally an inductive process, it’s *always* possible that it comes up with an incorrect conclusion. The correct process isn’t to suggest there is a possibility the data is misleading but to find some new evidence that contradicts it.

  • Mike // March 20, 2008 at 3:23 am

    Correct me if I’m wrong here (and I’m no expert), but I thought the real issue was that the non-centered PCA selects for shapes in the dataset where the calibration period is much greater than (or less than) the data outside the calibration period, because they are the ones where the difference between centering on the whole mean versus centering on the calibration period mean makes the most difference.

    The point of these reconstructions is not to show that the temperature has gone up; that is established. The point is to explore the medieval-modern relationship to see if the current warming is unprecedented.

    If your PCA mines for shapes where the data in the calibration period (mostly modern) is greater than outside the calibration (mostly medieval) then your analysis is inherently biased towards concluding that the current temperature is unprecedented compared to medieval times.

    Regardless of whether the verification period statistics are improved or not by non-centered PCA, surely this would be inappropriate for an analysis exploring the medieval modern relationship.

    [Response: First let's clear up what "medieval" means. The medieval period starts in the 5th century and ends in the 15th or 16th, depending on the historian doing the choosing. Some date its end precisely in 1517 with Martin Luther's posting of the 95 theses, launching the Protestant revolution, others choose 1453 with the fall of Constantinople. If one insists on a precise year I prefer 1455 with the printing of the Gutenberg Bible, because in my opinion, without the printing press the renaissance was impossible, with it the renaissance was inevitable. In any case, the time period covered in MBH98 is mostly *not* medieval, and includes only the tail end of the medieval period, that part not generally included in the so-called "medieval warm period." So if you're looking to compare medieval to modern using the time frame in MBH98, you're barking up the wrong tree.

    Let's also be crystal clear that for most of the time period covered by MBH98, certainly from 1600 to the present, it's folly or obstinacy or both to believe that temperature was as warm as at present. As the National Academy of Sciences report on paleoclimate makes clear, the vastly greater number and variety of proxies available, and the many paleotemperature studies in the literature, leave no reasonable doubt that the last several centuries were decidedly cooler than modern times. So it turns out that reconstructions which *don't* show a big difference between the 20th-century mean and the 1400-1900 mean are certainly in error. MBH98 didn't know this at the time, but their choice ends up being fortuitous rather than erroneous.

    Let's say it again: looking for the difference between the 20th century and the preceding time is not "inherently biased towards concluding that the current temperature is unprecedented compared to medieval times." It's looking for the difference between the 20th century and the preceding time. And if MM had done the analysis correctly, they too would have included the same PC showing the hockey stick. There's no bias in favor of "unprecedented warming" in MBH98, but there's a rather obvious bias *against* it in MM. MBH98 don't exclude the 2nd PC from the North American ITRDB, which does *not* show much difference from the 20th century, they include that too. And the North American ITRDB PCs end up being only 2 of 22 proxies included in the 1400 network, most of which aren't based on tree rings and aren't reduced by using PCA.

    Despite all your attempts to discredit it, it's the verification statistics that are the real test of whether or not a reconstruction may be valid. Pass verification: probably valid. Fail verification: probably wrong. And as Wahl & Ammann show, hockey stick: pass. No hockey stick: fail.]

  • Timothy Chase // March 20, 2008 at 3:43 am

    Patrick Hadley wrote:

    Nexus, as far as I can see CA has not issued the Bulldog with a million challenges, just one. He is perfectly free to either reject it or rebut it - but to ignore it does not seem very Bulldog-like.

    Maybe just one challenge, but quite irrelevant to what Tamino was discussing: non-centered PCA. Which, as we have seen, was not Mann’s invention, but something that has been around for decades — even if Steve McI was entirely unfamiliar with it. And Steve McIntyre has written several posts on what Tamino has done in this series — as well as a critique of Wahl and Ammann. But unlike Wahl and Ammann, Steve McIntyre’s post hasn’t seen peer review. I haven’t looked at it though — the quality of what he wrote regarding Tamino’s posts made me go, “Why bother?”

    Tamino could respond, but I suggest he consider the same question.

  • EliRabett // March 20, 2008 at 4:24 am

    Eli has a new take on the BCP’s which is worth thinking about

    Thanks for using the entire interval for calibration. I’ve been advocating that for years.

  • qwertyui // March 20, 2008 at 4:27 am

    There is a good presentation from Finnish Lapland dendro studies; it’s in Finnish, but a couple of figures should be understandable:
    http://lustiag.pp.fi/Puiden_vuosilustot_ja_ilmasto6.pdf

    Surprise: No Hockey Stick!

  • None // March 20, 2008 at 10:52 am

    dhogaza:

    ” What you are missing is that these trees, while being highly correlated to their individual locales…

    None seems to miss even that basic point:”

    I hope luminous beauty was being sarcastic, because you can’t have a one dimensional proxy, ring width in this case, which is highly sensitive to multiple inputs (only one of which is local temperature) and then use it as a proxy for local temperature. Can you ?

    “If there wasn’t evidence of their matching their local temperature trends THEY WOULDN’T BE PROXIES IN THE FIRST PLACE.”

    Well that’s what you’re saying, but look at the Ammann and Wahl link referenced above, page 9: they acknowledge that the bristlecone/foxtail pines are proxies of precipitation rather than temperature, then wave their hands and say that since precipitation in the area is affected by the ENSO, and the ENSO affects regional temperatures, it must also be somehow effective as a temperature proxy.

    (And in this particular example after having been acknowledged as not being a good temperature proxy, they then get a stronger weighting because they do well in matching the regional trend over the calibration period: this is going down the alley that you originally accused me of, but is a side point).

  • EliRabett // March 20, 2008 at 12:30 pm

    WRT using the entire interval for training w/o validation, there should be enough new proxy series to use them for validation, moreover, they can be of shorter duration, since they only have to go back to ~1850 to validate against the instrumental temperature record.

  • kim // March 20, 2008 at 1:29 pm

    Bunnies can hop against the prevailing westerlies, but they shouldn’t be looking at their maps upside down, not just before Easter.
    =========================

  • luminous beauty // March 20, 2008 at 1:38 pm

    W&A, p.9:

    A further aspect of this critique is that the single-bladed hockey stick shape in proxy PC summaries for North America is carried disproportionately by a relatively small subset (15) of proxy records derived from bristlecone/foxtail pines in the western United States, which the authors mention as being subject to question in the literature as local/regional temperature proxies after approximately 1850 (cf. MM05a/b; Hughes and Funkhouser, 2003; MBH99; Graybill and Idso, 1993). It is important to note in this context that, because they employ an eigenvector-based CFR technique, MBH do not claim that all proxies used in their reconstruction are closely related to local-site variations in surface temperature. Rather, they invoke a less restrictive assumption that “whatever combination of local meteorological variables influence the proxy record, they find expression in one or more of the largest-scale patterns of annual climate variability” to which the proxy records are calibrated in the reconstruction process (Mann et al., 2000). MM directly note the link between bristlecone/foxtail pines and precipitation (p. 85, MM05b), which is exactly the kind of large-scale pattern registration that the MBH CFR method takes as axiomatic because large portions of this region are known to have important ENSO/precipitation teleconnections (cf. Cole and Cook, 1998; Rajagopalan et al., 2000). Since ENSO has a strong role in modulating global temperatures as well as affecting regional precipitation patterns, a CFR method of temperature reconstruction can effectively exploit regional ENSO/precipitation teleconnections that register in proxy data.

    So where is the handwaving? BCPs are good regional proxies of temperature prior to 1850 and good proxies of ENSO. It’s a twofer.

  • Tom C // March 20, 2008 at 3:49 pm

    OK luminous beauty -

    Let’s continue this brilliant approach. Let’s update the BCPs to 2008 and repeat the analysis. Fine with you?

  • luminous beauty // March 20, 2008 at 4:08 pm

    Tom,

    And make the same empirical adjustment for divergence?

    Sure.

    We might find a pinch. That would be interesting.

  • dhogaza // March 20, 2008 at 4:23 pm

    Let’s continue this brilliant approach. Let’s update the BCPs to 2008 and repeat the analysis. Fine with you?

    And, of course, you’ll say “there’s no evidence for CO2 fertilization affecting recent BCP growth”, while elsewhere in the deniosphere others will chant “CO2 fertilization is a boon to agriculture”, right?

  • None // March 20, 2008 at 5:02 pm

    luminous, you bolded the section:
    “exactly the kind of large-scale pattern registration that the MBH CFR method takes as axiomatic”

    Do you understand that when something is taken as axiomatic, no attempt is made to prove it ?

    That is the handwaving:
    1) BCP’s are precipitation proxies
    2) Precipitation in the region is affected by (but not dominated by) ENSO
    3) ENSO affects national climate (to some extent)
    4) BCP’s are therefore adequate regional temperature proxies

    This sequence of arguments is treated as axiomatic, you emphasize it, and don’t consider it handwaving ?

    Anyway as Tom C says, more recent tree-ringing seems to be revealing significant divergence of these “teleconnecting” proxies, or complete removal of the anomalous 20th C rise (cf Ababneh).

  • Chris O'Neill // March 20, 2008 at 5:08 pm

    Well that’s what you’re saying, but look at the Ammann and Wahl link referenced above, page 9: they acknowledge that the bristlecone/foxtail pines are proxies of precipitation rather than temperature

    If anyone actually bothers studying MBH98 in detail then they will find out that some of the proxies are explicitly *precipitation* indicators. e.g. Quelccaya Ice Core 2 ice accumulation which is the 6th-highest-weighted independent proxy. As a local temperature proxy its correlation is only -0.059 for 1902-1980.

    There is nothing wrong with using rainfall proxies to infer widespread climatic effects e.g. El Nino.
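
To put the quoted correlation of -0.059 in perspective, here is a small arithmetic sketch. It assumes one value per year over 1902-1980 and independent annual values; neither assumption comes from the comment itself, and the variable names are only illustrative.

```python
import math

r = -0.059                   # correlation quoted above for 1902-1980
n_years = 1980 - 1902 + 1    # 79 values, ASSUMING one value per year

# Fraction of variance in local temperature "explained" by the proxy.
variance_explained = r ** 2

# t statistic for testing zero correlation, ASSUMING independent annual values.
t_stat = r * math.sqrt((n_years - 2) / (1.0 - r ** 2))

print(f"r^2 = {variance_explained:.4f}, t = {t_stat:.2f}")
```

Under those assumptions the proxy accounts for well under one percent of the local temperature variance, which is consistent with the point that it enters the reconstruction as a precipitation indicator, not as a local thermometer.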

  • RomanM // March 20, 2008 at 5:09 pm

    I am disappointed! I expected to see an exposition of the mathematical rationale for the use of otherwise-centered “PCs” of MBH98, but the presentation fell short of doing that. Since none of your commenters addressed the mathematics in the post, let me offer some comments in that direction.

    It’s customary for variation to be computed from the origin of this data space, which is chosen as the average value of the data, in which case we replace our original data values ….
    These zeroed data series have the useful property that their average values over time are all zero…

    The use of the mean of the data is not an arbitrary choice - it is a well-known measure of location. The later mathematical properties of the eigenvalues and eigenvectors which are used in choosing the number of eigenvectors to use in an analysis, etc. assume that this centering was used when the analysis is being done. If the center has been changed, all bets are off until a valid mathematical justification is given explaining the effects of that change. That explanation is what I was looking for.
    Next, you gave a derivation of the matrix which was to be calculated and its relation to the covariance matrix of the data. Your end result was correct (and it was exactly what we had already agreed it to be in the previous PC thread). The new matrix is equal to the covariance matrix + c c’ where c’ is the transpose of the “new” centre vector c. Your notation was confusing. At times, you appear to be using c_a and c_b as scalars and then later as vectors (where as vectors, they were actually the vector and its transpose). However, the end result I will agree is as given. One thing that you did NOT stress however is that the vector c is equal to the difference between the averages of the variables over the short interval and the averages over the entire time period.
    It was at this point that I thought a mathematical case would be made. For example, one thing you might have tried is to use the fact that the eigenvectors of each matrix form an orthogonal basis for the data space and then perhaps you could express one set of eigenvectors as a linear combination of the others and use this to examine their relationship. Instead, you chose two unlikely cases to look at: when an eigenvector is perpendicular to and when it is parallel to the arbitrary center chosen earlier. Now, there is absolutely NO mathematical reason in this situation which would even suggest that this is to be expected in any way, shape, or form. As well, I cannot conceive of a valid physical reason in the data which would naturally lead to either possibility. If it were even approximately true, it would have to be a complete coincidence, an accident! Unfortunately, the rest of the post deteriorates into an arm-waving sequence of “nearly”s without any more genuine mathematical or statistical content. The justification that Prof. Mann’s “PCs” somehow fit into this unlikely coincidence (…then a miracle happens…), includes no solid statistics to show either “nearly” parallel or “nearly” perpendicular to the vector of differences c (remember, these are the differences in the means over the two different time periods - one of them arbitrarily chosen with regard to the original data). Sorry, but I didn’t find that part particularly convincing.
    I do have a question which I will toss out at you. As you know, when you use the eigenvectors from a standard-centering analysis and you form the various principal components, these PCs are orthogonal (that’s the O in EOF). Based on that lack of correlation, we then state what percent of the variation of the data set is accounted for by each component and what percent we get when we combine some of them. What are the same correlations among, say, the first 5 PCs obtained from the other-centered analysis in MBH98? Are they all equal to 0? Are these “PC”s orthogonal?
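
The identity agreed on in this comment (new matrix = covariance matrix + c c') can be checked numerically. The sketch below uses synthetic data; the array shapes, the choice of the last 50 time steps as the "short interval", and all variable names are assumptions made for illustration, not anything taken from MBH98.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for n data series observed at N times (rows = times).
N, n = 200, 4
x = rng.normal(size=(N, n)) + np.linspace(0.0, 1.0, N)[:, None]

full_mean = x.mean(axis=0)          # average over the entire period
short_mean = x[-50:].mean(axis=0)   # average over a HYPOTHETICAL short interval
c = short_mean - full_mean          # the new-origin vector c

z = x - full_mean                   # series zeroed on the full-period mean
y = x - short_mean                  # series zeroed on the short-interval mean

Z = z.T @ z / N                     # Z_ab = (1/N) sum_t z_a(t) z_b(t)
Y = y.T @ y / N                     # Y_ab = (1/N) sum_t y_a(t) y_b(t)

shift_is_c = np.allclose(y, z - c)                    # y_a(t) = z_a(t) - c_a
identity_holds = np.allclose(Y, Z + np.outer(c, c))   # Y = Z + c c'
print(shift_is_c, identity_holds)
```

The cross terms drop out only because the z series average to zero over the full period, which is why c has to be measured from the full-period mean.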

    [Response: Both the centered and non-centered matrices are real and symmetric, therefore Hermitian, therefore their eigenvectors are orthogonal.

    As for the litany of "nearly," it's an attempt to communicate to the non-mathematician what is a genuine mathematical property: that if the deviation of the new-origin vector from one of the eigenvectors is small, so too will be the deviation of the new eigenvectors from old. That's as true as the day is long, and nowhere near controversial.

    As for expressing one set of eigenvectors as a linear combination of the others, both matrices (being real, symmetric, and positive-definite) define a complete set of orthonormal vectors which span the data space, so the eigenvectors of either can always be expressed as linear combinations of the other. They don't even have to be orthogonal, as long as they're linearly independent and as many in number as the dimension of the space, this will always be the case; it's a rather obvious result and doesn't really contribute to understanding.

    Niggling about notation, in a blog post written to communicate to the non-mathematician, seems to miss the point. Entirely. Those "in the know" will have no difficulty whatever divining my meaning; those not in the know won't be helped one iota by slavishly conforming to your opinions about notational conventions (which by the way, I consider to be inferior). As a wise professor once told me, notation should be your servant, not your master.

    Delving into the vagaries of the general case, in which all the eigenvectors change direction, would have led the reader into mathematical complications which likewise don't contribute to understanding the subject at hand. Dealing with the approximation which makes the case at hand easy to comprehend helps greatly. That this simplification applies, you may consider an "unlikely coincidence," but it still applies.

    As for your disappointment in my post, it's probably comparable to my disappointment with your comment.]
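
The properties claimed in this exchange can be illustrated with a short numerical sketch. Everything here is synthetic: the data, the scale factors, the length 0.7 of c, and the 5% tilt are arbitrary choices for illustration. How much the eigenvectors move for a given tilt depends on how well separated the eigenvalues are, so this is one example, not a general bound.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic zeroed data: N times, n series with well-separated variances.
N, n = 500, 4
z = rng.normal(size=(N, n)) * np.array([3.0, 2.0, 1.0, 0.5])
z -= z.mean(axis=0)
Z = z.T @ z / N

# eigh handles real symmetric matrices; columns of V are the eigenvectors,
# returned with eigenvalues in ascending order.
evals_Z, V = np.linalg.eigh(Z)
orthonormal = np.allclose(V.T @ V, np.eye(n))

# Case 1: new origin c exactly parallel to the leading eigenvector of Z.
v1 = V[:, -1]
c = 0.7 * v1
Y = Z + np.outer(c, c)
# Each eigenvector of Z is still an eigenvector of Y ...
same_vectors = np.allclose(Y @ V, V * (evals_Z + (V.T @ c) ** 2))
# ... and only the eigenvalue along c changes, by |c|^2.
shift = np.linalg.eigvalsh(Y)[-1] - evals_Z[-1]

# Case 2: c tilted slightly away from v1 toward the second eigenvector.
c_tilt = 0.7 * (v1 + 0.05 * V[:, -2])
w1 = np.linalg.eigh(Z + np.outer(c_tilt, c_tilt))[1][:, -1]
alignment = abs(w1 @ v1)   # 1.0 would mean the leading direction is unchanged

print(orthonormal, same_vectors, shift, alignment)
```

With a large gap between the first two eigenvalues, as here, the leading eigenvector barely moves; with nearly equal eigenvalues the same tilt could rotate it much more.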

  • Eli Rabett // March 20, 2008 at 5:12 pm

    Actually, you can’t just update one record, you should do them all, which is why 1980 was chosen as a cut off because essentially all of the records chosen went out that far. And Kimmy, Eli is not only not a harmless little bunny, but he has a friend who is very fond of liver.

  • dhogaza // March 20, 2008 at 5:41 pm

    Anyway as Tom C says, more recent tree-ringing seems to be revealing significant divergence of these “teleconnecting” proxies, or complete removal of the anomalous 20th C rise (cf Ababneh).

    And what has changed in the earth’s atmosphere that is thought to contribute to that, hmmm?

  • luminous beauty // March 20, 2008 at 6:05 pm

    None,

    “Do you understand that when something is taken as axiomatic, no attempt is made to prove it ?”

    It is axiomatic in that it is exactly the kind of evidence MBH are looking for. Do you understand how risible it is to demand they prove that what they are looking for is actually what they are looking for?

  • P. Lewis // March 20, 2008 at 6:10 pm

    Re dhogaza’s comment on Tom C

    Well, the easy answer is CO2, of course — which one often sees reference to, e.g. in regard to CO2 fertilisation.

    But I also wonder about the (any) effects of SOx, NHx and NOx.

  • None // March 20, 2008 at 6:37 pm

    dhogaza:
    “And what has changed in the earth’s atmosphere that is thought to contribute to that, hmmm?”

    Ababneh’s results show no abnormal 20th century, as well as the more recent period “divergence”.

    It implies that the original Graybill 20th century measurements were erroneous. (This would obviously destroy their high “teleconnected” weightings, since (even though the precipitation is still “affected” by ENSO) they would no longer match the regional temperature changes)

    Anyway, this is turning into a pointless climate study argument, so i’m not posting any more on this subject. Thanks to Tamino for expounding on an interesting mathematical subject.

  • Phil Howerton // March 20, 2008 at 6:45 pm

    You denigrate McIntyre as an “engineer” and McKitrick as an “economist.” What are your qualifications? And just who in the hell are you?
    You are certainly not a competent statistician. Either that, or you’re not an honest one.

    [Response: Who exactly are you talking to? I don't seem to find the words "engineer" or "economist" anywhere on this thread, except in your comment.

    Your final two sentences tell us a lot more about you, than anybody else here.]

  • RomanM // March 20, 2008 at 7:12 pm

    [Response: Both the centered and non-centered matrices are real and symmetric, therefore Hermitian, therefore their eigenvectors are orthogonal.

    You side-stepped my question. The eigenvectors are NOT the principal components. They are merely the coefficients used in calculating the PCs. The eigenvectors are numerically orthogonal for both matrices. No problem on that. For the actual covariance matrix, the principal components are uncorrelated with each other. Are you then saying that if I use the eigenvectors from the re-centered matrix to calculate the (Mann) "PC"s, these will also be uncorrelated among themselves? It should be a very simple matter for you to check this or perhaps you could indicate where I might find them so I could do it myself.

    As for the litany of “nearly,” it’s an attempt to communicate to the non-mathematician what is a genuine mathematical property: that if the deviation of the new-origin vector from one of the eigenvectors is small, so too will be the deviation of the new eigenvectors from old. That’s as true as the day is long, and nowhere near controversial.
    As for expressing one set of eigenvectors as a linear combination of the others, both matrices (being real, symmetric, and positive-definite) define a complete set of orthonormal vectors which span the data space, so the eigenvectors of either can always be expressed as linear combinations of the other. They don’t even have to be orthogonal, as long as they’re linearly independent and as many in number as the dimension of the space, this will always be the case; it’s a rather obvious result and doesn’t really contribute to understanding.

    I believe you missed the point. I was suggesting that you possibly could USE that result to examine the situation. If you express the e-vectors of the re-centered matrix as linear combinations of the covariance e-vectors, this is one possible approach to get a genuine mathematical handle on what "nearly" can mean AND what the effect of multiplication by the covariance matrix is on that vector. The non-mathematician might be surprised to learn that small changes when you are dealing with matrices and their inverses can have large impacts on the results so arm waving is counterproductive.

    Niggling about notation, in a blog post written to communicate to the non-mathematician, seems to miss the point. Entirely. Those “in the know” will have no difficulty whatever divining my meaning; those not in the know won’t be helped one iota by slavishly conforming to your opinions about notational conventions (which by the way, I consider to be inferior). As a wise professor once told me, notation should be your servant, not your master.

    One of the things I have learned about teaching is that the person with the understanding and the proper background will always figure it out. The one that needs to receive the information in a clear and careful way is the one who has the least clue. I wasn't concerned with or referring to any sort of problem with notation or conventions.

    Delving into the vagaries of the general case, in which all the eigenvectors change direction, would have led the reader into mathematical complications which likewise don’t contribute to understanding the subject at hand. Dealing with the approximation which makes the case at hand easy to comprehend helps greatly. That this simplification applies, you may consider an “unlikely coincidence,” but it still applies.

    Please show me some statistics which can justify this.

    [Response: I misinterpreted your question, but I didn't "sidestep" it. That strikes me as a rather snotty way to put things. It also strikes me as characteristic of your approach.

    If the new-origin vector is parallel to one of the eigenvectors, then the new principal components will be uncorrelated. If it's approximately parallel to one of the eigenvectors, the new principal components will be approximately uncorrelated. This can be derived from first principles; I'll leave it as an exercise for you.

    Your claim that "I wasn’t concerned with or referring to any sort of problem with notation or conventions" contradicts your earlier statement that "Your notation was confusing."

    Rather than "sidestep" your other questions, I think I'll just ignore you.]
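
The "exercise" mentioned in the response can be sketched in the post's notation. This assumes that "uncorrelated" means zero covariance after each PC's own time average is removed, and it writes v^{(j)} for the unit eigenvectors of Y_{ab} and p_j(t) for the resulting PCs; both symbols are introduced here for illustration.

```latex
% PCs built from the non-centered matrix, with Y v^{(j)} = \lambda_j v^{(j)}:
p_j(t) = \sum_a v^{(j)}_a y_a(t), \qquad y_a(t) = z_a(t) - c_a.

% Raw products (no means removed) always vanish for j \ne k:
(1/N) \sum_{(times~t)} p_j(t) p_k(t) = \sum_{a,b} v^{(j)}_a Y_{ab} v^{(k)}_b = \lambda_k \delta_{jk}.

% Time averages of the PCs, using (1/N) \sum_{(times~t)} z_a(t) = 0:
\bar{p}_j = (1/N) \sum_{(times~t)} p_j(t) = - \sum_a v^{(j)}_a c_a.

% Covariance for j \ne k, using Y_{ab} = Z_{ab} + c_a c_b:
(1/N) \sum_{(times~t)} [p_j(t) - \bar{p}_j] [p_k(t) - \bar{p}_k]
  = \sum_{a,b} v^{(j)}_a Z_{ab} v^{(k)}_b
  = - \Big( \sum_a v^{(j)}_a c_a \Big) \Big( \sum_b v^{(k)}_b c_b \Big).
```

If c_a is exactly parallel to one eigenvector, at most one of the two sums in the last line is nonzero for any pair j ≠ k, so every such covariance is zero. If c_a is only approximately parallel, the products are small but not zero, so in this mean-removed sense the non-centered PCs are only approximately uncorrelated.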

  • fred // March 20, 2008 at 7:28 pm

    Timothy Chase writes that decentered PCA has been around for decades, and cites the link provided by P Lewis. I am checking out these as fast as other commitments permit, but have to say, the jury is still out on how many do use the Mannian method, and whether as a body whatever they do amounts to an endorsement of the Mannian method by practice.

    I do still think that if it’s a standard method, there should be readily available recipes for when and how to do it. If there are none, it must be one of the only statistical methods in use since the seventies where there are none.

    So far nothing like this. But maybe it is there, this stuff takes a while to work through. Will report properly when I get through.

  • dhogaza // March 20, 2008 at 7:52 pm

    Timothy Chase writes that decentered PCA has been around for decades, and cites the link provided by P Lewis. I am checking out these as fast as other commitments permit, but have to say, the jury is still out on how many do use the Mannian method, and whether as a body whatever they do amounts to an endorsement of the Mannian method by practice.

    You do understand that a cite of a paper published in 1973 makes it seem silly to call it “Mannian” given that it has been in use for at least 25 years before MBH98?

    And you do realize that doing so is, in the first place, a variant of our old friend, the ad hom attack?

  • Armagh Geddon // March 20, 2008 at 10:06 pm

    Chris O’Neill says: “There is nothing wrong with using rainfall proxies to infer widespread climatic effects e.g. El Nino.”

    My question to you Chris, is can you point me to references that demonstrate the relationship between precipitation and temperature? Is it a linear relationship?

  • P. Lewis // March 20, 2008 at 11:58 pm

    Re Armagh Geddon // March 20, 2008 at 10:06 pm

    Chris O’Neill can defend himself, but what he actually said was:

    Quelccaya Ice Core 2 ice accumulation which is the 6th-highest-weighted independent proxy. As a local temperature proxy its correlation is only -0.059

    which means a diddly squat correlation in this instance.

    What he also actually said was

    There is nothing wrong with using rainfall proxies to infer widespread climatic effects e.g. El Nino.

    And the last time I looked, precipitation patterns formed part of the panoply that is climate.

    Anyway, since you asked:

    Madden and Williams 1978 The Correlation between Temperature and Precipitation in the United States and Europe
    Siegenthaler and Oeschger 1980 Correlation of 18O in precipitation with temperature and altitude
    Zhao and Khalil 1993 The relationship between precipitation and temperature over the contiguous United States
    Kohn and Welker 2004 On the temperature correlation of δ18O in modern precipitation
    Tout 2006 Precipitation–temperature relationships in England and Wales summers
    Yang et al 2006 Correlation between precipitation and temperature variations in the past 300 years recorded in Guliya ice core, China

    And many, many others that may or may not be of relevance. HTH.

  • EliRabett // March 21, 2008 at 12:52 am

    NHx for all practical purposes is nil. NOx makes ozone which is a nasty for trees, SOx makes acid rain which does nasty things to trees. SOx depends a lot on which way the winds are blowing from coal burning plants.

  • luminous beauty // March 21, 2008 at 1:15 am

    El Niño/White Mts.[38N x 111W] —
    rel. temp. precip.

  • P. Lewis // March 21, 2008 at 2:02 am

    Yes Eli. I was perhaps a little obtuse in my terse comment.

    I was aware of the generalities (though I didn’t know NHx was practically nil).

    In particular, I was really thinking of the nitrogen-based compounds (the sulfur-based pollutants I’d largely conclude from western European experience are detrimental anyway, as is the NOx component of acid rain of course — perhaps I shouldn’t have mentioned the SOx; don’t know why I did now).

    I knew that some work had been done with so-called CO2 fertilisation, having read around it in connection with strip-/full-bark BCPs issue. So, I was just wondering aloud that perhaps other aspects of pollution than CO2 need taking account of (but perhaps they have), since I’d not seen it mentioned in my reading with regard to BCP growth in the late 19th century and through the 20th. If the soils in which these strip-bark BCPs grow were nitrogen limited, I was wondering to myself whether there might ultimately be a fertilisation effect from such NOx compounds, especially as I remember reading something x years ago about trees’ need for nitrogen to utilise carbon sources effectively.

  • Tom C // March 21, 2008 at 2:25 am

    luminous beauty wrote:

    Tom,

    And make the same empirical adjustment for divergence?

    Sure.

    We might find a pinch. That would be interesting.

    The whole point is that as you go back past 1900, there is no way of knowing which periods are “divergent” and which are highly correlated to global temperature, ENSO, or whatever bizarre notion someone dreams up. So, the whole thing is an exercise in proving what you want to be true.

    Repeat this over and over, guys: you can’t measure global temperature 1000 years ago to 0.1 C. It’s insane.

  • luminous beauty // March 21, 2008 at 12:18 pm

    Tom,

    I’m sure it is just coincidence that divergence occurs in sync with anthropogenic induced environmental changes.

    And that old uniformity principle: Toss it into the garbage. Magic is a much more fun explanation of the natural world.

    Thanks for the sanity. Now I can fly on a moonbeam.

  • TCO // March 21, 2008 at 2:40 pm

    “whether or not it was adequately explained in the original publication I can’t say.”

    You can too say. You can read it. You have read a lot of literature. You should have an opinion of what is good publication practice. Don’t abstain, make a call. I don’t like it when Steve M. gives Loehle or Watts a pass and I don’t want to see you do the same for Mike. It doesn’t matter if you like him or his politics. All that matters is what’s right.

  • fred // March 21, 2008 at 2:50 pm

    I looked at the references from P Lewis on non-centered or decentered PCA with a view to discovering whether (i) they use the same method as that used by MBH, (ii) whether they provide evidence that the MBH procedure or decentered PCA is a standard recipe, and if so in what circumstances, and how to do it if it is.

    The answer is, haven’t a clue. Yes, some of the summaries do mention something called decentered PCA, but it’s impossible to tell from them exactly what is meant by this. Then, is there a recipe for doing whatever MBH did in their paper? Mann did not reveal his algorithms to the committees, did he? So comparison and verification isn’t possible, at least not for me.

    The one paper which is available in its entirety gives a paragraph on what is being done, but unfortunately it’s beyond me. It’s Calenge 2007. Page 3 gives a description of what is being done if anyone understands it.

    I’ve glanced at the R PCA package. If there is a method in R for this, I haven’t found it yet. This might be a real test - any legitimate statistical method seems to be in R.

    If the question was, have I seen evidence that the MBH methods have been commonly used since the seventies, and are well known statistical techniques the answer is, no. Not yet. Would I expect to find more and easier to grasp evidence if it were true? Yes.

    So, as they say in Scotland, much to the bemusement of their neighbors to the south, the answer so far is ‘not proven’.

  • TCO // March 21, 2008 at 2:56 pm

    P Lewis:

    Your list of papers does not seem to have any that are “short centered” (subtracted an average from part of the time series). They talk about non-centered (no subtraction) versus centered (subtraction of the mean). This was also the case with Tammy’s Jolliffe powerpoint talk. This means we still don’t have a good discussion of the short-centered transform. Certainly we lack a theoretical exposition or review article of the pluses and minuses and treatment of SHORT-CENTERING as a method.

    Note also that the majority of these papers list the non-centering very much upfront (and the one paper that doesn’t still discusses it reasonably within the paper). Some of the papers also run centered and non-centered and compare results. The implication is that people will want to know when the non-centered method is used.

    I WON’T be agnostic. I think Mann should have noted his transform in his publication. It’s unusual and people need to know when unusual methods are used. (And he should have done the analysis to see what difference the unusual method made rather than waiting for McI, Tammy, Wahl and Ammann and himself in RC to figure it out later). I do care about good publication practices. People need to push themselves to do good work when putting things into the literature. And a key thing is being very descriptive. You can read Wilson’s classic book from the 50’s on methods of science to hear this described. Or read Feynman.
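
    To make the three conventions named in this comment concrete, here is a minimal sketch. The helper name `center`, the `window` argument, and the toy series are all hypothetical, chosen only for illustration; this is not the procedure from any of the papers discussed. It shows that a short-centered series is just the fully centered series shifted by a constant, which is the situation the post describes with the offset c_a.

```python
def center(series, window=None):
    """Subtract a mean from a series.

    window=None   -> full centering (mean over the whole series)
    window=(i, j) -> short centering (mean over series[i:j] only)
    """
    ref = series if window is None else series[window[0]:window[1]]
    ref_mean = sum(ref) / len(ref)
    return [x - ref_mean for x in series]


def mean(values):
    return sum(values) / len(values)


# Hypothetical series: flat early on, rising over the last three steps.
x = [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 4.0]

non_centered = list(x)             # no subtraction at all
full = center(x)                   # subtract the mean of all 8 values
short = center(x, window=(5, 8))   # subtract the mean of the last 3 values only

# Offset between the two centered versions; it is the same at every time step.
offset = [f - s for f, s in zip(full, short)]
```

    In this toy case the fully centered series averages to zero over all times, while the short-centered series averages to zero only over its reference window, so its overall mean is nonzero.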

  • TCO // March 21, 2008 at 3:07 pm

    Tammy: Your second figure showing a good match between the 22 proxies and temp is promising. I can see that the proxies seem to go through about 3 wiggles, so it makes me feel better than just a trend to trend match.

    Have one worry. What does it mean that you did “multiple regression”? Did you just take an average of the 22 series and plot them, or is the displayed proxy line some sort of weighted average GEARED to matching the temp curves…and thus the 3-wiggle match less impressive? How many terms are in your multiple regression and how many degrees of freedom remain?

    I understand multiple regression in DOE with variables inside a process (temp, composition, etc versus ice cream viscosity for instance) with the objective of creating an equation (multi-term transfer function) that describes how X1, X2, etc. produce Y, but I don’t understand exactly what it means wrt time series analysis.

    [Response: Multiple regression means to determine, by the method of least squares, the coefficients which give the linear combination of predictors (in this case, the proxies) which best matches the predictand (in this case, temperature). The analysis offers no choices to the analyst; numbers go in, numbers come out, I have no control over what they are (except in selecting the inputs and outputs). It's a tried-and-true method, which isn't specifically related to time series.]
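
    The definition in the response above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, solving the normal equations directly; the function names and the toy data are hypothetical, and the two made-up "proxies" stand in for the 22 real ones. It is not the code used for the figures in the post.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def multiple_regression(predictors, target):
    """Least-squares coefficients [intercept, b1, b2, ...].

    predictors: one list per predictor series, all the same length
    target:     the predictand series
    """
    n = len(target)
    cols = [[1.0] * n] + [list(p) for p in predictors]  # intercept column first
    # Normal equations: (X^T X) b = X^T y
    XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(a * y for a, y in zip(ci, target)) for ci in cols]
    return solve(XtX, Xty)


# Hypothetical data: the target is built exactly as 0.5 + 2*p1 - 1*p2,
# so the fit should recover those three numbers.
p1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
p2 = [1.0, 0.0, 2.0, 1.0, 3.0, 2.0]
temp = [0.5 + 2.0 * a - 1.0 * b for a, b in zip(p1, p2)]

coeffs = multiple_regression([p1, p2], temp)
```

    Once the predictors and the predictand are chosen, the coefficients are fully determined by the least-squares criterion, which is the sense in which the method offers no further choices to the analyst.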

  • Chris O'Neill // March 21, 2008 at 3:11 pm

    whatever bizarre notion someone dreams up

    Synonym for “it’s gone over the top of my head”.

  • TCO // March 21, 2008 at 3:17 pm

    Tammy:

    When you discuss the parallelism and say that the two vectors are “nearly parallel”, can you quantify how parallel they are? And also how do deviations from parallelism affect things? How “near to parallel” does one need to be?
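
    One common way to put a number on "nearly parallel" is the angle between the two vectors, computed from their normalized dot product. The sketch below is only an illustration of that general idea; the function name is hypothetical and nothing here is taken from the post's own calculation.

```python
import math


def angle_between(u, v):
    """Angle in degrees between vectors u and v (0 = parallel, 90 = orthogonal)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))
```

    Because the sign of a principal component is arbitrary, an angle near 180 degrees would count as parallel just as much as one near 0 when comparing a PC with another direction.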

  • TCO // March 21, 2008 at 3:20 pm

    There’s some interesting stuff on the CA site. Unfortunately, SM is so caught up in the “prove other people wrong” “battle” that he cannot discuss one issue at a time. Thus he and you end up talking past each other. And meta-PR debate battle stuff ends up messing with actually pulling things apart and understanding them piece by piece. I’m not saying this out of some “pox on both your houses” desire, but it’s honestly my assessment of the dialogue.

  • TCO // March 21, 2008 at 3:24 pm

    L Miller:

    One of my simple-minded concerns about short-centering promoting 20th century warming (or cooling, but PCs can be flipped) is that we may select out of a grab bag, those series that have this 20th century property, but that they will just average out to zero (noise cancellation) in the before years. (Am I communicating my concern?)

  • TCO // March 21, 2008 at 3:29 pm

    It’s POSSIBLE that local trees might pick up global signals through some teleconnection (perhaps precip), but it is worrisome. It’s one more way that we are essentially feeding a huge mess of records into a churning hopper and fishing around and hoping that we get the right answer. It promotes the concerns of data mining (nothing nefarious, just poor practice). It lowers the suppositions of physical rationale for the proxies. For instance why bother with upper treeline trees or careful site selection or all that sort of stuff that dendros cite as things that they do to help isolate a temperature signal…because those are all related to the local signal and physical rationales for that!

  • TCO // March 21, 2008 at 3:33 pm

    Dghoza: the point on methodology remains. Even if the CA claims that bcps don’t match local records are false, the point is that the TRAINING METHOD used is versus larger areas, not versus local. So the METHOD is agnostic towards how well proxies match their local environment. Instead it fishes for how they match some larger area average. This is troublesomely non-physical. It might work, I guess. But it might also promote fishing for correlations without adequate physical rationale.

  • TCO // March 21, 2008 at 3:54 pm

    Not to get too much into the drama, Tammy, but you should continue to engage with Roman. He is one of the few people who is interested in the math and capable of discussing it. Would be good for the two of you to deal with each other.

    I don’t really think he was that aggressive. About same as you, I’d say, and well less than Dogza.

    But in any case, I urge my fellow skeptics to take the very high road as I have seen that you will allow science comments to come through, but will also get a bit touchy about wrangling. (And my skeptics will get banned/ignored first. So we might as well not fight those wars.)

    Oh…and I don’t think he was making a big deal of the notation error. It was a point in passing…and you got touchy about it. However, for mega-dummies like me…please keep the notation clear. It is hard enough to try to strain one’s brain to follow these things, but if notation is funky that makes it worse. For instance, I hate it when Steve McI makes graphs that lack labels on the axes or have various errors in presentation. Or when he has meandering expositions. The complication of these algorithms is hard enough. Let’s not add to it with sloppy exposition (especially when the point is supposed to be teaching, outreach, general audience, etc. anyhow.)

  • TCO // March 21, 2008 at 4:09 pm

    I guess there is a difference in terminology. I’m used to seeing multiple regressions done with manual decisions on how many terms to use. For instance, let’s say I do a 4 factor, 4 level full factorial data run: x1, x2, x3, x4 for 4 levels of each, giving 256 resultant Ys (assume a single output variable). I could model this with up to 255 terms in the equation. But that’s not good practice. Instead I will typically model it with 4 linear terms and a few interaction or squared terms (which is really an interaction on itself) leaving LOTS of degrees of freedom in reserve. And the decision of how to model this will be done with manual input and trial runs. At least that’s how I’m used to doing it in my little knuckle dragger world.

    but in any case, knowing that you modeled the heck out of that set of proxies in order to GET them to match that line makes me feel a lot less good about them MATCHING that line. I wonder how the simple average would compare.

    [Response: Multiple regression is as close as you'll get to a "standard" fit to the data; frankly, I don't see how you get "modeled the heck out of that set of proxies."]

  • nanny_govt_sucks // March 21, 2008 at 5:40 pm

    It bears repeating, it bears repeating:

    For instance why bother with upper treeline trees or careful site selection or all that sort of stuff that dendros cite as things that they do to help isolate a temperature signal…because those are all related to the local signal and physical rationales for that!

    Thanks TCO. Also, why average a set of tree rings from a particular site for any other purpose than to get a local signal?

  • L Miller // March 21, 2008 at 6:10 pm

    One of my simple-minded concerns about short-centering promoting 20th century warming (or cooling, but PCs can be flipped) is that we may select out of a grab bag, those series that have this 20th century property, but that they will just average out to zero (noise cancellation) in the before years. (Am I communicating my concern?)

    That’s an unavoidable possibility in any statistical analysis. For that matter it’s an unavoidable possibility in any inductive “proof”. (Proof being in quotes because in the truest sense you can’t prove things with inductive logic.) The scientific method employs inductive logic so if you want absolute proof you can’t use science and if you demand absolute proof science will never be able to provide it.

    What this means is that if you don’t like the conclusions you could in fact argue all day long that something isn’t proven, and you would be right. This would still be rightly called obstructionist because it doesn’t help move forward. Whatever possible weaknesses it may have, the BCP data passes the tests we can give it. If we had competing contradictory data those weaknesses would play a big role in deciding which gains acceptance, but we don’t.

    That’s the real key here. The BCP data passed the tests it could be given by passing validation. Our best guess has to be that it’s significant. To counter that you need to come up with either data that passes validation and says something different or some other positive evidence that doesn’t match up with the temperature before the instrumented period. (Even if that happened it would be moot since the BCP data has been superseded. The only reason I can see for even discussing it at this point is that some people want to use it as a strawman. I.E. if we can show this older weaker data is wrong then we have somehow proved the newer stronger data is wrong as well.)

  • RomanM // March 21, 2008 at 6:20 pm

    TCO:

    Tammy:<