I only recently found out that Albert Parker/Alberto Boretti and C.D. Ollier published a “Discussion” of my paper with Patrick Brown about the analysis of sea level time series. You can get your own copy of their paper here.

What, you may wonder, is Parker/Boretti and Ollier’s method to look for acceleration of sea level? It’s contained in equations (1) and (2) of their paper. The first is this:

$$\mathrm{SLR}_{j,k} \;=\; \frac{\sum_{i=j}^{k} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=j}^{k} (x_i - \bar{x})^{2}}$$

(where the means are taken over data points *j* through *k*).
This is just the OLS (ordinary least squares) estimate of the slope from fitting a straight line to data points *j* through *k*. It always worries me when scientific papers give this equation, because it suggests that the authors are so unfamiliar with something as simple as least squares regression that they feel the need to “explain” it to the reader.
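For readers who want to follow along, here is a minimal Python sketch of that OLS slope (the function and variable names are mine, not from their paper):

```python
def ols_slope(x, y):
    """Ordinary least squares estimate of the slope of y against x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # slope = covariance(x, y) / variance(x)
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

def slr(x, y, j, k):
    """SLR_{j,k}: slope of the line fit to data points j through k (1-based)."""
    return ols_slope(x[j - 1:k], y[j - 1:k])
```

For perfectly linear data this recovers the slope exactly; the trouble, as we'll see, is what moment of time one assigns it to.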

But at least it’s correct. What Parker/Boretti and Ollier say about it is not:

If j=1 (the oldest record) and k is variable, then

SLR_{1,k} is the estimation of the relative rate of rise at the time x_{k}.

Let’s find out, shall we?

Consider some artificial data, defined by the simple equation *y* = *t*², for 101 *t* values from 0 to 1. We already know what the slope is at each moment (d*y*/d*t* = 2*t*) and what the acceleration is (d²*y*/d*t*² = 2 for all times), and we won’t add any noise, so the method should diagnose the slope without error.

What does Parker/Boretti and Ollier’s method say the slope is at each moment of time? This:

The solid line is their method’s slope estimate, the dashed lines are the 95% confidence limits according to OLS. Now let me add a red line showing the true slope at each moment of time:

Wha-wha-wha-what??? The Parker/Boretti and Ollier method got it wrong. At every moment of time. In spite of the fact that the data are noise-free.
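You can check the factor-of-two failure yourself. Here is a quick sketch (my code, not theirs; I use *t* for time) that fits from the first data point through point *k* and assigns the slope to the time *t*_{k} as they do:

```python
def ols_slope(t, y):
    """Ordinary least squares estimate of the slope of y against t."""
    n = len(t)
    tbar = sum(t) / n
    ybar = sum(y) / n
    num = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y))
    den = sum((ti - tbar) ** 2 for ti in t)
    return num / den

# noise-free artificial data: y = t^2, 101 points on [0, 1]
t = [i / 100 for i in range(101)]
y = [ti ** 2 for ti in t]

# their method: fit from the first point through point k,
# then assign the slope to the endpoint time t[k]
for k in (25, 50, 100):
    estimate = ols_slope(t[:k + 1], y[:k + 1])
    true_slope = 2 * t[k]   # true slope of y = t^2 at time t[k]
    print(t[k], estimate, true_slope)   # the estimate is half the true slope
```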

That’s because their method does **not** give “the estimation of the relative rate of rise **at the time x_{k}**” (remember they’re using *x* to represent the time, while I tend to use *t*). It gives an estimate of the *average* rate of rise over the entire time interval, not “at the time x_{k}.” If we had to assign this to a moment of time rather than an interval, the only logical choice would be the mid-point of the time interval (for evenly spaced data), not the end (the time x_{k}).
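For the noise-free quadratic data above, assigning each fitted slope to the mid-point of its interval does recover the true rate exactly; here is a quick Python check (my code, not theirs):

```python
def ols_slope(t, y):
    """Ordinary least squares estimate of the slope of y against t."""
    n = len(t)
    tbar = sum(t) / n
    ybar = sum(y) / n
    num = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y))
    den = sum((ti - tbar) ** 2 for ti in t)
    return num / den

# noise-free artificial data: y = t^2 on [0, 1]
t = [i / 100 for i in range(101)]
y = [ti ** 2 for ti in t]

for k in (25, 50, 100):
    estimate = ols_slope(t[:k + 1], y[:k + 1])
    midpoint = t[k] / 2          # mid-point of the interval [0, t_k]
    # the true slope of y = t^2 at the mid-point is 2 * (t_k / 2)
    assert abs(estimate - 2 * midpoint) < 1e-9
```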

If we use their method to estimate the acceleration, we get the same value at all times — as we should — but it’s the **wrong** value. Their method estimates the acceleration is equal to 1, when we already know it’s equal to 2.
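This is easy to verify: differencing successive slope estimates that have been assigned to endpoint times yields an “acceleration” of 1 for the quadratic data, when the true value is 2. A sketch (my code, not theirs):

```python
def ols_slope(t, y):
    """Ordinary least squares estimate of the slope of y against t."""
    n = len(t)
    tbar = sum(t) / n
    ybar = sum(y) / n
    num = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y))
    den = sum((ti - tbar) ** 2 for ti in t)
    return num / den

# noise-free artificial data: y = t^2 on [0, 1]
t = [i / 100 for i in range(101)]
y = [ti ** 2 for ti in t]

# slope estimates assigned (wrongly) to the endpoint times t[k]
slopes = [ols_slope(t[:k + 1], y[:k + 1]) for k in range(2, 101)]
times = t[2:]

# "acceleration" as the rate of change of those endpoint-assigned slopes
accel = [(s2 - s1) / (u2 - u1)
         for s1, s2, u1, u2 in zip(slopes, slopes[1:], times, times[1:])]
print(min(accel), max(accel))   # close to 1 everywhere; the truth is 2
```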

How bad is it, when their method of estimating slope and acceleration can’t get it right even for noise-free data?

Parker/Boretti and Ollier don’t actually have a lot to say about my and Brown’s paper. My guess is: the math was over their heads. Way over.

This blog is made possible by readers like you; join others by donating at Peaseblossom’s Closet.

Engineers (heavily represented among the authors and reviewers) and stats! Good Lord!

Reading the review history — http://www.sciencedomain.org/review-history.php?iid=837&id=33&aid=8091 — shows the rather appallingly low level of peer review this paper got.

They’re pretty good at confusing themselves.

I read a paper once that was comparing two measurements, and used the term “bias removed root mean square error (BRRMSE)”. They also discussed mean bias error (MBE), and root mean square error (RMSE). On close examination, I realized their BRRMSE was what I knew as the standard deviation of the differences. The total error (RMSE) is a combination of the mean error (MBE) and the standard deviation of the errors. They either didn’t know that, or invented a new term to make it sound more impressive. Didn’t matter which – the paper was junk.
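That decomposition (total error combines the mean bias and the scatter about it) is easy to verify numerically; here is a sketch with made-up numbers, using the population standard deviation:

```python
import math

# made-up differences between a "measurement" and a reference
errors = [0.3, -0.1, 0.4, 0.2, -0.2, 0.6]

n = len(errors)
mbe = sum(errors) / n                                      # mean bias error
rmse = math.sqrt(sum(e * e for e in errors) / n)           # root mean square error
sd = math.sqrt(sum((e - mbe) ** 2 for e in errors) / n)    # std. dev. of the differences

# the decomposition: RMSE^2 = MBE^2 + SD^2
assert abs(rmse ** 2 - (mbe ** 2 + sd ** 2)) < 1e-12
print(mbe, sd, rmse)
```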

Good stuff.

Not sure who “Boretti” is but wondered if you’d contacted them about their errors? I was thinking of sending an email to the email contact on the paper (Parker), to point him to here except that they might immediately put up the wall when confronted with the thinly veiled insult at the end. :(

[Response: Read this.]

Yes, you’d take marks off a high school physics student if they confused the average velocity with the velocity at a particular time.

Reading their paper, I would have interpreted their definition as meaning ‘the average trend up to that point in time’ – I don’t get any sense of them meaning it to be an instantaneous value. I think they can be accused of a certain carelessness in language but not much more than that. Your illustrative example shows the radical differences that can exist between instantaneous values and average values, but that is often the case.

As presented in their paper, the formula does, as you say, indicate a certain lack of familiarity with OLS, which is a rather basic concept in statistics. As an engineer myself, I am intrigued about engineers and statistics and their apparent lack of familiarity with its foundational concepts. Have others experienced this, and why is it so? How come engineers haven’t had widespread recourse to statistics over the years? I studied it in college, so it is part of the basic engineering syllabus, and have, latterly at least, used it significantly in my work.

[Response: I will again quote from their paper: “If j=1 (the oldest record) and k is variable, then SLR_{1,k} is the estimation of the relative rate of rise at the time x_{k}.” It’s not ambiguous.]

Tamino,

Formally, you are, of course, correct: they left out the word ‘average’ in their definition (it would have been formally correct had it read ‘is the estimation of the average relative rate of rise…’). The absence of a single word in a definition would not normally lead me to conclude that the authors completely misunderstood the concept (the more likely explanation being careless writing/editing – i.e. a good example of Occam’s razor). You have validly pointed out an error in their paper but, in my opinion, have jumped a little too quickly to the conclusion that they therefore didn’t understand the formula.

[Response: Go read their equation (2) and you’ll see that they used that very mistaken notion to estimate acceleration. Look at their graphs, where they plot values at the wrong times precisely because of their lack of understanding. I won’t bother to quote their paper *again* — it’s unambiguous. You’re just being obtuse.]

Tamino – I’m getting a tad confused about what you are criticising here. I can assure you I am not being deliberately obtuse so maybe it’s an innate characteristic of mine! :(

You say above “If we had to assign this to a moment of time rather than an interval, the only logical choice would be the mid-point of the time interval (for evenly spaced data), not the end (the time xk).” However, in your jointly authored paper with Rahmstorf ‘Global Temperature Evolution 1979-2010’ published in 2011, you use the same technique and graph it in figure 6 not at the ‘mid-points’ as you say but at the start points of each interval (you refer to what they call ‘acceleration’ as a ‘change in the warming rate’ – effectively the same thing).

[Response: The purpose of that graph was to compare the trend rates from various start years up to the (at the time) present. No claim was made that they were rates at a particular time at all; they were clearly average rates over intervals. Note that the time axis isn’t labelled as “time” (to suggest a momentary estimate) but as “Start Year.” Note also that the figure caption says “Trend rates from various starting years to December 2010,” which rather makes it clear they’re not assigned to any single moment. Since the end time was the same for all cases, it seemed logical to plot them as a function of the start time. We showed that whatever start time one chose, the average rate from then until the “present” was within the uncertainty limits of the others.]

I was also surprised to see you using this technique of the change of linear trends, as your article above seems to be very critical of it – definition error or not. The choice of start, end or mid-point to graph the different trends seems quite arbitrary and equally valid to me – but mid-point feels symmetrically the right one to me.

[Response: I’m not critical of comparing linear trends to find changes, I’m critical of the fact that they bungled the job. And they did so in most amateurish fashion (not like some of the subtler problems I discussed here). But there are big problems with Parker & Ollier’s way. By assigning their trend rate estimate to the final time, they then use differences to estimate accelerations — that’s why their method, applied to the above artificial data, gives only half the true rate (if they had assigned it to the mid-point time, at least it would get it right for the noise-free artificial data). The other huge problem is that when you estimate the rate “at x_{n}” the way they did, the estimates for x_{n} and x_{n-1} are not independent — the later estimate is based on n data values, n-1 of which are the basis for the previous estimate.]

From Parker’s 2012 paper (ref 8), which is one of the main two precursors to this paper, Parker is very clear that it is a method of analysis of linear trends of different lengths. The erroneous definition in his latest paper doesn’t appear. This leads me to believe that the mistake here is not confusion on the authors’ part re trends versus instantaneous values but writing/editing shortcomings. In that paper he does again graph the linear trends at the end points of the windows – the opposite choice of yours and Rahmstorf’s.

[Response: Look again at their equation (3). That’s where they use the assignment of a rate to a particular time (in this case the endpoint) to compute acceleration. And if you want to compare linear trend rates statistically, they should be based on independent data. It’s not a writing problem; their confusion makes their method unable to get the acceleration right even with noise-free artificial data following a perfect quadratic. And I’ll repeat that Rahmstorf and I did not plot linear trend rates “at a particular time,” we plotted trend rates over time spans, and made that clear. We did it to anticipate the denier argument that if you start your trend at such-and-such year there’s a “pause.” No choice does. It’s fine for that purpose, but I admit that for testing departure from linearity it’s a poor choice.]

That’s hilarious. So much for peer review. I’m a shitty statistician, but even I know that the OLS value over a segment does not represent the value at the end of the segment.

I’ll say one thing for Parker/Boretti: Over the last few years he has been finding some of the most obscure third-rate journals I’d never heard of. Perhaps this is actually a good sign; it means that real journals are getting better at rejecting his trash.

There seems to be an unending supply of dodgy open-access journals that just want authors’ money (and are embarrassing the whole field).