HUGE Problem with Jevrejeva et al. Sea Level Data

Global sea level before the satellite era is estimated from individual tide gauge records, which are combined to reconstruct a global average. One of the reconstructions climate deniers love best is from Jevrejeva et al., and the reason is obvious: it gives a result they like.

But there are problems with the methodology used by Jevrejeva et al. HUGE problems.


One that has been much criticized is the “virtual station method” for averaging large groups of station records. Rightly so; but that’s not the problem I want to discuss. The one that hasn’t gotten attention is with their method of aligning station records to form averages.

They use a variant of the first difference method. When I first learned of that method I thought it was ingenious, downright brilliant, probably the best way to align data. Now I think that it’s the worst.

We need to align station records when they have a different baseline. Here’s a sample of monthly data records which follow exactly the same trend, but have a different baseline:

The two records are plotted with different symbols and in different colors. The first (black circles) extends from January 1950 through December 1999, but the second (red triangles) doesn’t begin until 1975.

We’re interested in how the data have changed over time, so the difference in baseline means nothing, and with things like sea level it’s arbitrary anyway. The goal is to shift each record by a constant offset to bring them into the best alignment. Then we can average them, knowing that the inconsistencies caused by (possibly vast) differences in the baseline values are minimized.

The method I use is to align them so that the sum of the squares of differences from each momentary average is minimized. I’ll call this the least squares method, and when I apply it to these data I get an aligned and averaged reconstruction like this:

The data, when aligned and averaged, don’t seem to show any trend, just random fluctuation. And that’s correct, because these are artificial data right out of a random-number generator, plus a constant offset (different baseline) for the record which starts in 1975 rather than 1950.

I can also compute annual averages, again showing no trend (because there isn’t one):

The least squares method works fine, and hasn’t introduced any false trend. That’s good.
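The least squares criterion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the plots: the function name `align_least_squares`, the NaN-padded array layout, the simple alternating iteration, and the choice to pin the first record's offset at zero are all mine. It also assumes at least one record has a value at every time step.

```python
import numpy as np

def align_least_squares(data, n_iter=200):
    """Align records by constant offsets so that the sum of squared
    differences from the average at each time step is minimized.

    data: 2-D array (records x months), NaN where a record has no value.
    Returns (offsets, average); the first record's offset is pinned to 0.
    """
    offsets = np.zeros(data.shape[0])
    for _ in range(n_iter):
        # Average of the offset-corrected records at each time step.
        average = np.nanmean(data - offsets[:, None], axis=0)
        # Given that average, the best offset for a record is its mean residual.
        offsets = np.nanmean(data - average, axis=1)
        # Only relative offsets matter, so fix the first one at zero.
        offsets -= offsets[0]
    average = np.nanmean(data - offsets[:, None], axis=0)
    return offsets, average

# Two artificial records: pure noise, the second with a different baseline.
rng = np.random.default_rng(0)
n_months = 600      # January 1950 through December 1999
start_2 = 300       # the second record begins in January 1975
true_offset = 5.0   # arbitrary baseline difference, chosen for illustration

rec1 = rng.normal(size=n_months)
rec2 = np.full(n_months, np.nan)
rec2[start_2:] = rng.normal(size=n_months - start_2) + true_offset

offsets, average = align_least_squares(np.vstack([rec1, rec2]))
print("estimated offset of second record:", offsets[1])
```

For two records the iteration settles on the mean difference between the records over their overlap, so every overlapping month contributes to the offset estimate.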

What’s the first difference method? We begin by computing, for each record, its first differences. These are just the differences between each value and the preceding value. This transforms each time series into a time series of first differences. The usefulness is that if the data x are some signal v plus some unknown and arbitrary baseline b

x_t = v_t + b,

then the first difference operation eliminates that arbitrary baseline

\Delta x_t = x_t - x_{t-1} = v_t - v_{t-1}.

Now we can average the first differences themselves, and even if the baseline values b are different for different data series, it doesn’t matter because we’ve eliminated those by first-differencing.

Finally, we transform from “averaged first differences” back to “averaged value” by integrating, which is easily accomplished by computing cumulative sums. We have eliminated baseline values, so we don’t even have to choose an offset for each record!
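As a concrete illustration, here is a minimal Python sketch of those three steps (difference, average, cumulative sum). The function name `first_difference_average` and the NaN-padded array layout are assumptions made for this example, and it requires at least one record to have a valid first difference at every step.

```python
import numpy as np

def first_difference_average(data):
    """Combine records by averaging first differences, then integrating.

    data: 2-D array (records x months), NaN where a record has no value.
    Returns the combined series, starting from zero.
    """
    # First differences; NaN wherever either neighbouring value is missing.
    diffs = np.diff(data, axis=1)
    # Average whatever first differences are available at each step.
    mean_diff = np.nanmean(diffs, axis=0)
    # Integrate back to values with a cumulative sum.
    return np.concatenate([[0.0], np.cumsum(mean_diff)])

# Two artificial records: pure noise, the second with a different baseline.
rng = np.random.default_rng(0)
n_months = 600      # January 1950 through December 1999
start_2 = 300       # the second record begins in January 1975

rec1 = rng.normal(size=n_months)
rec2 = np.full(n_months, np.nan)
rec2[start_2:] = rng.normal(size=n_months - start_2) + 5.0

combined = first_difference_average(np.vstack([rec1, rec2]))
print("mean before the second record joins:", combined[:start_2].mean())
print("mean after the second record joins: ", combined[start_2:].mean())
```

Since the artificial data contain no trend, any sizeable gap between the two printed means is an artifact of the method.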

Let’s apply that method to these two artificial data sets. It gives this:

Whoa! What’s going on? That doesn’t look right. If we compute annual averages of this combined data set, it just emphasizes that “it ain’t right.”


What went wrong?

If you “do the math” you’ll discover that we have not avoided applying offsets. Instead, there’s a “hidden assumption” about what they are: that the very first value of the “new” series (the one plotted as red triangles) is exactly equal to the value of the first series at that starting moment.

And that’s a very serious problem. If we use the least squares method, of course our estimated offset will be imperfect because of the noise in the data, but at least its error will be minimized because we take into account all the data. But with the first difference method, the offset, although never computed or even necessarily thought about, is actually assumed based on a single moment of time. All the statistical precision we get from averaging large quantities of data goes out the window: the error in the offset is based on a single value only. And since that includes the random noise in both data series, it’s even bigger than the random noise in a single value.

The offset error which arises is not a bias. It’s random, and its expected value is zero so it’s unbiased. The problem is that its variance is so high; we lose the “power of large numbers” that comes with the least squares method. Hence we can get large offsets when a new data record begins, and although they can go either way (as they can with the least squares method), they’ll probably be a lot bigger than the offset error from least squares.
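A sketch of that math for the simplest case: two records, the second starting at time t_0 and overlapping the first for n months. The superscripts, the noise terms e, and the assumption of independent noise with equal variance \sigma^2 in both records are introduced here for illustration. Write

x^{(1)}_t = v_t + e^{(1)}_t, \qquad x^{(2)}_t = v_t + b + e^{(2)}_t.

Averaging the first differences and taking cumulative sums gives, for t > t_0,

y_t = \frac{1}{2}\left(x^{(1)}_t + x^{(2)}_t\right) + \frac{1}{2}\left(x^{(1)}_{t_0} - x^{(2)}_{t_0}\right) + \text{const},

which is the same as shifting the second record by the implied offset estimate

\hat b_{FD} = x^{(2)}_{t_0} - x^{(1)}_{t_0} = b + e^{(2)}_{t_0} - e^{(1)}_{t_0}, \qquad \mathrm{Var}(\hat b_{FD}) = 2\sigma^2.

For two records, the least squares estimate is the mean difference over the whole overlap,

\hat b_{LS} = \frac{1}{n} \sum_{t \ge t_0} \left(x^{(2)}_t - x^{(1)}_t\right), \qquad \mathrm{Var}(\hat b_{LS}) = \frac{2\sigma^2}{n}.

With the 300 months of overlap in the two-record example above, the standard error of the first-difference offset is about \sqrt{300} \approx 17 times larger than that of the least squares offset.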

If there are multiple records, the offset errors accumulate. Hence, even when they are all unbiased, the variance of the cumulative offsets keeps growing. It’s not good when adding more data makes probable errors get bigger.

Here, for instance, are ten data records, the first starting in 1950, the next in 1955, next in 1960, etc. up to the tenth starting in 1995 (all end at December 1999):

Here’s what the least squares method gives as aligned averages:

That looks good, since all these data sets are nothing but random noise (plus a different offset for each). Here’s what the first-difference method gives:

Here are annual averages of the same:

Not only have we introduced multiple large offset errors, they have conspired to make an apparently very strong, but totally spurious trend. Not good. Every time a new data record enters or exits the set, there’s another offset error added to the mix.
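A single realization could be bad luck, so it is worth repeating the ten-record experiment many times. The sketch below is an illustration with assumed details: the function names, the 200 trials, the baseline scale of 10, and the "drift" measure (mean of the last five years minus mean of the first five) are my choices, not taken from the analysis above. The least squares function solves the normal equations of the same criterion directly, with the first record's offset pinned at zero.

```python
import numpy as np

def first_difference_average(data):
    # Average the available first differences, then integrate.
    mean_diff = np.nanmean(np.diff(data, axis=1), axis=0)
    return np.concatenate([[0.0], np.cumsum(mean_diff)])

def least_squares_average(data):
    # Normal equations for the offsets after eliminating the per-month average.
    mask = (~np.isnan(data)).astype(float)
    count = mask.sum(axis=0)                      # records present each month
    residual = np.nansum(data - np.nanmean(data, axis=0), axis=1)
    normal = np.diag(mask.sum(axis=1)) - (mask / count) @ mask.T
    offsets = np.zeros(data.shape[0])             # first offset pinned at 0
    offsets[1:] = np.linalg.solve(normal[1:, 1:], residual[1:])
    return np.nanmean(data - offsets[:, None], axis=0)

rng = np.random.default_rng(42)
n_months, n_records, n_trials = 600, 10, 200
drift_fd, drift_ls = [], []

for _ in range(n_trials):
    # Ten noise-only records starting 1950, 1955, ..., 1995, each with
    # its own random baseline.
    data = np.full((n_records, n_months), np.nan)
    for i in range(n_records):
        start = 60 * i
        baseline = rng.normal(scale=10.0)
        data[i, start:] = rng.normal(size=n_months - start) + baseline
    for method, store in ((first_difference_average, drift_fd),
                          (least_squares_average, drift_ls)):
        combined = method(data)
        # Spurious change: last five years minus first five years.
        store.append(combined[-60:].mean() - combined[:60].mean())

spread_fd = np.std(drift_fd)
spread_ls = np.std(drift_ls)
print("spread of spurious drift, first differences:", spread_fd)
print("spread of spurious drift, least squares:    ", spread_ls)
```

Both methods are unbiased here, so the comparison of interest is the spread of the spurious drift across trials rather than its mean.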


Let’s try these methods on some real data. I took sea level data from PSMSL (the Permanent Service for Mean Sea Level) for stations in Florida, and identified which station records have at least 600 months’ data. That amounts to seven stations:

But we’ll have offset errors more often than you might expect, because when a data record is missing a value you can’t compute a first-difference. It’s like a station drops out, then re-enters later, and those events too contribute to offset error.

Before doing any alignment, I’ll remove the seasonal cycle from each data series.
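The post does not say how the seasonal cycle is removed. One common approach, shown here purely as an assumed illustration, is to subtract from each value the mean of all values for the same calendar month. The function name `remove_seasonal_cycle` and its `first_month` parameter are hypothetical.

```python
import numpy as np

def remove_seasonal_cycle(values, first_month=1):
    """Subtract the mean of each calendar month from a monthly series.

    values: 1-D monthly series, NaN where missing.
    first_month: calendar month (1-12) of the first value.
    """
    values = np.asarray(values, dtype=float)
    # Calendar month index (0-11) for every value in the series.
    month = (np.arange(values.size) + first_month - 1) % 12
    anomalies = values.copy()
    for m in range(12):
        sel = month == m
        # Mean over all Januaries, all Februaries, etc., ignoring gaps.
        anomalies[sel] -= np.nanmean(values[sel])
    return anomalies

# Synthetic check: an annual sine wave plus noise, with one missing month.
rng = np.random.default_rng(3)
t = np.arange(120)
series = 50.0 * np.sin(2 * np.pi * t / 12) + rng.normal(size=t.size)
series[17] = np.nan
anomalies = remove_seasonal_cycle(series)
print("std before:", np.nanstd(series), "std after:", np.nanstd(anomalies))
```

This also removes each record's overall mean, which is harmless here because the baseline is arbitrary and gets re-estimated during alignment.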

Let’s compare the reconstruction using the least squares method in red, to that using the first difference method in blue:

For the first 30 years or so they agree perfectly (which is why the blue line is hidden by the red), because there’s only 1 data record covering that time span so there’s no offset error. But when new stations come and go, the differences become palpable; here in fact is the difference between the two reconstructions:

Not only are the differences sizeable, they have once again introduced a spurious difference in trend.


Jevrejeva et al. don’t actually use the basic first-difference method. Instead they compute differences between values 12 months apart rather than just 1 month apart, in order to eliminate the seasonal cycle. That’s fine, but it doesn’t solve the first-difference offset problem. If you “do the math” you find that it’s the same as using the first-difference method (with lag 1 month instead of 12), then computing 12-month moving averages.
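The identity behind that claim, using the \Delta x_t notation from above, is that a 12-month difference is the sum of twelve consecutive 1-month differences:

x_t - x_{t-12} = \sum_{k=0}^{11} \Delta x_{t-k} = 12 \left( \frac{1}{12} \sum_{k=0}^{11} \Delta x_{t-k} \right).

The term in parentheses is a 12-month moving average of the first differences. Moving averages and cumulative sums are both linear, so they can be applied in either order (apart from end effects and gaps, which this sketch ignores). Integrating the 12-month differences therefore yields a 12-month moving average of the ordinary first-difference reconstruction, up to a constant scale factor.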

And that is yet another problem, although when it comes to estimating trends it’s a minor one. The Jevrejeva et al. data are 12-month moving averages; although that eliminates the seasonal cycle, it also introduces “wicked strong” autocorrelation in the data. If you analyze the Jevrejeva et al. data without being aware of this, you could easily reach a faulty conclusion by not taking the strong autocorrelation into account.
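To get a feel for how strong that autocorrelation is, here is a small Python sketch using white noise as a stand-in (an assumption; real sea level data are not white noise). For a 12-month moving average of uncorrelated values, the theoretical lag-1 autocorrelation is 11/12, because consecutive averages share eleven of their twelve terms. The helper `lag1_autocorrelation` is defined here for the example.

```python
import numpy as np

def lag1_autocorrelation(x):
    # Sample autocorrelation at lag 1.
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

rng = np.random.default_rng(1)
white = rng.normal(size=6000)

# 12-month moving average, keeping only fully covered windows.
smoothed = np.convolve(white, np.ones(12) / 12, mode="valid")

r_white = lag1_autocorrelation(white)
r_smooth = lag1_autocorrelation(smoothed)
print("lag-1 autocorrelation, white noise:      ", r_white)
print("lag-1 autocorrelation, 12-month average: ", r_smooth)
```

A trend analysis that treats the smoothed values as independent would greatly overstate the number of effective data points, and so understate the uncertainty.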

The bottom line is that the Jevrejeva et al. data are faulty in multiple ways because of multiple problems with their methodology. That doesn’t make them totally useless, but it does mean that they’re a bad indicator of what global sea level trends have been over time. Which is one of the reasons climate deniers love to use these data so much.



18 responses to “HUGE Problem with Jevrejeva et al. Sea Level Data”

  1. Methane madness

    The loons arguing against sea level rise are arguing against Newton’s third law: as the sea heats up it expands, thus sea level rises; for every action there is an equal and opposite reaction.

    • That’s not Newton’s third law. In fact, at certain temperatures the ocean would shrink as it warms up, but that’s not where we are.

      • Methane madness

        How is it not Newton’s third? As particles heat up vibration increases, kinetic energy increases, collisions increase thus particle density decreases.

      • Except it doesn’t always decrease. What could you be doing wrong?

      • Merely referencing Newton’s third law is a bit reductionist and not very insightful. As an approximately incompressible fluid, a scalar increase in temperature corresponds to a scalar increase in internal pressure. Higher pressure at all points means that the height of the water must increase, as that’s the only direction to travel. The height will increase until atmospheric pressure and pressure of the water pushing up are equivalent again.

        What happens at the molecular level is the increase in temperature shifts the speed distribution of particles (by definition), which leads to (generally) more energetic impacts, true, and which also leads to more momentum flux per unit area (that may be redundant, but I want to paint the picture in full) per unit time—which is the definition of pressure.

        Newton’s third law says that at an object interface, forces are zero-sum; the velocity of the (constant-mass) center of mass of any two-body system (really, any n-body system) cannot be changed by internal forces. However, this is a statement about systems at equilibrium; that is, it describes systems where the center of mass has already responded to an external stimulus. We’re trying to figure out why a system reaches a *new* equilibrium when an external force is applied; that is, why the center of mass accelerates up when the water molecules are “pushed” to go faster, as happens when it heats up. This is an application of Newton’s *second* law of motion, en masse. The acceleration of the center of mass of a system is equivalent to the sum of forces on the components divided by the system mass.

        So, if the oceans heat up, Newton’s second law implies that the new equilibrium will be one of a higher water level. It stays at that level because of Newton’s first and third law, which (respectively) state that the center of mass, being at equilibrium with its outside, cannot change; and the center of mass, when only considering its internal forces, cannot change.

  2. Greg, you are thinking of pure water. Sea water with a salinity above 24.7 per mil always expands from freezing (density decreases with warming). Even minor expansion makes a difference when the average depth of the ocean is 3800 m.

    • It still is not Newton’s 3rd law–which merely applies to forces, not metaphors.

      • Methane madness

        The dipolar water molecule absorbs infra red radiation and starts vibrating which applies force to surrounding molecules.

      • Newton’s third law merely states that if you have two objects, 1 and 2, and object 1 exerts a force of F12 on object 2, then the force object 2 exerts on object 1 will be equal in magnitude and opposite in direction:
        F21=-F12.
        It is sometimes interpreted metaphorically to mean anything from:
        1) Actions always cause equal and opposite reactions (true only if “action” is interpreted as force)
        2)What goes around comes around
        3) karma

        However, it asserts none of these things. All of Newton’s laws have to do solely with forces:
        1) asserts what happens when the force on an object is 0 (it continues to move with constant velocity)
        2) tells how to calculate net force on an object and how that relates to its acceleration
        3) asserts that an object exerting a force experiences an equal and opposite (direction) force

    • It doesn’t matter whether it’s pure water or sea water. As water gets close to freezing it expands, which is counter to what Methane Madness claimed. If Newton’s “law” was involved there shouldn’t be counter examples. Note that at its current temperature the ocean expands as it warms, but if it was at just the right temperature it would contract.

      Of course, he’s not wrong in saying loons are arguing about this, but if you’re using physics use it right.

  3. Thanks for another illuminating post, Tamino. I guess the good news is that this insight could provide guidance for constructing a better data set.

  4. Many thanks Tamino for your contributions, especially for
    https://tamino.wordpress.com/2018/01/20/2017-temperature-summary/

    because this process of extracting ENSO, volcano and solar effects is so interesting.

    In 2014 Santer et al. computed a residual estimate of about 0.09 °C / decade out of the RSS3.3 time series (RSS’ original trend: 0.124 / decade at that time).

    Could you please publish the trend estimates of your result for the seven time series? That would be great, thanks in advance.

  5. First difference practices are not always bad. For instance, if we want to average two data records (on the same anomaly base) with diverging trends, and one record is discontinued.
    In the last year with data from both records, 2016, record A is 2.0, and record B 1.0 C, and the average 1.5 C. In 2017 record A is discontinued and record B is 0.8 C. The average of A and B drops apparently by 0.7 C from 2016 because A is lost, something that in the merged record will look like an abrupt step change.
    If we instead average the first differences, the change from 2016 to 2017 will only be 0.2 C, and the merged record will look more seamless.

    A practical example would be averaging of UAH v6 TLT and v5.6 TLT. Those series have drifted apart, and v5.6 was discontinued after July 2017.
    I would prefer averaging of first differences, if the average-series should continue with v6 data only.

  6. Susan Anderson

    For “methodology” substitute “mythology”. It reads more honest.