Sea Level Rise, Sea Level Lies

Larry Hamlin has shown us all, in a post at WUWT, how warped and twisted is the thinking of those who call themselves “skeptics” about climate change, but are actually deniers.

This one is about sea level rise. The headline reads “30 years of NOAA tide gauge data debunk 1988 Senate hearing climate alarmist claims.” The theme is that James Hansen’s 1988 testimony (when he warned of the danger of climate change) was “alarmist” and that NOAA data (about sea level, taken from tide gauges) over the last 30 years “debunks” something.

NOAA has updated its coastal tide gauge measurement data through year 2018 with this update now providing 30 years of actual data since the infamous 1988 Senate hearings that launched the U.S. climate alarmist political propaganda campaign.

What, you might wonder, does the last 30 years of NOAA tide gauge data debunk? According to Hamlin, it’s this:

In all more than 200 coastal locations are included in these measurements with more than 100 of these coastal locations with recorded data periods in excess of 50 years in duration. None of these updated NOAA tide gauge measurement data records show coastal location sea level rise acceleration occurring anywhere on the U.S. coasts or Pacific or Atlantic island groups.

So that’s it! Hamlin is claiming that data from the last 30 years fails to show any sea level rise acceleration, anywhere in the U.S. or island groups.

How did Hamlin determine that the new data (since 1988) fail to show acceleration? That’s a bit harder to figure out. He does give us a clue, however, by highlighting the longest tide gauge record we have:

The longest NOAA tide gauge data record is at the Battery, New York with a 162 year long measurement period. This location along with all other NOAA U.S. coastal locations show no sea level rise acceleration occurring over the past 30 years despite scientifically flawed assertions otherwise by climate alarmists.

Aha! So that is his “proof.” He says so!

Maybe you’re not convinced. But wait — he shows a graph!

Since Larry Hamlin puts such faith in actual data, let’s look at what this graph shows: the actual data.

Note that the top of Hamlin’s graph (it’s actually from NOAA, not from Hamlin himself) says the average rate of sea level rise during the entire time span is 2.85 mm/yr. What say the last 30 years’ data? This:

Hmmm… Note that the rate is considerably higher than it was over the entire time span. And the difference is statistically significant! That means the rate over the last 30 years is demonstrably higher than the average rate before it.

What do we call it … ? … when the rate of increase gets higher … ? … what’s the name for that? Oh yeah! It’s called acceleration.

Here’s why I call this truly twisted: to disprove Hamlin’s claim of “no acceleration anywhere,” all you need is one tide gauge station, namely the one he himself chose to highlight, and the only part of that you need is the last 30 years, the time span he himself chose. He says “This debunks acceleration” when in fact, in the specific case he shows himself, “This proves acceleration” is more like it.
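For readers who want to see the mechanics, here is a minimal sketch of that comparison. The numbers below are assumptions for illustration (synthetic data, not the actual Battery record): a 2.85 mm/yr background rate, a hypothetical extra rise after 1988, and noise. The point is only the procedure: fit one trend to the whole record, another to the last 30 years, and compare.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1857, 2019, 1 / 12)                # monthly steps, ~162 years
y = 2.85 * (t - t[0])                            # long-term mean rate, in mm
y += np.where(t > 1988, 1.5 * (t - 1988), 0.0)   # hypothetical post-1988 acceleration
y += rng.normal(0, 20, t.size)                   # tide-gauge noise (assumed level)

def rate(tt, yy):
    """Ordinary least-squares trend, in mm/yr."""
    return np.polyfit(tt, yy, 1)[0]

full_rate = rate(t, y)
recent = t > t[-1] - 30
recent_rate = rate(t[recent], y[recent])
print(f"full record: {full_rate:.2f} mm/yr, last 30 yr: {recent_rate:.2f} mm/yr")
```

With a genuine rate increase built in, the 30-year trend comes out above the full-record trend, which is exactly the comparison made above.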

If you want more proof, I’ve got oodles. If Larry Hamlin himself comments here and asks, I’ll be delighted to show how wrong he is in another six ways. As for the 2nd graph from Hamlin (actually from NOAA, he didn’t make it himself): if he comments here himself and asks, I’ll be happy to show him why that too is a nothing-burger. In fact, for his purpose it’s a “not-even-wrong” burger.

This blog is made possible by readers like you; join others by donating at My Wee Dragon.

13 responses to “Sea Level Rise, Sea Level Lies”

1. On each of the NOAA SLR trend graph pages, such as The Battery’s https://tidesandcurrents.noaa.gov/sltrends/sltrends_station.shtml?id=8518750 , they also have a tab to show a time series of linear trends over 50 year windows:

That big smiley-shaped bit at the end looks an awful lot like acceleration to me.

• Jeff

I was also looking at the 50-year trends, which show that 1945 and 1950 had a higher rate of rise than any recent rate.

[Response: Or … does it? What are the uncertainties in those numbers?

Has anybody (besides me) done a thorough analysis of the tide gauge data from New York (the Battery)? Has anybody looked at it to find out what’s really going on, rather than to find something to point to and say “neener-neener”? Larry Hamlin didn’t. Judith Curry hasn’t. But they’re the ones who want to lecture us about the subject.

It’s fun to criticize idiots. But let’s not replace one idiot idea with another; that just increases confusion.]
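The 50-year-windows tab, and the response’s question about uncertainties, can both be sketched in a few lines. This is not NOAA’s code, and the data below are synthetic with assumed parameters; it just shows the idea of fitting a trend (with its standard error) in each sliding window:

```python
import numpy as np

def window_trends(t, y, window_yr=50.0):
    """Trend (mm/yr) and its 1-sigma standard error in each sliding window
    of length window_yr, stepped one year at a time. Returns an array of
    (window center, trend, standard error) rows."""
    step = t[1] - t[0]
    n = int(round(window_yr / step))
    out = []
    for i in range(0, len(t) - n + 1, 12):
        coeffs, cov = np.polyfit(t[i:i + n], y[i:i + n], 1, cov=True)
        out.append((t[i:i + n].mean(), coeffs[0], np.sqrt(cov[0, 0])))
    return np.array(out)

# Synthetic monthly record with genuine acceleration built in
rng = np.random.default_rng(1)
t = np.arange(1900, 2019, 1 / 12)
y = 2.0 * (t - 1900) + 0.01 * (t - 1900) ** 2 + rng.normal(0, 15, t.size)

trends = window_trends(t, y)
print(trends[0], trends[-1])   # (center year, mm/yr, std. error)
```

Whether a late-window rise (the “smiley” shape) is real acceleration or noise is exactly a question of whether the differences exceed those standard errors.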

2. Mark B

I haven’t read the WUWT article and I’m not suggesting this is a proper approach, but a quadratic fit to the last thirty years of the Battery data does have negative curvature.

[Response: Why yes it does — or does it? Fit a quadratic, the coefficient is -0.09 +/- 0.19 mm/yr/yr. All it really tells us is that the noise level is bigger than the curvature for this time span.]

I believe this is not the case with global mean sea level, though I checked only the satellite data set (1993-present).

[Response: Indeed. See this.]
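The -0.09 +/- 0.19 mm/yr/yr figure above comes straight out of the covariance of the quadratic fit. A minimal sketch with assumed numbers (a pure linear trend plus noise over a 30-year span, not the actual Battery data), where the fitted curvature should come out consistent with zero:

```python
import numpy as np

def quadratic_curvature(t, y):
    """Fit y = c*t^2 + b*t + a; return c and its 1-sigma standard error.
    (2*c is the acceleration implied by the fit.)"""
    coeffs, cov = np.polyfit(t, y, 2, cov=True)
    return coeffs[0], np.sqrt(cov[0, 0])

# Synthetic 30-year monthly series: linear trend only, plus noise
rng = np.random.default_rng(2)
t = np.arange(0.0, 30.0, 1 / 12)
y = 4.0 * t + rng.normal(0, 20, t.size)

c, c_se = quadratic_curvature(t, y)
print(f"curvature: {c:.3f} +/- {c_se:.3f} mm/yr^2")
```

When the error bar dwarfs the coefficient, as in the response, the fit tells you about the noise level, not the curvature.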

3. Bruce W

I see comments like this on my local news site. And I need to point out, again-and-again, that where NOAA says “Linear Mean Sea Level trend”, that means they’re fitting the data to a linear model, not that the modeled reality is linear.

• Bruce W

Make that “Linear Relative Sea Level trend”

4. B Buckner

OK, valid points on Hamlin. You, on the other hand, are comparing a best-fit broken trend to the long-term trend line. The best-fit broken trend probably gives a favorably low p-value. I knew a statistics guy once who ran Monte Carlo simulations for this type of scenario that indicate the real p-value for a broken trend is much higher and not close to being statistically significant. Plus, your 30-year period starts at a low point and ends at a high point, inflating the trend. Lastly, the NOAA web site you and Hamlin reference shows large variations in 50-year trends over the length of the data set (variations in 50-year trends tab), with the current period no higher than historical trends. With all this said, how valid is your comparison of the 30-year trend to the 162-year trend?

[Response: I already computed it using a non-broken (i.e. continuous) trend, and allowing for the multiple testing/selection bias effect; the acceleration is for real. And that’s just *one* of the ways … there are several. But for this post, I just kept it as simple as possible … after all, Larry Hamlin might want to read it.

As for “starting at a low point,” I didn’t pick “the last 30 years,” Larry Hamlin did.

Really, there are lots of ways to skin this cat. If Larry Hamlin wants to know what they are, let him come here himself and ask.]

5. @B Buckner,

Generally speaking, the skill of a forecasting apparatus depends upon its success at out-of-sample or extrapolated judgments, not on hindcasts. Indeed, one can optimize the hindcast and overfit, leaving the model open to greater errors when future circumstances are encountered.

• Alex C

Hindcasting does not fit to the past. There is no conceptual difference between hindcasting and forecasting, aside from the fact that the latter isn’t possible until enough time passes. Yes, if you fit to the past, you could overfit, but the goal is to find the model which best estimates the past after learning the present, and so you spare it as a test set, not a training set.

The potential failings of hindcasting, such as extrapolation errors, are better avoided using cross validation on the entire dataset. But these problems will arise with forecasting too, perhaps even more so, because you cannot guarantee that the present will contain predictor values within the range you have so far observed. (With changing climates, that is certainly so.)

• Alex C

“… because you cannot guarantee that the *future* will contain…”
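The leave-out idea being discussed can be made concrete. This toy sketch (synthetic data, assumed numbers, plain polynomial fits) predicts each observation from a fit to all the others, and shows why in-sample fit flatters an overfit model while cross-validation does not:

```python
import numpy as np

def loo_cv_rmse(t, y, deg):
    """Leave-one-out cross-validation RMSE for a degree-deg polynomial fit:
    each observation is predicted from a fit to all the others."""
    errs = []
    for i in range(len(t)):
        mask = np.ones(len(t), dtype=bool)
        mask[i] = False
        coeffs = np.polyfit(t[mask], y[mask], deg)
        errs.append(y[i] - np.polyval(coeffs, t[i]))
    return float(np.sqrt(np.mean(np.square(errs))))

# Synthetic trending series (illustration only)
rng = np.random.default_rng(3)
t = np.linspace(0.0, 30.0, 40)
y = 3.0 * t + rng.normal(0, 5, t.size)

in_rmse, cv_rmse = {}, {}
for deg in (1, 5):
    resid = y - np.polyval(np.polyfit(t, y, deg), t)
    in_rmse[deg] = float(np.sqrt(np.mean(resid ** 2)))
    cv_rmse[deg] = loo_cv_rmse(t, y, deg)
# The higher-degree fit always looks better in-sample; the held-out
# errors are inflated relative to the in-sample residuals.
print(in_rmse, cv_rmse)
```

For least-squares fits the held-out residual equals the in-sample residual divided by (1 - leverage), so cross-validated error can never flatter a model the way its own residuals do.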

• @Alex C,

Regarding

Hindcasting does not fit to the past. There is no conceptual difference between hindcasting and forecasting, aside from the fact that the latter isn’t possible until enough time passes. Yes, if you fit to the past, you could overfit, but the goal is to find the model which best estimates the past after learning the present, and so you spare it as a test set, not a training set.

The potential failings of hindcasting, such as extrapolation errors, are better avoided using cross validation …

I did not reply immediately because I wanted to think on it. At first it did not seem there was anything useful to say. A quick review of the statistical literature pertaining to meteorology shows there are a range of meanings assigned to the words hindcast and cross validation and even forecasting. Moreover, some of these meanings differ in substantial ways for how these terms are used in time series work, at least today, and at least for cross validation, forecasting, and prediction. These meanings appear to have changed over time. Thus, Michaelsen (1987) begins his Abstract with

Cross-validation is a statistical procedure that produces an estimate of forecast skill which is less biased than the usual hindcast skill estimates. The cross-validation method systematically deletes one or more cases in a dataset, derives a forecast model from the remaining cases, and tests it on the deleted case or cases.

Contrast Elsner and Schmertmann (1994) opening for their Abstract:

This study explains the method of cross validation for assessing forecast skill of empirical prediction models. Cross validation provides a relatively accurate measure of an empirical procedure’s ability to produce a useful prediction rule from a historical dataset. The method works by omitting observations and then measuring “hindcast” errors from attempts to predict these missing observations from the remaining data.

These statements contain a few remarkable comments or implications, at least with the benefit of hindsight and developments after 1994.

First, my understanding of hindcasting to ascertain skill of a forecasting mechanism is that the mechanism is calibrated in some manner, and shows skill on calibration (or “training”) datasets. A period in the past is identified and divided into two adjacent sections. The forecasting mechanism is initialized with the earlier of the sections and, given that calibration, is then used to forecast the later section. Residuals between specific observables in the later section and ones estimated using the forecasting mechanism are used to develop a figure of merit.
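That procedure is short to sketch. In this toy version the data are synthetic and `split_frac`, the trend, and the noise level are all assumptions; it just shows the split-calibrate-forecast-score structure:

```python
import numpy as np

def hindcast_rmse(t, y, split_frac=0.7, deg=1):
    """Calibrate a trend model on the earlier section of the record,
    forecast the later section, and use the RMSE of the residuals
    as the figure of merit."""
    n_cal = int(split_frac * len(t))
    coeffs = np.polyfit(t[:n_cal], y[:n_cal], deg)   # calibration section
    pred = np.polyval(coeffs, t[n_cal:])             # forecast section
    return float(np.sqrt(np.mean((y[n_cal:] - pred) ** 2)))

# Hypothetical synthetic record (assumed numbers, illustration only)
rng = np.random.default_rng(4)
t = np.linspace(0.0, 50.0, 200)
y = 3.0 * t + rng.normal(0, 5, t.size)

rmse = hindcast_rmse(t, y)
print(f"hindcast RMSE: {rmse:.2f} mm")
```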

Second, the modern idea of cross-validation goes beyond the leave-one-out cross validation which appears to have been meant by Michaelsen and then Elsner and Schmertmann. One can have leave-k-out cross validation and also generalized cross validation (Craven and Wahba, 1979). Generally speaking, akin to the spirit of Michaelsen and Elsner and Schmertmann, leave-out cross validation entails removing arbitrary sections of a bigger dataset and evaluating performance on the complement, but, beyond that spirit, the modern view recognizes this needs to be done several times, with random sections removed, the final figure of merit being some combination of the figures from each validation. In the limit, and for weakly stationary time series, one gets procedures like the stationary bootstrap of Politis and Romano (1994) and Politis (2003). Generalized cross validation appears less well known, but it is more efficient to calculate, even if it is less clear why it works. Wang reported on the use of GCV for correlated data in 1998 (JASA). Carmack, Spence, and Schucany offered a “generalised correlated cross-validation” in 2011 (Journal of Nonparametric Statistics).
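For the curious, the stationary bootstrap mentioned above is brief to sketch. This toy version (the block length and AR(1) parameters are assumptions) resamples an autocorrelated series in random-length blocks, and shows its standard error for the mean coming out larger than the naive independent-sample formula:

```python
import numpy as np

def stationary_bootstrap_means(x, mean_block=12, n_boot=500, rng=None):
    """Politis-Romano stationary bootstrap (a sketch): resample the series
    in blocks of geometrically distributed length (mean mean_block),
    wrapping at the end, so serial dependence within blocks is preserved.
    Returns the bootstrap distribution of the sample mean."""
    rng = rng or np.random.default_rng()
    n, p = len(x), 1.0 / mean_block
    means = np.empty(n_boot)
    for b in range(n_boot):
        idx = np.empty(n, dtype=int)
        i = int(rng.integers(n))
        for j in range(n):
            idx[j] = i
            # start a new block with probability p, else continue this one
            i = int(rng.integers(n)) if rng.random() < p else (i + 1) % n
        means[b] = x[idx].mean()
    return means

# AR(1) series: positively autocorrelated noise (assumed parameters)
rng = np.random.default_rng(5)
x = np.zeros(500)
for k in range(1, 500):
    x[k] = 0.6 * x[k - 1] + rng.normal()

boot_se = stationary_bootstrap_means(x, rng=rng).std(ddof=1)
naive_se = x.std(ddof=1) / np.sqrt(len(x))
print(f"naive SE {naive_se:.3f} vs block-bootstrap SE {boot_se:.3f}")
```

The inflation of the bootstrap standard error over the naive one is precisely the dependence effect the comment is describing.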

Third, conventional hindcasting, whatever that meant at the time, was seen as having drawbacks. On the other hand, the frequently cited paper by Jolliffe from 2007 mentions hindcasts as a motivating example but otherwise does not discuss them, and, while it mentions bootstrapping, in particular moving-blocks bootstraps (from Elmore, Baldwin, and Schultz in 2006, as taken from the 1993–1997 statistics literature by Efron, Tibshirani, Davison, Hinkley, and Wilks), no mention is made of cross-validation or, for that matter, of the 0.632+ bootstrap, which Efron and Tibshirani reported in 1997 (in JASA) as being superior to cross-validation.

Now, there are grades of non-stationarity and dependence possible in a series. The classical context for both cross-validation and the bootstrap assumes observations in samples are independent of one another, and that they have statistics drawn from distributions which respect something a little weaker than the Central Limit Theorem. Introduce dependency and now all kinds of interesting things are possible. These intrude in two ways. First, the correlation among observations can vary as a window is slid over time. Second, the variance of observations can itself be a function of time, and these variances might be correlated. Unless strong assumptions about stationarity are made, or an equally structural assumption about ARMA-type behaviors is made (basically that effects only last for a finite time, die out, and the more separated in time they are, the weaker the interdependencies), the length and placement of a window in a time series can affect what statistical behavior can be expected of it. Consequently, the very choice of an interval for hindcasting might well constrain how well or poorly a comparison performs relative to the present. Since the choice of the window needs to be informed by what happened during the window, specifically matching these meta-statistics of variance and correlation, and the future is not known, it seems to me there is very much a difference between hindcasting and forecasting.

Even if window lengths and placements in the past are done arbitrarily, in the spirit of the Politis and Romano stationary bootstrap, innovations in correlation structure and variance over time need to be qualified in order for results of hindcasting to be comparable to the results of forecasting.

Now, sure, ab initio knowledge of physics and other insights might give good reasons to believe that this kind of transference is possible and even plausible, but, in general, for time series it is not.

6. Jeff

Strangely enough, the trendline for the last 10 years (Jan 2008–Dec 2018) is zero.

[Response: It’s not strange at all. The 95% confidence limit uncertainty, using just the last 10 years, is +/- 6 mm/yr. In other words, the noise level in a 10-year time span is so high that your “trend” estimate tells us NOTHING of value at all.

Which is why you post about it! Because it causes confusion among the ignorant, without requiring anything at all of scientific validity.]

• jgnfld

Yes, Jeff. And “strangely enough” the 2016-2018 GISS annual anomaly values (.98,.90,.82) are declining at -.08 deg/yr. In a perfectly straight line with no residual error, no less!
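The arithmetic behind the response’s point about 10-year windows is easy to sketch: with the sampling rate fixed, the white-noise standard error of an OLS trend shrinks roughly like T^(-3/2), so short windows carry enormous error bars. The numbers below are assumptions, and real tide-gauge noise is autocorrelated, which makes actual uncertainties larger still than this white-noise sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
ci95 = {}
for years in (10, 30, 100):
    t = np.arange(0, years, 1 / 12)                  # monthly sampling
    y = 3.0 * t + rng.normal(0, 20, t.size)          # 3 mm/yr trend + noise
    coeffs, cov = np.polyfit(t, y, 1, cov=True)
    ci95[years] = 1.96 * np.sqrt(cov[0, 0])          # 95% half-width on slope
    print(f"{years:3d} yr: {coeffs[0]:5.2f} +/- {ci95[years]:.2f} mm/yr (95%)")
```

The 10-year error bar swamps any plausible change in the rate, which is why a decade-long “zero trend” tells you nothing.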