New York (the Battery)

We have measured sea level at the Battery in New York for over 150 years, from 1856 to the present, albeit with a 14-year gap from 1879 to 1893. The monthly-average sea level data are available online from NOAA (as are the data from hundreds of tide gauge stations around the world). They even provide a convenient graph:

If all you did was look at that graph naively, with no analysis of the data behind it, you might be tempted to think that “Sea levels have been rising at a fairly steady pace since at least the mid-1800s.” But if you study the graph, and the data behind it, you find that’s not true at all.

It shows the data itself (the jagged line in blue) together with the “best-fit” straight line (the one which matches the data better than any other straight line). As such, it estimates the straight-line trend and its rate of sea level rise, which is a good estimate of the average rate since 1856. In this case, it’s rising at 2.88 mm/yr.

But — if the rate of sea level rise has changed (something we call “acceleration”), then the average rate since 1856 is not the same as the rate now.

Here’s my graph of the same data:

We can simplify our graphs and our analysis, without losing information about the long-term trend, simply by working with yearly average sea level rather than monthly average sea level. That looks like this:

The slope of the best-fit straight line is only an estimate of the rate of sea level rise at NY, but a reasonably precise one: my analysis says the average rate since 1856 has been 2.88 mm/yr, within a “margin of error” of 0.16 mm/yr, meaning it (95% probability) isn’t less than 2.72 or more than 3.04. If you’re curious why my “margin of error” is bigger than that given by NOAA, see the note at the end of this post.
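For readers who want to try this themselves, here is a minimal sketch of how a trend rate and its margin of error can be estimated. It is not my actual program: it uses ordinary least squares, then inflates the white-noise uncertainty with a simple AR(1) correction based on the lag-1 autocorrelation of the residuals, whereas my own numbers come from an ARMA(1,1) noise model. The function names and the synthetic data are made up for illustration; they are not the Battery record.

```python
import math

def linear_trend(t, y):
    """Ordinary least-squares slope, intercept, white-noise standard error, residuals."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    s2 = sum(r * r for r in resid) / (n - 2)   # residual variance
    se_white = math.sqrt(s2 / sxx)             # valid only if the noise is white
    return slope, intercept, se_white, resid

def lag1_autocorr(resid):
    """Lag-1 autocorrelation of the residuals."""
    m = sum(resid) / len(resid)
    num = sum((a - m) * (b - m) for a, b in zip(resid[:-1], resid[1:]))
    den = sum((r - m) ** 2 for r in resid)
    return num / den

def margin_of_error(se_white, rho):
    """Approximate 95% margin, widened for AR(1) noise (an assumed noise model)."""
    inflation = math.sqrt((1 + rho) / (1 - rho)) if 0 < rho < 1 else 1.0
    return 1.96 * se_white * inflation

# Hypothetical yearly series: a 3 mm/yr trend plus a slow wiggle standing in for noise.
years = list(range(1900, 2000))
level = [3.0 * (yr - 1900) + 20.0 * math.sin(yr / 3.0) for yr in years]

slope, intercept, se, resid = linear_trend(years, level)
rho = lag1_autocorr(resid)
print(f"rate = {slope:.2f} +/- {margin_of_error(se, rho):.2f} mm/yr")
```

The more the noise is correlated from one year to the next, the larger the inflation factor, which is why a more permissive noise model returns a bigger margin of error.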

What if we looked at only the recent data, since 1990? We could fit a straight line to estimate its trend, and the margin of error for our estimate, and it looks like this:

The estimated rate is now 4.41 mm/yr, with a margin of error 1.38. It’s (probably) somewhere between 3.03 and 5.79. That’s a pretty wide range, but it definitely doesn’t include 2.88 — the overall average. This window on the data suggests that recently, the rate of sea level rise has been faster than its overall average — at least, at New York.

If we do the same to the 30 years before that, we get this:

This rate estimate is a mere 1.67 with margin of error 1.32, so it too has a wide range of uncertainty (smaller time spans generally do), but it’s still (probably) not less than 0.35 or more than 2.99. It could be 2.88 as the entire time span suggests — that’s on the high end but still plausible — but it’s definitely nowhere near the 4.41 we saw in the most recent period. The difference is considerable; according to these data, over the last 32 years the tide has risen at NY 2.6 times as fast as it did the previous 30 years. It’s starting to look like the rate of sea level rise at New York has not been steady.
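The arithmetic behind those ranges is easy to check. This small sketch uses only the numbers quoted above; the variable names are mine.

```python
overall_rate = 2.88                       # mm/yr, whole record (from the text)
recent_rate, recent_moe = 4.41, 1.38      # mm/yr, since 1990 (from the text)
earlier_rate, earlier_moe = 1.67, 1.32    # mm/yr, the 30 years before (from the text)

def interval(rate, moe):
    """Plausible range: estimate minus and plus its margin of error."""
    return round(rate - moe, 2), round(rate + moe, 2)

recent_range = interval(recent_rate, recent_moe)
earlier_range = interval(earlier_rate, earlier_moe)
ratio = recent_rate / earlier_rate

print("since 1990:", recent_range)
print("previous 30 years:", earlier_range)
print("ratio of rates:", round(ratio, 1))
print("overall rate inside recent range? ",
      recent_range[0] <= overall_rate <= recent_range[1])
print("overall rate inside earlier range?",
      earlier_range[0] <= overall_rate <= earlier_range[1])
```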

I can do the same for segments of 30 years or so over the entire observed time span:

It’s definitely starting to look like the rate of sea level rise at NY has not been steady.

You might be wondering, why did I pick 30-year time spans? Why start the latest one at 1990? Did I try lots of possibilities, and pick the one which most makes it look like the rate of sea level rise has changed?

That’s a good question, and a very important statistical point. Allowing yourself to make that choice dramatically increases your chance of finding something that “looks like” it’s significant, but is really just an accident of random fluctuations. It’s like buying a lot of lottery tickets instead of just one: the lottery is still random (at least, we hope it is) but you have more chance to “win” by accident because you took more chances. It’s called the “multiple testing” problem, also known as “selection bias.”
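A small simulation makes the lottery-ticket effect concrete. This is a sketch under simplifying assumptions, not the test I actually apply: the data are pure white noise with no trend at all, the check is whether the final stretch of the series shows a "significant" slope (|t| > 2), and the candidate start points are hypothetical. It compares how often a start point fixed in advance gives a false alarm with how often at least one of several start points does.

```python
import math
import random

def slope_t_stat(y):
    """t statistic of the least-squares slope of y against 0, 1, 2, ... (white-noise formula)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    slope = sum((i - t_mean) * (yi - y_mean) for i, yi in enumerate(y)) / sxx
    intercept = y_mean - slope * t_mean
    s2 = sum((yi - intercept - slope * i) ** 2 for i, yi in enumerate(y)) / (n - 2)
    return slope / math.sqrt(s2 / sxx)

def false_alarm_rates(n_years=100, starts=(40, 50, 60, 70, 80), trials=1000, seed=1):
    """Fraction of trend-free series flagged, with one fixed start vs. any start."""
    rng = random.Random(seed)
    fixed = picked = 0
    for _ in range(trials):
        y = [rng.gauss(0.0, 1.0) for _ in range(n_years)]   # noise only, no trend
        hits = [abs(slope_t_stat(y[s:])) > 2.0 for s in starts]
        fixed += hits[0]       # start point chosen before looking at the data
        picked += any(hits)    # start point chosen after trying them all
    return fixed / trials, picked / trials

fixed_rate, picked_rate = false_alarm_rates()
print(f"false alarms with one fixed start: {fixed_rate:.3f}")
print(f"false alarms when picking the best start: {picked_rate:.3f}")
```

The second rate can never be lower than the first, because every series flagged with the fixed start is also flagged when you are allowed to pick.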

That’s why we apply very stringent tests to models that allow that kind of choice, before we declare they are “statistically significant” with any degree of confidence.

In this case, the “set of straight lines over segments starting at 1856, 1900, 1930, 1960, and 1990” model passes those tests. Yes, the rate of sea level rise at NY has changed over time.

Although the “five straight lines” model is better than the “one straight line” model (the one with constant rate of sea level rise), and the improvement is “statistically significant” (it passes those tests), an even better model — both statistically and physically — is five straight lines which meet at their endpoints.

Such a model (straight line segments meeting at their end-points) is a good general-purpose model which allows for the rate to change, and is sometimes called a PLF, or “Piece-wise Linear Fit.” The straight-line model all by itself imposes constant rate as a constraint; the PLF retains the simplicity of the straight line but accommodates changing rates.

The PLF allows the rate to change, but only at the “breakpoints” between segments. Hence just as the single-straight-line model estimates the average rate (and its margin of error) during the whole time span, each segment of the PLF estimates the average rate (and its margin of error) during each time segment.
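One common way to fit straight-line segments that meet at their endpoints is to regress on a constant, time, and one "hinge" term max(0, t - b) for each breakpoint b; each hinge coefficient is then the change in rate at that breakpoint. The sketch below does that with plain least squares. It is an illustration, not my actual code: it returns the segment rates only, with no margins of error, and the data are synthetic with made-up rates. The breakpoints are the segment boundaries mentioned above.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_plf(t, y, breaks):
    """Least-squares continuous piecewise-linear fit with fixed breakpoints.

    Returns the fitted value at t[0] and the slope within each segment."""
    t0 = t[0]
    # Columns: constant, time since start, and one hinge per breakpoint.
    rows = [[1.0, ti - t0] + [max(0.0, ti - b) for b in breaks] for ti in t]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(ata, aty)
    slopes = [coef[1]]
    for change in coef[2:]:          # each hinge coefficient is a change in slope
        slopes.append(slopes[-1] + change)
    return coef[0], slopes

# Hypothetical noise-free series with made-up rates in each segment (mm/yr).
years = list(range(1856, 2022))
breaks = [1900, 1930, 1960, 1990]
true_slopes = [2.0, 1.0, 4.0, 1.5, 4.5]
level = []
for yr in years:
    value = 10.0 + true_slopes[0] * (yr - years[0])
    for j, b in enumerate(breaks):
        value += (true_slopes[j + 1] - true_slopes[j]) * max(0, yr - b)
    level.append(value)

intercept, slopes = fit_plf(years, level, breaks)
print([round(s, 3) for s in slopes])
```

Because the hinge terms are zero before their breakpoint and grow linearly after it, the fitted line cannot jump at a breakpoint; only its slope can change there.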

I’ll graph those estimates as a solid blue line showing the rate during each time segment, together with light blue shading to show its uncertainty range:

The thin dashed line at 2.88 mm/yr shows the single-straight-line rate, but it’s clear that the actual rate has usually been significantly higher or lower than that. There are two episodes of pronounced sea level rise, from 1930 to 1960 and from 1990 to the present. The current rate is highest of all.

There are other kinds of models which allow for a changing rate, and aren’t tied to specific time spans. One of the workhorses of my toolkit is the lowess smooth, which I’ve programmed to compute both the rate of change and its uncertainty (margin of error). When I use it on the data from NY (the Battery), it suggests that the rate of sea level rise has changed thus (in red, compared to the PLF estimate in blue):

Both methods reveal that there are two episodes of faster-than-average sea level rise at NY, and that the latest is going on right now. They both suggest that the current rate of sea level rise at NY is higher than it has been before. And yes, their results are “statistically significant.”
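To give a feel for how a lowess-style smooth can yield a rate, here is a stripped-down sketch: at each time of interest it fits a straight line to nearby points, weighted by the tricube function so that closer points count more, and reports the slope of that local line. It is not my own program. It uses a fixed window half-width (a hypothetical 20 years) instead of a nearest-neighbor span, skips the robustness iterations of full lowess, and computes no margin of error. The test series is synthetic.

```python
def local_linear_rate(t, y, t0, half_width):
    """Slope of a tricube-weighted straight-line fit centred on t0."""
    weights = []
    for ti in t:
        u = abs(ti - t0) / half_width
        weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)   # tricube weight
    sw = sum(weights)
    t_bar = sum(w * ti for w, ti in zip(weights, t)) / sw
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (ti - t_bar) ** 2 for w, ti in zip(weights, t))
    sxy = sum(w * (ti - t_bar) * (yi - y_bar) for w, ti, yi in zip(weights, t, y))
    return sxy / sxx

def rate_curve(t, y, half_width=20.0):
    """Local rate estimated at every observation time."""
    return [local_linear_rate(t, y, ti, half_width) for ti in t]

# Hypothetical smoothly accelerating series: the true rate is 0.02 * (year - 1856) mm/yr.
years = list(range(1856, 2022))
level = [0.01 * (yr - 1856) ** 2 for yr in years]

rates = rate_curve(years, level)
for yr in (1900, 1950, 2000):
    print(yr, round(rates[yr - 1856], 2), "mm/yr")
```

Unlike the PLF, this estimate is not tied to pre-chosen breakpoints; the price is that the window width has to be chosen, and a narrower window means a noisier rate.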

The data from NY (the Battery) are an excellent introduction to what sea level has been doing, especially along the east coast of the U.S., and how sea level rise has gotten faster, then slower, and now is faster than ever. A similar pattern is present in lots of tide gauge records (but not all), and also reveals itself in global mean sea level; that too shows a complex pattern of change in the rate of sea level rise.

Those changes are easy to see in global sea level since the year 1900, but not so easy to see in data from a single location (like NY). We had to do more than just “look at a graph,” we even did some actual analysis. Still, it was possible because NY covers such a long time span and gives us so much data to work with. Not every tide gauge record yields such results, because of the high noise level in single-station records. The global mean has a much lower noise level, making rate changes easier to quantify.

A high noise level always makes rate changes hard to confirm and even harder to see. That’s one of the reasons, I think, that those who want to dismiss or minimize the danger of sea level rise, so often will show a single graph of a single tide gauge station (often the Battery in New York) and declare that sea level rise has been steady. They wish to imply that it will remain so in the future, and that the rate right now is no more than the long-term average. They can’t deny that sea level is rising, that’s too much nonsense for anybody, but by plotting a single graph of a single station with a straight line already on it, you can get — and they can give — the wrong impression.

The tide gauge station at the Battery in New York is a prime example. It’s also a prime example of the fact that if you look closely, the “steady rate” story starts to fall apart, and if you do the math, the “steady rate” story crumbles.

Who would do that, show a single tide gauge record and then declare “steady rise” with no analysis behind it? The so-called “Heartland Institute,” for one. They created an entire document targeting (according to them) teachers and students, purporting to give “facts” about climate change but actually designed to give the wrong impression. When it comes to sea level rise, they show a single tide gauge record, do no analysis at all, and declare (direct quote) “Sea levels have been rising at a fairly steady pace since at least the mid-1800s.” The record they chose is: the Battery in New York.

Judith Curry, for another, when she talked about how sea level rise will impact New Jersey from “a business perspective.” She showed a single tide gauge record, did no analysis at all, and said (quote) “Since 1910, sea level has been rising at a steady rate of 1.36 feet, or 16 inches, per century.” At least she chose the data from Atlantic City, NJ.

Rutgers University put a lot of effort and expertise into advising the state of New Jersey about dealing with future sea level rise. Judith Curry’s advice is to pay no heed to their estimates of how much sea level rise New Jersey will have to deal with in the near future. Woe betide the garden state, if they do as she suggests.

That’s why I call the “Heartland Institute,” and Judith Curry, deniers.

[Note: NOAA puts a “+/- 0.09” after their trend estimate, while I put a “+/- 0.16”. That number is the estimated uncertainty in the trend rate. The estimate is based on the behavior of the noise, the random fluctuations in the data. We both recognize that the noise isn’t “white noise” (the simplest kind). They use an AR(1) model for the noise, I use an ARMA(1,1) model, and my model usually allows more influence from the noise so it returns larger margins of error.

The salient point is that because of that, my calculations make it harder for tests of rate change to reach “statistical significance.” But they reached it anyway.]



7 responses to “New York (the Battery)”

  1. Tamino may well have done this, but to elaborate on a point further …

    Suppose there’s no knowledge of where breaks in a series are supposed to go? The details of the model don’t matter that much: Tamino uses piecewise linear models. I prefer smoothing splines. In the former it’s where the breaks are. In the latter, it’s where the spline knots are placed.

    It won’t be done this way because this way is too computationally expensive, but what is actually done is equivalent to the description below.

    Suppose a program considers all possible places for breaks to be placed? Now, there are some silly cases that won’t happen. For instance, it doesn’t make sense to have breaks right next to one another, or even two points away from one another. It also doesn’t make sense to have breaks at the extreme right or left of the series. But granted all that, how do you choose between them? One way is to use some measure of the deviation between the predicted fit with the piecewise line segments (or the spline, but I’ll not speak of splines in this comment hereafter) and the data points, and sum those up, or perhaps sum up their squares. This could be the variance or the root-sum-squares of the residuals. And in one sense, the choice of breaks across all possible legitimate choices which minimizes that quantity is the “best”. It certainly fits the data the best.


    How is one to know if the same scheme will work for data which either hasn’t been seen, or wasn’t collected? Might it not be the case that a rule which predicts all the data seen super well might not work for deep past or for future data? Yes, that’s definitely possible. So, what’s there to be done? Wait for another 20 years of data and see?

    Well, that could be done, but it’s inconvenient, particularly if the model of breaks needs to be scored now. So here’s an idea. Suppose the kinds of variability, even non-stationary variability, are more or less of the same character in the future (or the past) as now. This is not the same as saying the trend or acceleration in the trend are the same, but the variability is the same. Note that the idea of understanding how good the model is for the future can be thought of as having the data points for the future, but someone has hidden them from you.

    So, suppose some set of the data points in the series in hand are artificially hidden, and that subset is chosen at random. And suppose the program takes the breaks it found with all the data, throws out the ones which correspond to hidden points, and then recalculates the figure-of-merit for this configuration. And that gets noted. Then the program goes back and randomly deletes another set of points from the original and does this again. And then the program does this again, so on, for many times.

    And suppose the program saves the versions of the series with points deleted, and in each case it pretends those were the original data and finds ideal breaks in the manner of the original, and it saves those for each version.

    There are then two additional ways to score the original break choices.

    One is to score them by some weighted combination of the figures of merit for all versions of randomly chosen points, and use that combination as the scoring function for the original choice of breaks instead of the root-sum-of-squares or whatever. So, this means at each breakset choice, it goes off and does a bunch of these evaluations with randomly deleted points, and comes up with a score to evaluate that breakset.

    The second way to evaluate break choices is to compare the placement of the original breaks and their number with the placement of breaks and number for each jackknifed series. The scoring rule for that is a little involved, but I’ll just argue it’s plausible to see how that could work: if the number of breaks is substantially different, or if the pattern of placements is substantially different, that suggests the model of the overall series isn’t a good model for the jackknifed series. So it may not be a good model of the series yet to come. Of course, that’s just one jackknifed pattern and it itself may be a statistical fluke, but if many of the jackknifed patterns disagree with the original overall pattern, then it’s arguable that it’s the original breaks pattern which is the oddball.

    This kind of procedure can be used to find breaks which work for so-called “out of bag” situations, for data which are not yet in hand.

    This kind of procedure is, in fact, the cornerstone of many modern statistical procedures, including ones built into so-called statistical learning algorithms.
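A rough sketch of the hidden-points idea, in one possible reading: for a candidate set of breaks, repeatedly hide a random subset of the points, fit the continuous piecewise-linear model to the rest, and score it by the mean squared error on the hidden points. This is ordinary repeated hold-out validation, which is an assumption about what the comment intends and not a transcription of it. The function names, the 20% hidden fraction, and the synthetic series are all hypothetical.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def design(t, t0, breaks):
    """Columns: constant, time since t0, and one hinge per breakpoint."""
    return [[1.0, ti - t0] + [max(0.0, ti - b) for b in breaks] for ti in t]

def fit(t, y, t0, breaks):
    """Least-squares coefficients of the continuous piecewise-linear model."""
    rows = design(t, t0, breaks)
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(ata, aty)

def holdout_score(t, y, breaks, rounds=50, hide_frac=0.2, seed=0):
    """Mean squared error on randomly hidden points, averaged over many rounds."""
    rng = random.Random(seed)
    n, t0, total = len(t), t[0], 0.0
    n_hide = max(1, int(hide_frac * n))
    for _ in range(rounds):
        hidden = sorted(rng.sample(range(n), n_hide))
        hidden_set = set(hidden)
        keep = [i for i in range(n) if i not in hidden_set]
        coef = fit([t[i] for i in keep], [y[i] for i in keep], t0, breaks)
        rows = design([t[i] for i in hidden], t0, breaks)
        pred = [sum(c * v for c, v in zip(coef, row)) for row in rows]
        total += sum((p - y[i]) ** 2 for p, i in zip(pred, hidden)) / n_hide
    return total / rounds

# Hypothetical series: 1 mm/yr until 1960, 4 mm/yr after, plus white noise.
years = list(range(1900, 2021))
noise = random.Random(42)
level = [1.0 * (yr - 1900) + 3.0 * max(0, yr - 1960) + noise.gauss(0, 5) for yr in years]

for candidate in ([], [1930], [1960], [1930, 1960, 1990]):
    print(candidate, round(holdout_score(years, level, candidate), 1))
```

A break set that merely chases noise tends to score worse on the hidden points than on the points it was fitted to, which is the sense in which this kind of score guards against over-fitting.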

  2. Concerning piecewise linear fits: to me they seem to be “unphysical”, because why should processes in nature have discontinuous derivatives?
    This is an open question, a starting point for exploration. Under which exact conditions does it make sense to assume discontinuous derivatives?
    I do like your lowess fits much more, because they are continuously differentiable.

  3. Yes, breaking the data into smaller segments naturally provides a better fit. It has no value in projecting the future though. What is the next segment going to do? What would be interesting is to start at the beginning of the data series and calculate a linear trend, sequentially adding thirty year segments. How good were earlier linear trends in predicting the future? How have they varied and by how much? The recent measurements are below the long term linear trend, for what that is worth. Probably nothing.

    • Did the piecewise linear have “predicting the future” as its goal? Absolutely not. It was a description of what happened.

      In general extrapolation is a terrible way of “predicting the future.” At the very least some kind of dynamical model with sensitive state and update structure is needed to have any chance of doing that. Such models typically have prediction errors which are close in the near future and grow as the time ahead increases. Weather forecasting models have this characteristic, too, although they are not, strictly speaking, dynamical models.

      Integrated climate models aren’t dynamical models at all and use a bunch of different forecasting devices and, as well, are ensemble models.

      Boosted models are an example of a relatively simple numerical model which can do forecasts and is not a dynamical model.

      • Extrapolation may be the best we have, if we have no good physical underlying mechanism at hand. It may even be as good as or better than an undercomplex physical model.
        We could make some reasonable assumptions about a complex system, like that it is in some definition “well behaving”, e.g. that a Taylor series or some other kind of series is a good fit for a certain time interval ahead. In a system with many small feedbacks this may be not so bad. The problem comes when some feedbacks become large, as in a mode change such as a shutdown of the AMOC.

      • “Extrapolation may be the best we have, if we have no good physical underlying mechanism at hand. It may even be as good as or better than an undercomplex physical model.”

        (Emphasis added.)

        Um, thermal expansion of seawater and the geodynamics of excess heights are pretty basic geophysical things. Even the ice sheet contributions are well-modeled now, or at least vastly better than they were 5-6 years ago.

        As Professor Syukuro Manabe, the recent Nobel laureate, answered when I asked about ice sheets after giving a lecture at Harvard a few years back, “Yes, it’s difficult, but computers are getting very fast. We have to model it.”

  4. This note is on Covid so it is off topic for this thread.

    The New York Times lists the amount of Covid in each state, using data from the CDC. In the list today (November 20) the top 10 states include 8 states that are Democratic run and had low covid two months ago and only 2 Republican states. Two months ago 8 of the top 10 states were Republican states. The bottom states are now 7 Republican states and only 3 Democratic states. Two months ago the lowest states were 90% Democratic.

    It appears to me that the states like Florida (where I live) that took no measures to control covid, and encouraged mixing with no masks, have reached herd immunity. Many states that flattened the curve appear to have relaxed their vigilance and with everyone indoors are now having issues. The delta covid is so catching that everyone who is not vaccinated is getting sick. It is interesting that states like Louisiana that have low vaccination rates (48%) have reached herd immunity from encouraging sickness while Maine, with a high vaccination rate (72%), now has high disease. Europe seems to be in the same boat as Maine. If 28% of the population are not vaccinated and masking is relaxed there are enough susceptible people to have problems. Published data suggests that there are 2 undiagnosed cases of covid for every reported case. In late May, 80% of the population had covid antibodies while only 50% were vaccinated.

    If the states that encouraged covid have reached herd immunity, how long will it take for the careful states to reach herd immunity? Will covid retreat from the entire country soon? It stands to reason that a state like Maine would have fewer susceptible people than Louisiana had three months ago. Then their curve would be high for a shorter amount of time. How much shorter?

    Tamino, what do you think?