# Sea Level PCA, from the Baltic Sea to el Niño

I’ve made a new estimate of global sea level based on a subset of all the tide gauge data in PSMSL (Permanent Service for Mean Sea Level). I’m just doing the 20th century, and I’ve used only those data sets with a decent amount of data, at least 360 monthly values since the year 1900. That leaves a “mere” 723 tide gauge stations to work with. Here’s where they are located:

And here’s the estimated global average sea level since the year 1900:

We can map the estimated corrections for vertical land movement (VLM), with upward-pointing blue triangles showing stations where sea level is rising faster than average because the land is sinking faster than average, and downward-pointing red triangles showing stations where sea level is rising slower than average because the land is rising faster than average. Bigger triangles indicate values farther from average:

One thing to note is that many of the stations with slower sea level rise due to rising land are those in the far north. This region is where the great ice sheets used to weigh down the continents but disappeared during the last deglaciation, so the land is now rebounding (glacial isostatic rebound).

I corrected each individual station record for its estimated VLM and aligned it with the other records. Then I computed the difference between this aligned, VLM-corrected data and the estimated global signal shown above. This gives a residual sea level time series for each station, one which represents the part of its departure from the global-average pattern that is unrelated to VLM.
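To make the residual step concrete, here is a minimal sketch in Python. It is not the code behind the figures: the function name `station_residual`, the treatment of VLM as a single linear rate per station, and alignment by removing the mean offset over the observed months are all assumptions made for illustration.

```python
import numpy as np

def station_residual(t, level, vlm_rate, global_signal):
    """Residual of one station relative to the global signal.

    t             : times in years (1-D array)
    level         : station sea level, NaN where a month is missing
    vlm_rate      : assumed linear VLM contribution to the station trend (per year)
    global_signal : estimated global sea level at the same times
    """
    t = np.asarray(t, dtype=float)
    level = np.asarray(level, dtype=float)
    global_signal = np.asarray(global_signal, dtype=float)

    # Remove the (assumed linear) vertical-land-movement contribution.
    corrected = level - vlm_rate * (t - t.mean())

    # Align: remove the mean offset from the global signal, using observed months only.
    diff = corrected - global_signal
    offset = np.nanmean(diff)

    # What is left is the departure from the global pattern; NaNs stay NaN.
    return diff - offset
```

Missing months simply remain NaN in the output, which is why the PCA step has to cope with gaps.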

I wanted to find which stations show similar patterns of changes, so I thought to apply principal component analysis (PCA). The problem is that “straight” PCA requires that each station report a data value at each time: missing values aren’t allowed. For sea level time series, it’s not just a case of stations missing a few values; many stations have large gaps or cover only a brief time span. The missing data problem is severe.

There’s a decent literature on PCA with missing data, much of which utilizes infilling — substituting estimated values for missing data. I’ve never cared for infilling, so I used a different technique. It has its strengths and weaknesses, but that’s a subject for another post.

I applied this PCA method to the residual series, and here’s the time series for the first PC:

It’s a pattern not of trend but of fluctuations, and it turns out to be the most prominent pattern which isn’t related to the global signal or to vertical land movement.

Here’s how different stations match this pattern, with upward-pointing red triangles for stations which show this pattern, downward-pointing blue triangles for stations which show the opposite pattern, and larger triangles for a stronger match:

The vast majority of the big triangles are the downward-pointing blue ones in the area of the Baltic Sea. Here’s another version of the same map, but for stations with values less than two standard deviations from the mean (which is most of them) I’ve replaced colored triangles with black x’s:
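The two-standard-deviation cut used for these maps can be sketched as follows. The function name `strong_stations` and the use of the plain mean and standard deviation of the station values are assumptions; the post does not say exactly how the threshold was computed.

```python
import numpy as np

def strong_stations(loadings, n_sigma=2.0):
    """Boolean mask: True where a station's value lies at least
    n_sigma standard deviations from the mean of all stations."""
    loadings = np.asarray(loadings, dtype=float)
    z = (loadings - loadings.mean()) / loadings.std()
    return np.abs(z) >= n_sigma
```

Stations where the mask is True would keep their colored triangle; the rest would be drawn as black x’s.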

This makes it clear that the pattern shown in the time series (actually, its negative) mainly matches the fluctuations shown by stations in the Baltic Sea area, as well as a few stations in odd locations.

For the 2nd PC the time series pattern looks like this:

The map looks like this:

All but a few of the triangles are small, the exceptions being a group of stations in North America. I can once again replace triangles within two standard deviations of the mean with black x’s, and zoom in on North America, giving this:

This shows that the 2nd PC is really a pattern of fluctuations in stations along the St. Lawrence River in Canada, where water levels are strongly managed by human activity; these stations are poor representatives of global sea level. In this case, principal component analysis succeeded not only in identifying a genuine coherent pattern but also in defining a set of stations which should be excluded from a “best” estimate of global sea level.

The time series for the 3rd PC shows an interesting trend pattern, a decline since around 2005:

The map has most of its big triangles in far northern regions:

Again, let me replace triangles with x’s for stations within 2 standard deviations of the mean, to highlight where the strong activity is concentrated:

The pattern is mainly found in stations around the Arctic Ocean, especially the coast of Siberia. A possibility, one I regard as speculative but interesting, is that the recent severe melting of glaciers has caused nearby sea levels to fall because there’s so much less ice that the local gravity in the area is reduced.

The fourth PC is the one I find most interesting. Here’s the time series as a blue line, and I’ve added a red line which is proportional to MEI, the Multivariate ENSO Index:

The match is outstanding; clearly the 4th PC in sea level deviations is the pattern induced by the el Niño southern oscillation.
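One way to put a number on such a match, and to choose the factor that makes an index “proportional” to a PC, is a least-squares scale plus a correlation coefficient. This is a sketch under assumptions: the name `scale_to_match` is hypothetical, both series are taken to be sampled at the same months, and the post does not say how the red line was actually scaled.

```python
import numpy as np

def scale_to_match(pc, index):
    """Least-squares scale factor for `index` against `pc`, plus their
    Pearson correlation, using only times where both are present."""
    pc = np.asarray(pc, dtype=float)
    index = np.asarray(index, dtype=float)
    ok = ~np.isnan(pc) & ~np.isnan(index)

    # Work with centered series so a constant offset does not matter.
    x = index[ok] - index[ok].mean()
    y = pc[ok] - pc[ok].mean()

    scale = np.dot(x, y) / np.dot(x, x)
    r = np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y))
    return scale, r
```

Plotting `scale * index` over the PC would give an overlay of the kind described above, and `r` summarizes how good the match is.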

The global pattern of response confirms what is expected:

Here’s another version, in which I’ve replaced triangles with x’s, not for stations within 2 standard deviations of the mean, but for stations within the interquartile range:

We see that during a strong el Niño, sea level is higher in the east Pacific, while it’s lower in the west Pacific and the Arctic Ocean.

I was surprised, and pleased, by the results of this exploratory analysis. Finding the imprint of el Niño on sea level was an unexpected but especially satisfying result. There’s more to be done to understand quite what the results mean.

I think that next I’ll redo the global analysis but omit those stations along the St. Lawrence River in Canada; they just don’t belong in an estimate of global sea level. There are other individual stations which should be omitted too (e.g. stations which show a strong shift due to earthquake activity). Some might say there’s a lot of work to do on this. I might say, ain’t we got fun?


### 17 responses to “Sea Level PCA, from the Baltic Sea to el Niño”

1. jacob l

kinda surprised that none of the PCs represent the rise in sea level.
Is there any reason for this ?

[Response: It’s by design; the “residuals” being analyzed are the *differences* between the station’s aligned/VLM-corrected data and the global signal. The global signal is that “rise in sea level” and has already been removed.]

2. mrkenfabian

“A possibility, one I regard as speculative but interesting, is that the recent severe melting of glaciers has caused nearby sea levels to fall because there’s so much less ice that the local gravity in the area is reduced.”

I didn’t think it was speculative.
This video shows sea levels falling near places where a lot of ice has been lost, especially around Greenland and Pacific Alaska – I don’t know how to embed the link but it is worth a view.

3. David B. Benson

Tamino, I opine that one must look elsewhere to explain the large magnitude of PC 3 on the coast of Siberia. Siberia is a long way from Greenland and the ice coasts of Eastern Canada along the Arctic Ocean. The video posted by mrkenfabian demonstrates that there is little effect in the eastern Arctic Ocean.

An alternative, especially given the recent change in PC 3, is that the land is subsiding as the methane leaks out of the permafrost. The Siberian Times has many articles with photographs of sinkholes growing and lakes bubbling. A hypothesis, to be sure.

4. Neil White

With regard to the falling sea level along (e.g.) the south-eastern coast of Alaska the apparent fall in sea level is because the land has risen at a number of locations due to the removal of ice mass, so a tide gauge attached to the land will show an anomalously low rate of sea-level rise.

This is an instantaneous elastic response in the Earth’s crust. This is related to GIA except that GIA is a function of the very viscous mantle (beneath the crust) which has response times of the order of thousands of years. The Eastern seaboard of the USA and much of Scandinavia are rising due to GIA related to the last Ice Age. Much of the rise in Alaska is due to a very quick response to modern-day ice loss.

There is also a gravitational effect due to the loss of ice mass. Jerry Mitrovica has done a lot of work on this and his work (and at least one online video) have been mentioned on this blog before. “Fingerprinting” is a term that is often used in discussion of this quick response to modern-day ice loss.

Neil

5. Neil White

Correction to my previous message:

The sentence in the second paragraph:

The Eastern seaboard of the USA and much of Scandinavia are rising due to GIA related to the last Ice Age.

Should be:

The Eastern seaboard of the USA is SINKING and much of Scandinavia is RISING due to GIA related to the last Ice Age.

I have no idea how I garbled that!

Neil

6. Neil White

(Possibly repeated) Correction to my previous post:

The sentence in the second paragraph:

“The Eastern seaboard of the USA and much of Scandinavia are rising due to GIA related to the last Ice Age.”

Should be:

“The Eastern seaboard of the USA IS SINKING and much of Scandinavia IS RISING due to GIA related to the last Ice Age.”

Neil

7. This is very impressive. Thanks for sharing your work with us.

8. Thanks Tamino for your repeatedly good job.

I lack any education in the domain you show us here, and you are, as opposed to many scientists, a very good teacher.

I especially enjoyed the amazing PC4 corner.

J.-P.

9. Uli

“I’ve never cared for infilling, so I used a different technique. It has its strengths and weaknesses, but that’s a subject for another post.”
Which technique do you use instead of infilling?

[Response: I find the best fit of the data matrix values which are present, as a *simple* matrix (i.e. the outer product of two vectors). Best fit is by minimizing the sum of squares of differences (for observed values only, not for the missing values we don’t know). It shows some odd behaviors when many stations have very large numbers of missing values — but the results make sense and the odd behaviors seem tractable. I think the method needs more study, and some things like “centering” and “normalizing” get weird with missing data.]
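The rank-1 fit described in the response above can be sketched with alternating least squares. This is an illustration only: the response does not say how the minimization is carried out, so the solver, the name `rank1_fit_observed`, the starting guess, and the layout (rows are stations, columns are months, NaN marks missing values) are all assumptions.

```python
import numpy as np

def rank1_fit_observed(X, n_iter=200, tol=1e-10):
    """Fit X ~ outer(u, v) by minimizing squared error over observed entries only.

    X : 2-D array, rows = stations, columns = times, NaN = missing.
    Returns (u, v, sse): station weights, time pattern, and the sum of
    squared residuals over the observed entries.
    """
    X = np.asarray(X, dtype=float)
    M = (~np.isnan(X)).astype(float)      # 1 where observed, 0 where missing
    X0 = np.where(M > 0, X, 0.0)          # missing entries contribute nothing

    v = np.ones(X.shape[1])               # crude starting guess for the time pattern
    u = np.zeros(X.shape[0])
    prev = np.inf
    sse = np.inf
    for _ in range(n_iter):
        # With v fixed, each u_i is a one-parameter least-squares fit over observed j.
        u = (X0 @ v) / np.maximum(M @ (v ** 2), 1e-12)
        # With u fixed, each v_j is a one-parameter least-squares fit over observed i.
        v = (u @ X0) / np.maximum((u ** 2) @ M, 1e-12)

        resid = M * (X0 - np.outer(u, v))
        sse = float(np.sum(resid ** 2))
        if prev - sse < tol * max(sse, 1.0):
            break
        prev = sse
    return u, v, sse
```

Only the product `outer(u, v)` is determined; the overall scale and sign can be moved between `u` and `v`, so some normalization convention is needed before comparing patterns. Further components could be sought by fitting the residuals again, but whether that matches the method used in the post is not stated.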

• Uli

Ok, interesting. So you are using a best-fit rank-1 approximation to the full matrix to get plausible values for the missing data.
What about a higher rank k approximation? Have you tried it?

I’d used linear regression to fill in missing data, but had to look at the missing pattern before.
Fortunately, in most cases, I had PCA for model output and so no problems with missing data.

10. Many thanks for another interesting analysis. Keep having fun (and enlightening!)

Minor note, which you may already be aware of: The captions for figures 1 & 3 are badly ‘typoed’–“Tide Guage Stattions.” Ouch!

11. Another purely speculative explanation for PC3 would be changes in sea ice cover. The large change since 2005 points in this direction. The dynamic pattern of sea level height in the Arctic might be affected by changes in wind stress (a given wind acting on sea ice vs. open ocean) altering ocean currents and sea level height gradients. But I don’t even know the expected sign of this difference.

12. I’m pretty sure it’s “gauge”

13. numerobis

It’s unfortunate there’s no data on Greenland or the Canadian Arctic, not even the Arctic coast of Alaska. Looks like the tide gauges aren’t reporting to PSMSL; they exist, but PSMSL is showing data records for a year here or there.

• numerobis (and Tamino)

Oh, how interesting! An Egyptian architect :-) with some interest in Arctic tide gauge data.

It’s a pity indeed that though there are 10 PSMSL tide gauges in Greenland, none provides RLR data (only ‘metric’, no idea how to process that).

But for Alaska there are some:

1857; 70.400000; -148.526667; PRUDHOE BAY, ALASKA
2302; 55.061667; -162.326667; KING COVE, DEER PASSAGE
1634; 55.336667; -160.501667; SAND POINT, POPOF IS., AK
2304; 57.125000; -170.285000; VILLAGE COVE, ST. PAUL ISLAND
1179; 57.745000; -152.481667; ST. PAUL’S HARBOR, KODIAK
2301; 56.898333; -154.246667; ALITAK, LAZY BAY
567; 57.731667; -152.511667; KODIAK ISLAND, WOMENS BAY
1067; 61.238333; -149.890000; ANCHORAGE
1070; 59.440000; -151.720000; SELDOVIA
266; 60.120000; -149.426667; SEWARD
1353; 61.125000; -146.361667; VALDEZ
566; 60.558333; -145.751667; CORDOVA
2300; 58.195000; -136.346667; ELFIN COVE, PORT ALTHORP
445; 59.548333; -139.733333; YAKUTAT
426; 57.051667; -135.341667; SITKA
2299; 56.246667; -134.646667; PORT ALEXANDER
495; 59.450000; -135.326667; SKAGWAY
405; 58.298333; -134.411667; JUNEAU
225; 55.331667; -131.625000; KETCHIKAN

Unlike data cracks a la Tamino, Nick Stokes etc, I can generate anomalies only wrt one or more reference periods.
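For readers unfamiliar with the reference-period approach, a minimal sketch: each station’s anomaly is its value minus its own mean over the reference period, and stations with no data in that period are skipped. The function name `anomalies` and the `min_ref_count` parameter are hypothetical; the default period is the 1993-2013 window mentioned in this comment.

```python
import numpy as np

def anomalies(years, values, ref_start=1993, ref_end=2013, min_ref_count=1):
    """Anomalies of one station relative to its mean over a reference period.

    Returns None if the station has fewer than min_ref_count observed
    values inside [ref_start, ref_end].
    """
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    in_ref = (years >= ref_start) & (years <= ref_end) & ~np.isnan(values)
    if in_ref.sum() < min_ref_count:
        return None
    return values - values[in_ref].mean()
```

Averaging across stations would then be a NaN-aware mean over the stations that returned an anomaly series.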

The average of anomalies out of those having data for the reference period 1993-2013 (13 of 21) looks like what I obtain for the stations in the Gulf of Bothnia between Sweden and Finland: