Many climate signals show both trend and cycle (usually an annual cycle) together. A typical example is the concentration of carbon dioxide (CO2) in the atmosphere. If you look at the data (say, from the Mauna Loa atmospheric observatory) both the trend and the cycle are obvious.
It’s often useful to remove the cycle in order to isolate the trend better. When we do so, the de-cyclified data are often called anomaly data. When the data are monthly averages, the most common way to define anomaly is to compute a separate average for each calendar month, then to define anomaly as the difference between a given month’s data and the average for that same calendar month.
But when trend and cycle coexist and both are strong, the two can to a limited degree confound each other. Such is the case for atmospheric CO2 data. Let’s take a close look.
We’ll start with some artificial CO2 data for which the trend is exactly linear, and the annual cycle is a pure sinusoid. It’s hard to imagine a simpler test case for the combination of trend plus cycle. And here it is:
Let’s apply the usual procedure — we’ll compute the average value for each month, then subtract the appropriate monthly average from each data value. We hope that this will remove the annual cycle perfectly, leaving only the trend — which we know (because we designed it that way) is a perfectly linear increase over time. But that’s not what we get:
That’s not the perfectly linear trend we were hoping for! Instead it’s a series of step changes, with each year showing a constant value while the yearly averages increase steadily. What went wrong?
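To make the step pattern concrete, here is a minimal sketch in Python with NumPy. The numbers (a 370 ppm baseline, a rise of 2 ppm per year, a 3 ppm sinusoid, ten years of data) are made up for illustration and are not the values behind the plots; the function name `monthly_mean_anomaly` is likewise just a label chosen here.

```python
import numpy as np

def monthly_mean_anomaly(values, months):
    """Subtract from each value the average of all values in the same calendar month."""
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    anomaly = np.empty_like(values)
    for m in np.unique(months):
        mask = months == m
        anomaly[mask] = values[mask] - values[mask].mean()
    return anomaly

# Artificial series (illustrative numbers only): linear trend plus pure sinusoid.
n_years = 10
k = np.arange(12 * n_years)      # running month index
t = (k + 0.5) / 12               # time in years since the start, mid-month
months = k % 12                  # calendar month, 0..11
co2 = 370.0 + 2.0 * t + 3.0 * np.sin(2 * np.pi * t)

anom = monthly_mean_anomaly(co2, months)
by_year = anom.reshape(n_years, 12)   # one row per year

# Every row is constant, and consecutive rows differ by the yearly rise.
print(by_year[:, 0])
print(np.ptp(by_year, axis=1).max())
```

Within each year the twelve anomaly values are identical, and the jump from one year to the next equals the yearly increase of the trend. The within-year part of the trend has been absorbed into the monthly averages.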
The root of the problem is that the two signals, a linear trend and an annual cycle, actually resemble each other slightly. This is perhaps clearest if we consider what would happen if we had only one year’s data. Then the monthly averages would be the individual monthly values, we would subtract them from themselves to define anomaly, and all the anomaly values would be zero, regardless of what the underlying trend might be!
As another illustrative case, suppose there were a perfectly linear trend but no annual cycle. If we then define anomaly in the customary way, we’ll end up with the same step-like anomaly values as in the trend-plus-cycle case. (It may be an enlightening exercise for the reader to do just that, and see what happens.)
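Both cases follow from a short calculation. The notation here is introduced only for this sketch: year index $y$ running over $N$ whole years, month index $m = 0, \dots, 11$, intercept $a$, slope $b$ per year, and an arbitrary annual cycle $c_m$.

$$x_{y,m} = a + b\left(y + \tfrac{m}{12}\right) + c_m$$

$$\bar{x}_m = \frac{1}{N}\sum_{y} x_{y,m} = a + b\left(\bar{y} + \tfrac{m}{12}\right) + c_m$$

$$x_{y,m} - \bar{x}_m = b\,(y - \bar{y})$$

The cycle $c_m$ cancels whatever its shape, including $c_m = 0$, and so does the within-year part of the trend, $b\,m/12$. What remains depends only on the year, which is the step pattern. With a single year of data, $y = \bar{y}$ and every anomaly is zero.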
Another, less common but still popular, way to define anomaly is to approximate the annual cycle, not by computing averages for each month, but by fitting a Fourier series. I fit a 4th-order Fourier series to the artificial data, then took the residuals to define anomaly values, and got this:
It’s more of the same. Although the values don’t show exact step changes from year to year, they do so approximately. The root problem is still there: an annual cycle and a trend resemble each other slightly.
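A rough sketch of the cycle-only Fourier fit, again in Python with NumPy and on made-up artificial data (370 ppm baseline, 2 ppm per year, 3 ppm sinusoid, ten years). The helper name `fourier_design` is an assumption of this sketch, and ordinary least squares via `np.linalg.lstsq` is one reasonable way to do the fit, not necessarily the one used for the plot.

```python
import numpy as np

def fourier_design(t, order):
    """Design matrix: a constant plus cos/sin pairs at 1..order cycles per year."""
    cols = [np.ones_like(t)]
    for j in range(1, order + 1):
        cols.append(np.cos(2 * np.pi * j * t))
        cols.append(np.sin(2 * np.pi * j * t))
    return np.column_stack(cols)

# Artificial series (illustrative numbers only).
n_years = 10
k = np.arange(12 * n_years)
t = (k + 0.5) / 12               # years since the start, mid-month
co2 = 370.0 + 2.0 * t + 3.0 * np.sin(2 * np.pi * t)

# Fit the cycle alone (no trend term) and keep the residuals as "anomaly".
X = fourier_design(t, order=4)
coef, *_ = np.linalg.lstsq(X, co2, rcond=None)
resid = co2 - X @ coef

by_year = resid.reshape(n_years, 12)
yearly_mean = by_year.mean(axis=1)
within_year_spread = np.abs(by_year - yearly_mean[:, None]).max()

print(np.diff(yearly_mean))      # jump from each year to the next
print(within_year_spread)        # wobble inside a year, small next to the jump
```

The yearly means climb by the full yearly rise of the trend, while the values inside any one year only wobble slightly around that year’s mean. The low-order harmonics have soaked up most of the within-year slope.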
Suppose we use actual data (from Mauna Loa):
If we compute anomaly using the “subtract individual monthly averages” method we get this:
Once again it’s approximately a step change each year. Using the “fit a Fourier series” method gives essentially the same result:
What to do?
If we know the form of the trend and cycle, then the best solution is to fit both of them simultaneously. These data only cover the time span from 2000 to the present, and for that period we know the trend is very close to linear. We can also approximate the annual cycle very closely by a 4th-order Fourier series. If we fit both patterns simultaneously to define what the annual cycle really looks like, then remove the annual cycle from the data to define anomaly, we get this:
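Here is one way the simultaneous fit might look, sketched in Python with NumPy on artificial data rather than the Mauna Loa record. The numbers and the helper name `trend_plus_fourier_design` are assumptions made for illustration. The key point is that the linear term and the Fourier terms sit in the same least-squares problem, and only the Fourier part is subtracted afterwards.

```python
import numpy as np

def trend_plus_fourier_design(t, order):
    """Columns: constant, centered time, then cos/sin pairs at 1..order cycles per year."""
    cols = [np.ones_like(t), t - t.mean()]
    for j in range(1, order + 1):
        cols.append(np.cos(2 * np.pi * j * t))
        cols.append(np.sin(2 * np.pi * j * t))
    return np.column_stack(cols)

# Artificial series (illustrative numbers only): linear trend plus pure sinusoid.
n_years = 10
k = np.arange(12 * n_years)
t = (k + 0.5) / 12               # years since the start, mid-month
co2 = 370.0 + 2.0 * t + 3.0 * np.sin(2 * np.pi * t)

# Fit trend and cycle together, so neither can masquerade as the other.
X = trend_plus_fourier_design(t, order=4)
coef, *_ = np.linalg.lstsq(X, co2, rcond=None)

# The annual cycle is the Fourier part only (columns 2 onward).
cycle = X[:, 2:] @ coef[2:]
anom = co2 - cycle               # data with the annual cycle removed

print(coef[1])                   # fitted slope, in units per year
print(np.diff(anom)[:3])         # month-to-month change in the anomaly
```

On this noise-free test case the anomaly rises smoothly from month to month, with no steps, because the slope was credited to the trend term instead of being split between trend and cycle.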
Now we’re getting somewhere! These are realistic anomaly values.
Of course that was easy because we knew what the trend looked like (to an excellent approximation). As for the annual cycle, a Fourier series with enough terms can always mimic that quite closely. But what if we didn’t know what the trend looked like? What if it was strongly nonlinear? Then what would we do?
My usual practice is a three-step procedure. First I smooth the data, usually using a lowess smooth, in order to remove the trend approximately. I use a long enough “timescale” for the smooth that it can’t mimic the short-term fluctuations due to the annual cycle (at least, not enough to worry about). Then I take the residuals from that smooth, and use them to estimate the annual cycle. Finally, I subtract the estimated annual cycle from the original data to define my anomaly values.
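The three steps can be sketched as follows in Python with NumPy. Two caveats: the smoother below is a bare-bones local linear fit with tricube weights and a fixed window, standing in for a real lowess (it has no robustness iterations and no nearest-neighbor span), and the cycle is estimated here by monthly averages of the residuals, which is one choice among several. The function names, the 3-year half-width, and the artificial data with a curved trend are all assumptions of this sketch.

```python
import numpy as np

def local_linear_smooth(t, y, half_width):
    """Stand-in for lowess: tricube-weighted straight-line fit around each point."""
    smooth = np.empty_like(y)
    for i, t0 in enumerate(t):
        u = np.abs(t - t0) / half_width
        w = np.where(u < 1, (1 - u**3) ** 3, 0.0)      # tricube weights
        sw = np.sqrt(w)
        X = np.column_stack([np.ones_like(t), t - t0])
        coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        smooth[i] = coef[0]                            # fitted value at t0
    return smooth

def three_step_anomaly(t, y, months, half_width=3.0):
    # Step 1: smooth on a timescale too long to follow the annual cycle.
    smooth = local_linear_smooth(t, y, half_width)
    # Step 2: estimate the annual cycle from the residuals of the smooth.
    resid = y - smooth
    cycle = np.array([resid[months == m].mean() for m in range(12)])
    # Step 3: subtract the estimated cycle from the ORIGINAL data.
    return y - cycle[months]

# Artificial series (illustrative numbers only) with a nonlinear trend.
n_years = 20
k = np.arange(12 * n_years)
t = (k + 0.5) / 12
months = k % 12
trend = 370.0 + 1.5 * t + 0.02 * t**2
co2 = trend + 3.0 * np.sin(2 * np.pi * t)

anom = three_step_anomaly(t, co2, months)
print(np.abs(anom - trend).max())   # largest departure from the known trend
```

Because the trend is known in this test case, the anomaly can be checked against it directly. The departure should be small next to the 3-unit cycle, even though nothing in the procedure assumed the trend’s shape.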
If I do exactly that to the Mauna Loa CO2 data, I get this:
That’s more like it! In fact it bears a striking resemblance to what I got using the linear-plus-Fourier model (based on the fact that I knew the trend was approximately linear), but I don’t have to assume a linear trend, or any form of trend for that matter, so this works in all cases. I can even compare the results of the two methods:
As you can see, their results are so close it’s hard to tell the two estimates apart.
In many cases the trend and cycle are so different in size that the choice of method makes little difference. Then it’s reasonably safe to use the “subtract individual monthly averages” method. Examples are global average temperature, and sea ice extent. In such cases the error introduced by the simple method is small enough that it can safely be ignored. But it’s still there, and for best results a more sophisticated method is a good idea.
In fact, if the data don’t cover a whole number of cycles (a whole number of years, for an annual cycle), other complications arise. It’s hardly necessary to abandon the simple “subtract individual monthly averages” method — it’s both effective and easy, and in most cases plenty accurate — but it’s still a good idea to be aware of the exceptional cases, and to be prepared for them.