Trend and Cycle Together

Many climate signals show both trend and cycle (usually an annual cycle) together. A typical example is the concentration of carbon dioxide (CO2) in the atmosphere. If you look at the data (say, from the Mauna Loa atmospheric observatory) both the trend and the cycle are obvious.

It’s often useful to remove the cycle in order better to isolate the trend. When we do so, the de-cyclified data are often called anomaly data. When the data are monthly averages, the most common way to define anomaly is to compute a separate average for each month, then to define anomaly as the difference between a given month’s data and the average for that same month.
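As a concrete sketch of that procedure (Python/NumPy, with illustrative names — `monthly_anomaly` is not from any particular library):

```python
import numpy as np

def monthly_anomaly(values, months):
    """Anomaly: each value minus the average of its own calendar month.

    values -- 1-D array of monthly data
    months -- calendar-month index (0..11) for each value
    """
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    anomaly = np.empty_like(values)
    for m in range(12):
        mask = months == m
        anomaly[mask] = values[mask] - values[mask].mean()
    return anomaly
```

For a pure annual cycle with no trend, this returns zeros exactly, since each month's average equals that month's repeated value.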

But when trend and cycle coexist and both are strong, the two can to a limited degree confound each other. Such is the case for atmospheric CO2 data. Let’s take a close look.

We’ll start with some artificial CO2 data for which the trend is exactly linear, and the annual cycle is a pure sinusoid. It’s hard to imagine a simpler test case for the combination of trend plus cycle. And here it is:

Let’s apply the usual procedure — we’ll compute the average value for each month, then subtract the appropriate monthly average from each data value. We hope that this will remove the annual cycle perfectly, leaving only the trend — which we know (because we designed it that way) is a perfectly linear increase over time. But that’s not what we get:

That’s not the perfectly linear trend we were hoping for! Instead it’s a series of step changes, with each year showing a constant value while the yearly averages increase steadily. What went wrong?
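The step pattern is easy to reproduce. Here is a minimal sketch, using an illustrative slope of 0.2 per month and cycle amplitude of 3 (arbitrary numbers, not the actual CO2 values):

```python
import numpy as np

# Artificial data: exactly linear trend plus a pure annual sinusoid.
n_years = 5
t = np.arange(12 * n_years, dtype=float)            # time in months
data = 0.2 * t + 3.0 * np.sin(2 * np.pi * t / 12)

# The "subtract individual monthly averages" method.
months = np.arange(12 * n_years) % 12
anomaly = np.empty_like(data)
for m in range(12):
    mask = months == m
    anomaly[mask] = data[mask] - data[mask].mean()
```

Within each year the anomaly is constant, and consecutive years differ by exactly 12 × 0.2: the steps described above.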

The root of the problem is that the two signals, a linear trend and an annual cycle, actually resemble each other slightly. This is perhaps clearest if we consider what would happen if we had only one year’s data. Then the monthly averages would be the individual monthly values, we would subtract them from themselves to define anomaly, and all the anomaly values would be zero, regardless of what the underlying trend might be!

As another illustrative case, suppose there was a perfectly linear trend but no annual cycle. If we then define anomaly in the customary way, we’ll end up with the same anomaly values just shown. (It may be an enlightening exercise for the reader to do just that, and see what happens).

Another, less common but still popular, way to define anomaly is to approximate the annual cycle, not by computing averages for each month, but by fitting a Fourier series. I fit a 4th-order Fourier series to the artificial data, then took the residuals to define anomaly values, and got this:
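A sketch of this variant, fitting a 4th-order Fourier series (plus a constant) by least squares to the same kind of artificial data used above:

```python
import numpy as np

def fourier_design(t, period=12.0, order=4):
    """Columns: constant, then sine/cosine pairs up to the given order."""
    cols = [np.ones_like(t)]
    for k in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

t = np.arange(60, dtype=float)                       # 5 years of months
data = 0.2 * t + 3.0 * np.sin(2 * np.pi * t / 12)    # trend + cycle

X = fourier_design(t)
coef, *_ = np.linalg.lstsq(X, data, rcond=None)
anomaly = data - X @ coef      # residuals from the fitted Fourier series
```

The residuals still carry the year-to-year steps of the trend, because harmonics of the annual period cannot absorb a steady increase; the harmonics do soak up some within-year trend, which is why the steps are only approximate here.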

It’s more of the same. Although the values don’t show exact step changes from year to year, they do so approximately. The root problem is still there — that an annual cycle and a trend resemble each other slightly.

Suppose we use actual data (from Mauna Loa):

If we compute anomaly using the “subtract individual monthly averages” method we get this:

Once again it’s approximately a step change each year. Using the “fit a Fourier series” method gives essentially the same result:

What to do?

If we know the form of the trend and cycle, then the best solution is to fit both of them simultaneously. These data only cover the time span from 2000 to the present, and for that period we know the trend is very close to linear. We can also approximate the annual cycle very closely by a 4th-order Fourier series. If we fit both patterns simultaneously to define what the annual cycle really looks like, then remove the annual cycle from the data to define anomaly, we get this:
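On artificial data, the simultaneous fit can be sketched like this (a real analysis would use the actual Mauna Loa values and dates; the numbers here are illustrative):

```python
import numpy as np

t = np.arange(60, dtype=float)                       # months since start
data = 0.2 * t + 3.0 * np.sin(2 * np.pi * t / 12)    # stand-in series

# One design matrix: intercept, linear trend, and a 4th-order Fourier
# series for the annual cycle, all fit simultaneously.
cols = [np.ones_like(t), t]
for k in range(1, 5):
    cols.append(np.sin(2 * np.pi * k * t / 12))
    cols.append(np.cos(2 * np.pi * k * t / 12))
X = np.column_stack(cols)

coef, *_ = np.linalg.lstsq(X, data, rcond=None)
cycle = X[:, 2:] @ coef[2:]        # the fitted annual cycle alone
anomaly = data - cycle             # remove only the cycle; keep the trend
```

Because the trend column is in the model, the cycle estimate no longer has to soak up any of the trend, and the anomaly recovers the linear increase intact.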

Now we’re getting somewhere! These are realistic anomaly values.

Of course that was easy because we knew what the trend looked like (to an excellent approximation). As for the annual cycle, a Fourier series with enough terms can always mimic that quite closely. But what if we didn’t know what the trend looked like? What if it was strongly nonlinear? Then what would we do?

My usual practice is a three-step procedure. First I smooth the data, usually using a lowess smooth, in order to remove the trend approximately. I use a long enough “timescale” for the smooth that it can’t mimic the short-term fluctuations due to the annual cycle (at least, not enough to worry about). Then I take the residuals from that smooth, and use them to estimate the annual cycle. Finally, I subtract the estimated annual cycle from the original data to define my anomaly values.
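The three steps can be sketched as follows. To keep the sketch dependency-free, a simple local-mean smooth stands in for lowess — an assumption for illustration, not the author's exact choice:

```python
import numpy as np

def smooth(y, halfwidth=12):
    """Crude long-timescale smooth (local mean), standing in for lowess."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - halfwidth), min(n, i + halfwidth + 1)
        out[i] = y[lo:hi].mean()
    return out

# Step 1: smooth the data to remove the trend approximately.
t = np.arange(120, dtype=float)                      # 10 years, monthly
data = 0.2 * t + 3.0 * np.sin(2 * np.pi * t / 12)    # stand-in series
trend_est = smooth(data)

# Step 2: estimate the annual cycle from the residuals of the smooth.
resid = data - trend_est
months = np.arange(120) % 12
cycle = np.array([resid[months == m].mean() for m in range(12)])

# Step 3: subtract the estimated cycle from the ORIGINAL data.
anomaly = data - cycle[months]
```

The smoothing timescale (here, a two-year window) is deliberately too long to mimic the annual wiggle, so the cycle estimate in step 2 is nearly uncontaminated by trend, and the anomaly tracks the true linear trend closely.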

If I do exactly that to the Mauna Loa CO2 data, I get this:

That’s more like it! In fact it bears a striking resemblance to what I got using the linear-plus-Fourier model (based on the fact that I knew the trend was approximately linear), but I don’t have to assume a linear trend, or any form of trend for that matter, so this works in all cases. I can even compare the results of the two methods:

As you can see, their results are so close it’s hard to tell the two estimates apart.

In many cases the trend and cycle are so different in size that the confounding makes little difference. Then it’s reasonably safe to use the “subtract individual monthly averages” method. Examples are global average temperature and sea ice extent. In such cases the error introduced by the simple method is small enough that it can safely be ignored. But it’s still there, and for best results a more sophisticated method is a good idea.

In fact, if the data don’t cover a whole number of cycles (a whole number of years, for an annual cycle), other complications arise. It’s hardly necessary to abandon the simple “subtract individual monthly averages” method — it’s both effective and easy, and in most cases plenty accurate — but it’s still a good idea to be aware of the exceptional cases, and to be prepared for them.


10 responses to “Trend and Cycle Together”

  1. Nice explanation and demonstration.

  2. Eye-opening–for me at least!

    Cycle-trend resemblance–because (for instance) a sinusoid represents the one-dimensional vector velocity of a point tracing a circular orbit at constant velocity?

  3. As usual, Tamino, your statistical explanation educates, entertains, and motivates me. You mention the sea ice data — I think it’s a bit challenging isn’t it? — because the annual cycle has a trend to it such that the summer extent is declining faster than the winter? Maybe this is an exercise I should try for myself.
    I will ask, instead, about your process if the cycle rather than the trend is the most interesting thing. I work with fish, and their long term trends can be affected by so many factors. But short term apparent cycles are perhaps more tractable, and can be instructive biologically. Do you use a different procedure or do you basically follow the same approach through step two of the procedure you describe above?

    [Response: When the cycle is changing in interesting ways, I like to apply wavelet analysis.]

  4. Why not just use a least squares fit to a model incorporating both 12 monthly offsets and a trend? I used that here; the arithmetic is analogous to OLS regression.

    [Response: No reason not to. It’s analogous to the simultaneous fit of trend and Fourier series, but using 12 monthly offsets rather than a Fourier series for the cycle. More than one way to skin a cat.

    Of course that presumes you know the form of the trend (at least approximately). And there are cases in which the data are not evenly spaced in time, or in which the time spacing is not commensurate with the period (weekly data with an annual cycle), in which case it’s not possible to define anything like “monthly offsets.”]
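For concreteness, the offsets-plus-trend regression described in this exchange might be sketched like this (artificial data and illustrative numbers, not the commenter's actual code):

```python
import numpy as np

t = np.arange(120, dtype=float)                      # months since start
data = 0.2 * t + 3.0 * np.sin(2 * np.pi * t / 12)    # stand-in series

# Design matrix: 12 monthly offset dummies (no separate intercept,
# since the dummies sum to a constant) plus a linear trend column.
months = np.arange(120) % 12
X = np.zeros((120, 13))
X[np.arange(120), months] = 1.0
X[:, 12] = t

coef, *_ = np.linalg.lstsq(X, data, rcond=None)
slope = coef[12]            # recovers the true trend of 0.2 per month
offsets = coef[:12]         # recovers the annual cycle, month by month
```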

  5. Could you, just in short, provide some highlights demolishing this rubbish? It seems very much inspired by (or aimed at) you.

  6. The rubbish hardly needs to be demolished. Doing statistics does not relieve one of the obligation of also doing science. The folks at WUWT don’t seem to realize that.

    Who cares what kinds of “steps” one can or cannot statistically extract from station data? That doesn’t change for a moment the fact that the world is warming, and that we are responsible for it.

  7. While it’s true that there’s “more than one way to skin a cat,” by far the best way to pick out the trend in a signal with a periodic component is to use a moving average or boxcar filter of width equal to the period p.

    In practice such a component tends to be quasiperiodic, in which case the boxcar filter works merely “pretty well.” But if the component signal s(t) happens to be perfectly periodic, meaning that s(t) = s(t+p) for all times t, the boxcar filter removes exactly 100.00% of the period p component! No ifs, ands, or buts about it.
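The exact-removal property claimed here is easy to check numerically (a sketch with an illustrative trend and sinusoid):

```python
import numpy as np

# A width-12 boxcar exactly removes a period-12 component, because any
# 12 consecutive equally spaced samples of the cycle sum to zero.
t = np.arange(120, dtype=float)
data = 0.2 * t + 3.0 * np.sin(2 * np.pi * t / 12)

box = np.ones(12) / 12.0
smoothed = np.convolve(data, box, mode="valid")   # 120 - 12 + 1 = 109 points
# smoothed is exactly linear: 0.2 * (t + 5.5) at each valid position.
```

Note the output is 11 points shorter than the input, which is the "lose half a period on each end" cost raised in the response below.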

    Although moving-average filters are usually justified by their ease of implementation (what programmers care about), linear-time operation (a computational complexity property), and finite support (an important wavelet property), in hindsight one might say that they were invented to remove periodic signals, for which the rectangle is the uniquely optimal wavelet.

    To see this in the frequency domain, use the fact that the Fourier transform of the box is the sinc function sin(x)/x, whose zeros are at all the harmonics of the corresponding fundamental frequency, which is sufficient to completely kill any periodic function of that period. However it is even easier to see this in the time domain without talking about frequency, left as an easy exercise (hint: consider what happens to the periodic component when using the obvious implementation).

    The Mauna Loa data illustrate this nicely, as can be seen here. Note the essentially complete absence of leakage of high frequencies.

    The period since 2000 as illustrated above can be seen here. If anything it is even smoother, having no Pinatubo eruption (1991) to throw things off.

    One might expect that any width very close to 12 would work just as well. This is easily refuted with boxcar filters of width 11 and width 13 where the leakage of high frequencies is plainly visible, as it is also in other non-boxcar filters, for example all filters used in the original post.

    These examples underline the point that a perfect boxcar filter is perfect for removing one perfectly periodic signal, a criterion that the annual fluctuations in the Mauna Loa data would appear to come very close indeed to meeting. When the periodic component is so perfect, even slight departures of any kind from the applicable boxcar filter are easily seen.

    To remove more than one periodic signal, apply the respective filters consecutively. This can be done in either order since convolution is commutative. It is also associative so grouping is irrelevant.
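A quick numerical check of the order-independence (random stand-in data):

```python
import numpy as np

# Convolution is commutative, so two consecutive boxcar filters give
# the same result in either order.
rng = np.random.default_rng(42)
x = rng.normal(size=200)

box12 = np.ones(12) / 12.0
box4 = np.ones(4) / 4.0

ab = np.convolve(np.convolve(x, box12), box4)
ba = np.convolve(np.convolve(x, box4), box12)
# ab and ba agree to floating-point precision.
```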

    Convolution is not however idempotent: applying the same filter twice has the effect of squaring the corresponding frequency response, thereby reducing the side lobes of the sinc function at the expense of also reducing frequencies lower than but still near the rejected fundamental. This is not relevant to modern CO2 data, which has only one visibly periodic component, but is very relevant to the multiple periods encountered in temperature data.

    [Response: A boxcar filter is fine, but it’s hardly the panacea you imply and it has some important drawbacks. Frankly I’m not a fan. For one thing, you lose half a period on each end of the time series. For another thing, it does more than just remove the periodic fluctuation, it also smooths the data. If you’re looking for sudden changes in the trend (apart from the cycle) then a boxcar filter rather spoils things.

    When there’s more than one way to skin a cat, there’s usually a reason. I advise extreme skepticism about claims that some particular one is “by far the best.”]

  8. For any convolution filter, not just the boxcar, the missing half-period problem is typically dealt with by suitably extending the time series. The Met Office Hadley Centre for example takes this approach in conjunction with their 21-point binomial filter, namely by extending each end with 10 copies of the last value at that end. They point out that this is not the only possible extension, but while suboptimal numerically in general, their choice of extension at least has the benefits of simplicity and transparency, the latter being particularly important in disputes.

    (As a side remark that may be of interest, the 21-point binomial wavelet they use is indistinguishable in practice from a Gaussian wavelet, whence its frequency response is also Gaussian, but has the advantages of finite support, trivial arithmetic when done right–each value in the result can be computed with nothing but ten additions and ten divisions by two—and is ideal for spreadsheets.)
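A sketch of both points made in this comment: building the 21-point binomial weights by repeated pairwise averaging, and the flat extension of each end by 10 copies. This mirrors the description above, not the Met Office's actual code:

```python
import numpy as np

# 21-point binomial weights: convolve [1/2, 1/2] with itself 20 times,
# giving w[k] = C(20, k) / 2**20; the weights sum to 1 and are nearly
# Gaussian in shape.
w = np.array([1.0])
for _ in range(20):
    w = np.convolve(w, [0.5, 0.5])

# Hadley-style end treatment: extend each end with 10 copies of the
# last value, then filter, so the output has the same length as the input.
y = np.linspace(0.0, 5.0, 60)                        # stand-in series
padded = np.concatenate([np.full(10, y[0]), y, np.full(10, y[-1])])
smoothed = np.convolve(padded, w, mode="valid")      # length 60 again
```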

    Regarding your notion of “sudden change in trend,” why is this not an oxymoron? Trends aren’t ordinarily thought of as including any of the high frequencies in a signal. If the cycle were to change suddenly in period, amplitude, and/or shape for say two periods, the definition of “sudden change in trend” implied by your implementation of the notion would classify any sudden change in the cycle as instead being a sudden change in the trend.

    The boxcar filter attenuates signal components at half the filter’s fundamental frequency by 1 − 2/π ≈ 36%, and at one-sixth its fundamental frequency by only 1 − 3/π ≈ 4.5%. Hence any changes in the trend taking significantly longer than a period will be perfectly visible, even if not quite to scale at timescales close to the period.
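Those two attenuation figures follow from the boxcar's sinc-shaped frequency response and are easy to verify (a sketch; `boxcar_gain` is an illustrative name):

```python
import math

# Gain of a width-p boxcar at frequency f, with x = f expressed as a
# fraction of the filter's fundamental frequency 1/p: sin(pi x)/(pi x).
def boxcar_gain(x):
    return math.sin(math.pi * x) / (math.pi * x)

half = 1 - boxcar_gain(1 / 2)    # attenuation at half the fundamental
sixth = 1 - boxcar_gain(1 / 6)   # attenuation at one-sixth the fundamental
```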

    For changes taking significantly less than a period, what I’m having difficulty seeing is how to distinguish a sudden change in the trend from a sudden change in the cycle. Your cycle-averaging method does so by the simple expedient of denying the possibility of the latter. In effect you’ve made the cycle’s parameters the trend.

    [Response: Extending the time series is a very poor solution — in fact, not a “solution” at all — to the problem of losing half a cycle at each end. That’s one of the things I dislike about a boxcar filter.

    One can easily construct a signal which has a “sudden change in the trend” which has nothing to do with any change in the cycle, and it will show response at as high a frequency as your sampling will allow.

    No, cycle-averaging doesn’t deny the possibility of a change in the cycle. It does allow one to detect cases in which the trend evolves suddenly but the cycle doesn’t, if the cycle is regular enough to enable separation of the two. Of course to do so it really helps not to erase all the high-frequency behavior with, say, a boxcar filter.

    As I said before, a boxcar filter is fine. But I personally prefer other methods, and I think your blanket statement that it’s “by far the best way” is just plain wrong.]

  9. One can easily construct a signal which has a “sudden change in the trend” which has nothing to do with any change in the cycle, and it will show response at as high a frequency as your sampling will allow.

    Agreed, and I don’t use Hadley’s extension method myself for precisely that reason: I’ve seen it give dreadful results. Making p/2 copies of the last element magnifies its impact to a ridiculous degree.

    Your approach of estimating the typical cycle over the long term, say 2-4 cycles, provides a much better extension method, provided it is used in combination with also estimating the long term behavior of the trend, thereby sharing the intrinsic Heisenberg uncertainty between the two. The ideal extension in a Mauna-Loa-like situation blends your smoothed cycle with the smoothed trend. Basically it amounts to a best estimate of what the next half-period of the Mauna Loa data is likely to look like, which obviously won’t be the flat-line projection proposed by the Hadley Centre!

    Of course to do so it really helps not to erase all the high-frequency behavior with, say, a boxcar filter.

    Touché. Convolution filtering with any wavelet has the dual problem to your approach, denying the possibility of a sudden change in the trend.

    Neither of us however seems to have a reliable way of deciding whether a change shorter than the period of a cycle should be ascribed to the cycle or the trend. Nor will we ever have one, by Heisenberg uncertainty, the scientific basis for agreeing to disagree.