A post by Willis Eschenbach introduces something called the “periodicity transform” and extols its virtues, suggesting that it’s superior in many ways to Fourier analysis. The method is very interesting, and in some situations quite useful, but it’s not the glorious improvement Willis seems to think it is — nor does Fourier analysis suffer from the flaws Willis seems to think it does. It’s also a recent method, but like most such innovations the core idea is not entirely new — there are strongly related methods that have been around for quite some time.
The essence is this: given a time series of values which are evenly spaced in time, one can test a possible period by finding the periodic function with that period which most closely matches the data. One can test all possible such periods, and thereby identify which ones give “good enough” fits to correspond to real periodic behavior in the data. One catch: the periods you test must be multiples of the time spacing between data points. Hence if your data are monthly averages (spaced one month apart) the periods you test must all be multiples of one month.
Most of us are familiar with related calculations, because in climate science one so often deals with anomaly values. Calculating anomaly is one way to remove an annual cycle from data. Temperature at some particular location, for instance, usually shows a strong annual cycle due to the seasons; winter is colder and summer is hotter. We can remove this by computing anomaly as the difference between a given month’s temperature, and the long-term average for that same month. The resulting “anomaly” values make changes in the long-term behavior much easier to see; by clearing the (usually very large) seasonal variation out of the way, the long-term trend is made clearer.
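As a concrete illustration, here’s a minimal Python sketch of the anomaly calculation, using synthetic monthly data (the actual station record isn’t included here):

```python
import numpy as np

# Hypothetical example: 120 months (10 years) of synthetic "temperature"
# data with a strong seasonal cycle plus noise.
rng = np.random.default_rng(0)
months = np.arange(120)
temps = 10.0 - 12.0 * np.cos(2 * np.pi * months / 12) + rng.normal(0, 2, 120)

# Long-term average for each calendar month (the "climatology").
climatology = np.array([temps[months % 12 == m].mean() for m in range(12)])

# Anomaly: each month's value minus that month's long-term average.
anomalies = temps - climatology[months % 12]

# The seasonal cycle is gone; the per-month means of the anomalies are ~0.
print(np.round([anomalies[months % 12 == m].mean() for m in range(12)], 10))
```

With the seasonal swings removed, any long-term trend in the anomalies stands out far more clearly than in the raw values.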
The focus of anomaly calculation is to remove the seasonal pattern (the average for each month separately). The focus of “periodicity analysis” is the seasonal pattern itself, the monthly averages. Computing such averages shows that if we group together all the data values which are 12 months apart (all the Januarys, all the Februarys, etc.) they show similar behavior, while different months really do have different averages. This confirms that there is indeed a seasonal pattern to local temperature. This is based on the idea of periodic behavior, that there is a pattern which repeats every 12 months. If instead we computed averages based on assuming the pattern repeated every 13 months instead of 12, we would find that such “seasons” were out of sync with reality.
Here, for example, is temperature data for Portland, ME during the 1950s:
The seasonal pattern — a 1-year cycle — is plainly visible. If we plot the data by month, all the Januarys plotted at the same x-location, all the Februarys at another x-location, etc., we get what is called a folded plot:
Clearly all the Januarys have similar temperature, as do all the Februarys, etc., but different months can show dramatically different temperatures (compare January to July).
It’s called a “folded” plot because what we’ve effectively done is to “fold” the time axis around in a circle, in order to align all the Januarys with each other, all the Februarys with each other, etc. Doing so brings the temperature values into close alignment, and that’s what tells us that yes, there is a repeating cycle of 12 months (the period at which the time axis is folded). We can even compute averages for each month separately, which clearly reveals not only that there is a 1-year cycle, but what that cycle looks like:
Suppose we had folded the data with a 15-month period instead of 12, aligning all the data values that were 15 months apart. We’d get this:
There are now 15 steps in each cycle rather than 12, which is why I’ve labelled the x-axis “step” rather than “month.” Now we see that the temperatures of the steps don’t align very well. If we compute averages for each step, we no longer see a clear pattern:
We can even apply statistical tests to determine whether or not the pattern, given the period with which we’re “folding” the data, is meaningful or not. For a 15-month period, it isn’t. We conclude that there’s no evidence in this data set that temperature shows a 15-month period, but there is very clear (and very strong) evidence of a 12-month period, the well-known cycle of the seasons.
That’s the essence of the “periodicity transform.” We test a given period by seeing whether or not the data values tend to match when we fold the times by the given period. To make them align at all, the period has to be a whole number of months (or whatever the basic time spacing is).
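In code, that test is just folding and averaging. Here’s a minimal Python sketch on synthetic monthly data, using the fraction of variance explained by the step means as the measure of cycle strength (one plausible choice; Eschenbach’s exact normalization may differ):

```python
import numpy as np

def periodicity_power(x, p):
    """Fold series x at integer period p (in sample steps), compute the
    mean of each of the p steps, and return the fraction of the total
    variance explained by those step means: near 1 for a true period,
    near 0 for a period absent from the data."""
    x = np.asarray(x, dtype=float)
    steps = np.arange(len(x)) % p
    step_means = np.array([x[steps == s].mean() for s in range(p)])
    fitted = step_means[steps]          # the periodic fit to the data
    return fitted.var() / x.var()

# Synthetic monthly data with a 12-month cycle plus noise.
rng = np.random.default_rng(1)
t = np.arange(600)
x = np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.3, 600)

print(periodicity_power(x, 12))   # close to 1: strong 12-month cycle
print(periodicity_power(x, 15))   # close to 0: no 15-month cycle
```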
The actual “periodicity transform” is a way to decompose the data into periodic components. Such a decomposition is not unique, but different choices can be made which make it useful. But when searching for periodic behavior, we don’t have to carry out this decomposition, we just have to determine how well a given test period reveals cyclic behavior in the data. That’s the essence of the “periodicity transform” as applied by Willis Eschenbach in his WUWT post.
Unfortunately, Willis doesn’t really understand the subtleties of the method, or of Fourier analysis, so he has mistaken notions about the advantages and disadvantages of each. For instance, the first advantage he lists for the periodicity transform over the Fourier transform is this:
Advantage: Improved resolution at all temporal scales. Fourier analysis only gives the cycle strength at specific intervals. And these intervals are different across the scale. For example, I have 3,174 months of sunspot data. A Fourier analysis of that data gives sine waves with periods of 9.1, 9.4, 9.8, 10.2, 10.6, 11.0, 11.5, and 12.0 years.
Periodicity analysis, on the other hand, has the same resolution at all time scales. For example, in Figure 2, the resolution is hourly. We can investigate a 25-hour cycle as easily and as accurately as the 24-hour cycle shown. (Of course, the 25-hour cycle is basically a straight line …)
Willis really doesn’t understand what he’s talking about.
If we use Fourier analysis to decompose a signal, then we usually restrict ourselves to a limited set of frequencies. We can call these the “Fourier frequencies,” and they’re integer multiples of 1/T where T is the total time span. The decomposition is therefore restricted to periods which are an integer divisor of the total time span, which gives the list of periods Willis enumerates.
But if we’re testing for periodic behavior, we’re allowed to use any frequency (any period) we want. In fact it’s commonplace to test far more frequencies than just the Fourier frequencies, a procedure referred to as “oversampling” the periodogram. So yes, Fourier analysis can detect and quantify periods which are not an integer divisor of the total time span. The frequency which gives the strongest response is our best estimate of the signal frequency (and therefore period), whether it’s one of the restricted set of “Fourier frequencies” or not.
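To make the point concrete: the classical periodogram can be evaluated at any frequency at all, not just the Fourier frequencies. A minimal Python sketch, with a synthetic signal whose period is deliberately not an integer divisor of the time span:

```python
import numpy as np

def periodogram(x, freqs):
    """Classical (Schuster) periodogram |sum x_t e^{-2*pi*i*f*t}|^2 / n,
    evaluated at arbitrary frequencies (in cycles per sample)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    t = np.arange(len(x))
    return np.abs(np.exp(-2j * np.pi * np.outer(freqs, t)) @ x) ** 2 / len(x)

# Signal with period 10.7 samples -- NOT an integer divisor of the span.
n = 500
t = np.arange(n)
x = np.sin(2 * np.pi * t / 10.7)

# A finely oversampled frequency grid recovers the true period anyway.
freqs = np.linspace(0.05, 0.15, 2001)
best = freqs[np.argmax(periodogram(x, freqs))]
print(1 / best)   # close to 10.7
```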
In fact it’s “periodicity analysis” which has this limitation, not Fourier analysis. For periodicity analysis the period you test must be a multiple of the time spacing (for monthly data, an integer number of months). If you fold the data with some period which doesn’t meet this criterion, then the time values don’t line up and you can’t compute averages by “step” — so you can’t even compute the periodicity transform as defined by Eschenbach.
Perhaps most important, when data are a combination of a periodic signal and noise, the noise makes our period estimate uncertain. And neither Fourier analysis nor periodicity analysis can reduce that fundamental uncertainty.
Willis tries to demonstrate his point using sunspot data:
So to begin the discussion, let me look at the Fourier Transform and the Periodicity Transform of the SIDC sunspot data. In the H&W2011 paper they show the following figure for the Fourier results:
In this, we’re seeing the problem of the lack of resolution in the Fourier Transform. The dataset is 50 years in length. So the only frequencies used by the Fourier analysis are 50/2 years, 50/3 years, 50/4 years, and so on. This only gives values at cycle lengths of around 12.5, 10, and 8.3 years. As a result, it’s missing what’s actually happening. The Fourier analysis doesn’t catch the fine detail revealed by the Periodicity analysis.
Figure 4 shows the full periodicity transform of the monthly SIDC sunspot data, showing the power contained in each cycle length from 3 to 88 years (a third of the dataset length)… Figure 5 shows a closeup of the cycle lengths from nine to thirteen years:
Note that in place of the single peak at around 11 years shown in the Fourier analysis, the periodicity analysis shows three clear peaks at 10 years, 11 years, and 11 years 10 months. Also, you can see the huge advantage in accuracy of the periodicity analysis over the Fourier analysis. It samples the actual cycles at a resolution of one month.
Eschenbach doesn’t even realize that the main reason for the different results between Fourier and periodicity analysis is that they’re done on different data sets. The Fourier analysis is of 50 years of data while the periodicity analysis is of over 250 years of data. He also doesn’t realize that “It samples the actual cycles at a resolution of one month” is a limitation of the periodicity transform, not of the Fourier transform. With Fourier analysis we can sample actual cycles at any desired resolution — although the data don’t really support resolution as high as Willis seems to believe.
Let’s compute the DFT (discrete Fourier transform) of the full sunspot data, and compare it to the periodicity analysis. For this comparison I’ll plot a scaled version of the amplitude of the DFT rather than the power, since that’s on the same “scale” as what Eschenbach calls the “power” from his periodicity analysis.
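For anyone who wants to reproduce this kind of comparison, here’s a sketch of the amplitude calculation in Python, with a synthetic 11-year sinusoid standing in for the SIDC record (which isn’t bundled here):

```python
import numpy as np

# One-sided DFT amplitude 2|X_k|/n, which for a pure sinusoid of
# amplitude A gives a peak of height ~A (same units as the data).
n = 3300                                   # months (integer # of cycles)
t = np.arange(n)
x = 60 + 50 * np.sin(2 * np.pi * t / 132)  # 132 months = 11 years

X = np.fft.rfft(x - x.mean())
amp = 2 * np.abs(X) / n                    # amplitude spectrum
freqs = np.fft.rfftfreq(n, d=1.0)          # cycles per month

peak_period_months = 1 / freqs[np.argmax(amp)]
print(peak_period_months / 12)             # 11 years
print(amp.max())                           # ~50, the sinusoid's amplitude
```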
If Willis had bothered to compute Fourier analysis of the same data to which he applied periodicity analysis, he’d have seen right away that the results are directly comparable. But if he truly understood what he was doing, he’d have known from the get-go not to compare the results of two different period analysis methods by applying them to two different data sets.
Suppose we compare Fourier and periodicity analysis of a purely sinusoidal signal. I’ll use a signal with period exactly 11 years, observed at the same times as the sunspot data. They look like this (periodicity analysis in black, Fourier analysis in red):
At the main period, the results are again directly comparable. But for higher periods, the periodicity analysis shows other peaks which Fourier analysis does not. In one sense this is correct, because if a signal is periodic with period 11 years, then it’s also periodic with period 22 years, 33 years, in fact every integer multiple of the “fundamental” period 11 years. But in another sense, the Fourier result is superior, because there’s no 22-year periodicity which isn’t just a copy of the 11-year periodicity. Fourier analysis has responded only to the fundamental periodicity, while periodicity analysis has also responded to all its multiples — which for practical purposes are actually aliases of the fundamental period.
Note also that as the period gets longer, the peaks in the periodicity analysis get wider. The same would be true of Fourier analysis if there were signal at those periods. This is because the fundamental period resolution of a period analysis (not just Fourier analysis) gets wider as the period gets longer. I emphasize that this is a fundamental property of signals in general, not a specific property (and certainly not a drawback) of Fourier analysis. It happens to “periodicity analysis” also, in fact it happens to every period analysis method. It’s the nature of the information content in the signal, not of the method used to detect periodic behavior.
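The arithmetic behind this is simple. Two frequencies closer together than about 1/T (where T is the total time span) can’t be resolved, and since period p = 1/f, a fixed frequency blur of 1/T translates into a period blur of roughly p²/T, growing as the square of the period:

```python
# Frequency resolution of ANY period analysis over a time span T is
# roughly df = 1/T, independent of frequency.  Translating that fixed
# df into period units: dp = p**2 / T, so peaks widen as period grows.
T = 3174.0              # months of data (the span Willis cites)

for years in (11, 22, 33, 100):
    p = 12 * years                       # period in months
    dp = p**2 / T                        # corresponding period blur
    print(f"period {years:>3} yr -> resolution ~{dp/12:.1f} yr")
```

At an 11-year period the blur is about half a year; at 100 years it’s tens of years, no matter which method you use.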
When we transform from the “time domain” to the “frequency domain” or alternatively the “period domain,” we find that the information content is evenly distributed in the frequency domain — not in the period domain. That’s why it makes much better sense to plot frequency rather than period as the x-axis variable in a period analysis. If we do so for the artificial sinusoidal signal, we get this:
Note that the “bumps” in the Fourier analysis are all the same width, which reflects the fact that they all account for the same amount of “information” in the frequency domain. The bumps in the periodicity analysis get narrower as frequency decreases, but that’s just because those bumps are false information. They don’t indicate any periodic behavior which is anything more than a copy of the fundamental period.
Note also that the periodicity analysis shows response at the true signal frequency f and all its integer divisors f/2, f/3, etc. Those frequencies are subharmonics of the signal frequency, and it’s a property of periodicity analysis that it shows subharmonic response which doesn’t represent unique information about periodic behavior. Advantage: Fourier.
That’s not to say that there are no situations in which periodicity analysis has an advantage over Fourier analysis. Fourier is best for detecting a sinusoidal signal. But for a distinctly non-sinusoidal signal other methods may have greater statistical power. A common case is that of the light curves of a type of variable star known as eclipsing binaries. In such systems, one star regularly eclipses the other, leading to a brief (but often extreme) dimming of the system. The shape of the light curve is decidedly not a sinusoid! This weakens the statistical power of Fourier analysis for detecting periodicity, because the signal power is spread over many harmonics in a Fourier series expansion.
A periodicity analysis, however, is not tied to a particular signal shape so it’s equally adept at detecting all cycle shapes. But it does suffer from the fact that the signal power is spread over the many steps in the cycle. Nonetheless, for very non-sinusoidal waveforms it often has better statistical power for detecting periodicity than Fourier analysis.
That’s why there’s a popular method for period search in astronomy which is closely related to the periodicity transform. The idea is to “fold” the time axis using some trial period, so that data values separated in time by an integer number of periods will align. The distance along the “cycle” — which is called the phase and ranges from 0 at the beginning of the cycle to 1 at the cycle’s end — is what we use to determine which data points “line up.” For an annual cycle in temperature data, for instance, the phase would be the month. It’s the same for all Januarys, for all Februarys, etc., but is different for data values from different months.
We can then measure how well the “folded” data (for variable stars, called a “folded light curve”) tend to match up. There should be very little dispersion at a particular phase value, even though there may be large dispersion between different phase values. Finding the period which gives the smallest dispersion at individual phase values is a period search method called phase dispersion minimization, or PDM.
We can even use PDM to test periods which are not an integer multiple of the time spacing. In fact we can use PDM to analyze data for which there is no regular time spacing. We simply divide the range of allowable phases (from 0 to 1) into intervals called “bins.” One common choice is to use 4 bins, so bin 1 covers phases from 0 to 0.25, bin 2 from 0.25 to 0.5, etc. We then compute the dispersion within each bin, and sum the within-bin dispersion for all bins to get our test statistic. The period which gives the minimum dispersion is our best estimate of the signal period.
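Here’s a minimal Python sketch of PDM with fixed bins (4 bins, as above) applied to irregularly spaced synthetic data; note that the trial periods need not be multiples of any time spacing:

```python
import numpy as np

def pdm_statistic(t, x, period, nbins=4):
    """Phase dispersion minimization statistic: fold times t (arbitrary,
    need not be evenly spaced) at a trial period, split the phases into
    nbins bins, and return the pooled within-bin variance divided by
    the overall variance.  Small values indicate a good period."""
    phase = (np.asarray(t, dtype=float) / period) % 1.0
    bins = np.minimum((phase * nbins).astype(int), nbins - 1)
    ss, dof = 0.0, 0
    for b in range(nbins):
        xb = x[bins == b]
        if len(xb) > 1:
            ss += ((xb - xb.mean()) ** 2).sum()
            dof += len(xb) - 1
    return (ss / dof) / np.var(x, ddof=1)

# Irregularly sampled signal with true period 7.3 (arbitrary time units).
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0, 500, 400))
x = np.sin(2 * np.pi * t / 7.3) + rng.normal(0, 0.2, 400)

trial_periods = np.arange(5.0, 10.0, 0.01)
stats = [pdm_statistic(t, x, p) for p in trial_periods]
best = trial_periods[np.argmin(stats)]
print(best)   # close to 7.3
```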
Periodicity analysis as applied by Eschenbach is in fact a form of PDM, in which the number of bins depends on the period being tested. There’s one bin for each “step” so the number of bins is equal to the period itself (in units of the fundamental time spacing). This requires that the folded time values (the phases) actually line up, so the trial period has to be a multiple of the time spacing. It also introduces a statistical complication due to the fact that each period we’re testing has a different number of phase bins. With classic PDM, it’s customary to use the same number of bins for each trial period, and the trial periods don’t have to be a multiple of the data’s time spacing. Advantage: PDM.
It was not that long ago that Alex Schwarzenberg-Czerny (whom I consider to be the best data analyst in astronomy today) identified “the” correct statistical treatment of PDM: treat it as a form of ANOVA (the “ANalysis Of VAriance”). He also identified efficient methods of computing it, with the idea of applying it as rapidly as possible to very large astronomical data sets — these days, satellite missions often return truly vast amounts of data, one might even say “astronomically” large.
Another strategy is to enhance Fourier analysis itself by allowing it to use multiple harmonics in its period search. Using multiple harmonics fits the data with a Fourier series, which increases the statistical power for non-sinusoidal cycle shapes. This methodology was brought to a high state of efficiency, again by Schwarzenberg-Czerny, who paid careful attention not only to the theoretical subtleties (I do too) but also to the computational burden on the computer (something which I admit is my weakness).
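The idea (though not Schwarzenberg-Czerny’s actual fast algorithm) can be sketched in a few lines of Python: fit a truncated Fourier series by least squares at a trial frequency, and measure the variance explained. For an eclipse-like waveform, adding harmonics greatly increases the signal captured:

```python
import numpy as np

def harmonic_fit_power(t, x, freq, nharm=3):
    """Least-squares fit of a truncated Fourier series with nharm
    harmonics of a trial frequency; returns the fraction of variance
    explained.  More harmonics means more statistical power for
    non-sinusoidal cycle shapes."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x)]
    for k in range(1, nharm + 1):
        cols.append(np.cos(2 * np.pi * k * freq * t))
        cols.append(np.sin(2 * np.pi * k * freq * t))
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    resid = x - A @ coef
    return 1 - resid.var() / x.var()

# A sharply non-sinusoidal signal: narrow eclipse-like dips, period 20.
t = np.arange(2000)
x = np.where((t % 20) < 2, -1.0, 0.0) \
    + np.random.default_rng(3).normal(0, 0.1, 2000)

f = 1 / 20
p1 = harmonic_fit_power(t, x, f, nharm=1)   # single sinusoid: modest
p5 = harmonic_fit_power(t, x, f, nharm=5)   # 5 harmonics: much better
print(p1, p5)
```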
So in fact there are many period analysis methods, although “straight” Fourier analysis remains one of the most popular and is certainly one of the best. The “periodicity analysis” as applied by Eschenbach is, in my opinion, an inferior version of straight PDM. In fact the real advantage of the “periodicity transform” probably lies in applying it to decompose a signal, rather than using it to detect periodic behavior.
Period analysis is a weak spot for scientists in general. For example, in the WUWT post the very first comment comes from Leif Svalgaard, in which he claims there’s a 100-year periodicity in sunspot data. When Eschenbach asks him to support that claim, Svalgaard first says “It is visible to the naked eye” then adds “or even FFT analysis,” pointing to this:
Unfortunately (for Leif) that “peak” at 100 years is a low-frequency response due to the fact that the noise in sunspot counts isn’t white noise. If we look at the DFT and test significance allowing for autocorrelated noise, we get this:
Svalgaard’s 100-year period is the peak on the left at frequency 0.01. It’s not statistically significant, not even remotely. Even if you remove the constantly varying solar cycle and analyze the residuals, it’s still not significant. The fact is that if there is a 100-year period in sunspot counts, these data do not show it. But Svalgaard’s misunderstanding of period analysis is not uncommon, even among working scientists.
But the misunderstanding in Willis Eschenbach’s entire post is uncommon; it’s unusual to exhibit such an air of confidence and expertise while revealing such a lack of understanding of period analysis. I’ve only expounded on one of his claims comparing periodicity analysis to Fourier analysis, but his mistakes are many, and in general his claims are way off the mark.