Open Mind

Cycles

April 15, 2008 · 22 Comments

One of my specialties is period analysis, identifying and quantifying periodic and pseudo-periodic fluctuations in time series. The essence of periodic fluctuations is that they repeat. It’s not just that the signal forever goes up and down, possibly in random fashion; there’s a pattern to the ups and downs, and that pattern repeats in cycles — not necessarily perfectly (in the case of pseudoperiodicity), not even necessarily forever, but repeats often enough and regularly enough that we can, at least in the short term, have some idea of what the next cycle will look like, given that we know the most recent cycles.

This may end up being one of my more controversial posts, because I’m going to expound on one of my “pet peeves” in science: the claim of periodic or pseudoperiodic behavior when none exists. It might just be the most common mistake made by working scientists, and it makes its way into the peer-reviewed literature all the time. It just goes to show that peer review is a necessary, but hardly a sufficient, condition for correct results.


One of the most fertile grounds for the misidentification of periods is the study of the sun. A reader recently linked to a number of publications, some peer-reviewed, some not, which claim a period of about 179 years in solar observations. This is supposed to be related to the sun’s movement around the solar system’s center of mass (its barycentric motion), which leads to the theory that this motion modulates solar activity. There may indeed be a 179-year (or thereabouts) period in the sun’s barycentric motion. The problem is that I haven’t found any evidence yet that such a period exists in solar activity.

The original reference given for a 179 yr period in solar activity seems to be a study of sunspot numbers by Jose (1965, Astronomical J., 70, 193). Jose in fact identifies a period of 178.7 yr in the sun’s barycentric motion, then goes looking for the same period in sunspot data. To establish it, he offsets sunspot data by 16 cycles, which is about 179 years. Then he notes that earlier and later cycle timings can be “paired,” giving this graph (the original isn’t very clear):

There are a few things to note about this. First, he plots alternate cycles of sunspot numbers as positive and negative numbers to emphasize the change in magnetic polarity from one cycle to the next. Second, he uses a different system for numbering the cycles than the modern convention.

On this basis Jose claims a 178.7-yr period in sunspot data; he simply states, “the times of sunspot maxima can be paired over two full periods if the sunspot curve has the same 178.7-yr period.” This quite ignores the fact that the times of sunspot maxima 16 cycles apart can be paired whether there’s a 178.7-yr period in sunspot data or not. All he has really shown is that 178.7 yr is close to a multiple (16 times) of the average length of the sunspot cycle. But it turns out that the timings of sunspot maxima paired off in this way don’t really match up very well. Unfortunately Jose doesn’t apply any statistical test to measure the quality of the match.
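The pairing argument can be checked with a small simulation. The sketch below is illustrative only: the cycle lengths are synthetic, drawn independently from a normal distribution whose mean (178.7/16 yr) and scatter (1.5 yr) are assumed values, not measurements. No 178.7-yr period is built in, yet maxima 16 cycles apart still sit about 178.7 yr apart on average.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cycle statistics (assumed for illustration, not fitted to data).
MEAN_LENGTH = 178.7 / 16   # about 11.17 yr, so 16 cycles average 178.7 yr
SD_LENGTH = 1.5            # assumed cycle-to-cycle scatter, in years

# Independent random cycle lengths: no long period is present by construction.
lengths = rng.normal(MEAN_LENGTH, SD_LENGTH, size=2000)
maxima = np.cumsum(lengths)            # times of successive simulated maxima

# Separation between each maximum and the one 16 cycles later.
sep16 = maxima[16:] - maxima[:-16]

print(f"mean separation of maxima 16 cycles apart: {sep16.mean():.1f} yr")
print(f"scatter of that separation (std):          {sep16.std():.1f} yr")
```

The mean separation lands near 178.7 yr simply because 16 times the average cycle length is about 178.7 yr, while the scatter of several years shows how loosely the individual pairs line up.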

For a brief time span (about 35 years) Jose has actual sunspot data 179 years apart. The match for those few cycles looks pretty good on his graph. But since that publication, sunspot numbers have been recovered further back in time, and we’ve observed several more cycles. Here’s what we get if we compare sunspot numbers past and present, offset by the purported 179-yr cycle, using available sunspot data from 1700 to 2006:

The match is quite poor. Cycles earlier and later than those available to Jose are well out of phase with their 16-cycle offsets; in fact by the time we get to the most recent cycle, it’s just about 180 degrees out of phase with its 16-cycles-earlier predecessor. And not just the phases, but the amplitudes of the cycles 16 cycles apart don’t match up very well either. Jose’s original claim was based on assuming what he wanted to show in the first place, he applied no statistical test to confirm the result of his method, and subsequent data flatly contradicts his conclusion.

I went looking for other papers supporting the existence of a roughly 179-yr period in sunspot data. The most recent I found is Rogers et al. 2005, Long-term Variability in the Length of the Solar Cycle, submitted to the Astronomical Journal. Apparently it hasn’t yet been accepted for publication; all I found is a preprint.

This paper is an object lesson that faulty statistical analysis is all too common in the work of professional scientists. They apply two period analysis methods to sunspot data, Fourier analysis and “Phase Dispersion Minimization” (PDM). PDM seeks to identify periods by “folding” the data on a trial period (by computing phases for a given trial period), then testing whether or not the data points of the same phase show the same value. Take for example a perfect sinusoid with period 40 years:

If we now assign a phase to each point (“fold” the data) based on the correct period, then points at the same place in the cycle (same phase) will align perfectly. When we plot the data as a function of phase, the cycles will align perfectly; there’ll be no “dispersion” in phase. But if we fold the data using a slightly incorrect period, say 41 years, then each cycle will be slightly offset and we’ll note phase dispersion:

If we fold the data with a very incorrect period, say 45 years, then each successive cycle will be strongly asynchronized, and the phase plot will show what looks like a nearly random scatter with lots of dispersion:

The essence of PDM is to test a large number of trial periods and compute the phase dispersion (usually called θ) for each. If the data are perfectly periodic and we have the exactly correct period, the phase dispersion will be zero. If the trial period is nowhere near correct, the phase dispersion will be near one.
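A minimal sketch of the folding idea, assuming a simple binned estimator: θ is taken as the pooled within-bin variance of the folded data divided by the total variance. The function name `pdm_theta`, the 20 phase bins, and the 400 yearly samples are choices made for this illustration, not details from any of the papers discussed. Because the bins have finite width, θ at the correct period comes out close to zero rather than exactly zero.

```python
import numpy as np

def pdm_theta(t, x, period, n_bins=20):
    """Pooled within-bin variance of the folded data over the total variance."""
    phase = (t / period) % 1.0                       # fold: phase in [0, 1)
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    within, dof = 0.0, 0
    for b in range(n_bins):
        xb = x[bins == b]
        if xb.size > 1:                              # need 2+ points for a variance
            within += np.sum((xb - xb.mean()) ** 2)
            dof += xb.size - 1
    return (within / dof) / np.var(x, ddof=1)

# Perfect sinusoid with a 40-year period, sampled once a year for 400 years.
t = np.arange(400.0)
x = np.sin(2 * np.pi * t / 40.0)

# Correct period, slightly wrong period, badly wrong period.
thetas = {p: pdm_theta(t, x, p) for p in (40.0, 41.0, 45.0)}
for p, theta in thetas.items():
    print(f"trial period {p:4.0f} yr: theta = {theta:.3f}")
```

The three trial periods mirror the three folding cases described above: θ should be smallest at 40 yr, noticeably larger at 41 yr, and close to one at 45 yr.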

Here’s the result of Rogers et al. applying Fourier analysis and PDM to sunspot data, using daily, monthly, and annual values for the sunspot cycle:

The 11-year Schwabe cycle is evident in all the analyses. But Rogers et al. also note a response in the PDM test (but not the Fourier analysis) at a period near 22 yr. They attribute this to the 22-year Hale cycle, claiming to have detected it in sunspot data. They suggest that one of the differences between PDM and Fourier analysis for identifying periods is that PDM is better for detecting periodic fluctuations that have shapes which are distinctly different from a sinusoid.

The problem is, their claim to have found a 22-year cycle really only indicates improper understanding of the properties of PDM. Let’s take our artificial perfect-sinusoid data and fold it, not using the correct 40-year period, but using a period of 80 years:

There’s no phase dispersion, despite the fact that there’s no 80-year periodicity in the data at all! That’s because if data are perfectly periodic, then not only will each cycle match perfectly with the cycle one period later, it will also match perfectly with the cycle two periods later. And three periods later, and four, and so on. These are the subharmonics of the real period. It’s a property of PDM that it tends to detect not only the true period, but its subharmonics as well, whether or not they’re present in the data.

For data that aren’t perfectly periodic, each higher subharmonic will give weaker response, so we’re likely only to detect one or possibly a few subharmonics. But any that are detected are likely to be due to this peculiarity of PDM, not to genuine periodicity at two or three or more times the true period. The response of the PDM test to sunspot data at period 22 yr isn’t actually a sign of a 22-year period; it’s just PDM detecting the first subharmonic of the real period. Apparently Rogers et al. weren’t aware of this.
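The subharmonic behavior is easy to reproduce numerically. This sketch again assumes a simple binned θ (pooled within-bin variance over total variance); the function name, the 40 phase bins, and the 1200-year synthetic record are illustrative choices. The data contain only a 40-year sinusoid, yet folding at 80 or 120 years also gives θ near zero, while a non-multiple such as 60 years does not.

```python
import numpy as np

def pdm_theta(t, x, period, n_bins=40):
    """Pooled within-bin variance of the folded data over the total variance."""
    phase = (t / period) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    within, dof = 0.0, 0
    for b in range(n_bins):
        xb = x[bins == b]
        if xb.size > 1:
            within += np.sum((xb - xb.mean()) ** 2)
            dof += xb.size - 1
    return (within / dof) / np.var(x, ddof=1)

# Only a 40-year period is present in these synthetic data.
t = np.arange(1200.0)
x = np.sin(2 * np.pi * t / 40.0)

theta = {p: pdm_theta(t, x, p) for p in (40.0, 60.0, 80.0, 120.0)}
for p, value in theta.items():
    label = "multiple of 40" if p % 40 == 0 else "not a multiple"
    print(f"trial period {p:5.0f} yr ({label}): theta = {value:.3f}")
```

A low θ at 80 or 120 years here says nothing about an 80- or 120-year cycle; it only reflects that whole numbers of 40-year cycles stack on top of each other when folded.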

Let’s take a look at their detection of a period near 180 yr. For this they take the time difference between sunspot maxima, and the difference between sunspot minima as well, to study estimates of cycle lengths. They claim to find period change, not in raw sunspot counts, but in the lengths of the sunspot cycle. If correct, the “period” they claim is around 180 years, or about 16 solar-cycle periods.

But they don’t directly analyze solar cycle length data; instead they use two less direct methods. One is to reduce the number of solar cycle length estimates by finding the median value over very wide spans of time. This is problematic because it reduces the number of data points, and there are already precious few; we only have cycle length measurements for 35 solar cycles. Nonetheless, they take the median value in time “bins” and fit a sinusoid to those reduced data. To get an idea how the width of the bin might affect the analysis, they try various values for the width of the bins, ranging from 40 years to 90 years.

They use the period of a best-fit sinusoid to these “median trace values” as an estimate of the underlying period in solar-cycle length variations. Unfortunately, they don’t apply any statistical test to determine whether the sinusoids fit any better than random chance. This is because, after reducing the number of data by taking median values in bins, they’re left with too few data points to do such analysis. Here’s their graph; the red x’s are the estimated cycle lengths, the black dots are the median values of the bins, and the blue curves the best-fit sinusoids. The sinusoids are fit to the median values, not the estimated cycle lengths; note that by the time the bin width gets to be 90 yr, they’re left with only 4 data points to fit their sinusoid!

Note that the “period” estimated this way ranges from 157 yr to 393 yr. That’s quite a degree of uncertainty, especially for a set of sinusoidal fits which have no statistical justification! Nonetheless, they argue for the correctness of the 50-year bins or 60-year bins, because in those cases the “period” determined using solar-cycle length from minimum to minimum of the solar cycle, and that determined using length from maximum to maximum of each solar cycle, are in agreement. This narrows the “period” down to about the range 180 to 250 years. Again, there’s no statistical justification for their sinusoidal fits, nor can there be, as using these bin widths reduces the number of data points to a mere 7 (for 50-year bins) or 6 (for 60-year bins).

The other method they use is to do a period analysis (Fourier, in fact) on what are called “O-C” values, for “observed minus computed.” It’s a very common technique in astronomy to look for changes in the period of a cyclic phenomenon, and they’re looking for changes in the solar cycle length. One replaces the times of maximum, or of minimum, with the differences between what was observed (O) and what it would have been if the cycle length were perfectly constant (C), to get O-C. Then they subject these values to period analysis, and estimate a period of 188 years (about 17 solar cycles).

The O-C method is very good for testing whether or not the cycle length is deviating from perfect constancy. In this case, that’s hardly necessary; we already know that solar cycle lengths are not perfectly constant, not even close! They show quite a bit of cycle-to-cycle variations, some cycles have been recorded which are as brief as 7 years, or as long as 15 years. Such scatter in cycle lengths touches on a grave statistical danger using O-C when the cycle lengths are not perfectly constant, or at least varying smoothly. In such circumstances, going from cycle lengths to O-C values introduces very strong autocorrelation into the O-C values, which profoundly affects analysis of the O-C values. In particular, this can lead to the false conclusion of non-random changes in the cycle length, when in fact it’s the random nature of the cycle length changes that causes the strong autocorrelation in O-C values in the first place.
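A small simulation illustrates how this autocorrelation arises. Everything here is synthetic and assumed for illustration: 35 cycle lengths per trial drawn independently from a normal distribution (mean 11 yr, scatter 1.2 yr), with the "computed" times taken from a least-squares constant-period fit. The cycle lengths themselves have essentially no lag-1 autocorrelation, but the O-C values built from them are strongly autocorrelated, because each O-C value accumulates all the earlier random deviations.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    d = x - x.mean()
    return np.sum(d[:-1] * d[1:]) / np.sum(d * d)

rng = np.random.default_rng(42)
N_CYCLES, N_TRIALS = 35, 2000          # assumed sizes for this illustration

r_lengths, r_oc = [], []
for _ in range(N_TRIALS):
    # Independent random cycle lengths: no memory, no periodic change.
    lengths = rng.normal(11.0, 1.2, N_CYCLES)
    observed = np.cumsum(lengths)                      # O: observed epochs
    n = np.arange(N_CYCLES)
    slope, intercept = np.polyfit(n, observed, 1)      # best constant period
    computed = intercept + slope * n                   # C: computed epochs
    r_lengths.append(lag1_autocorr(lengths))
    r_oc.append(lag1_autocorr(observed - computed))

mean_r_lengths = float(np.mean(r_lengths))
mean_r_oc = float(np.mean(r_oc))
print(f"mean lag-1 autocorrelation of cycle lengths: {mean_r_lengths:+.2f}")
print(f"mean lag-1 autocorrelation of O-C values:    {mean_r_oc:+.2f}")
```

Smooth, slowly wandering O-C curves of this kind readily produce low-frequency peaks in a period search, even though the underlying cycle lengths are purely random.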

This was, alas, a very common mistake in astronomy until the statistical pitfall was driven home in the 1990s and statistically valid techniques were devised for dealing with O-C values under such circumstances (Koen & Lombard 1993, Mon. Not. Roy. Astron. Soc., 263, 287; Koen & Lombard 1993, Mon. Not. Roy. Astron. Soc., 263, 309; Foster 1993, J. Am. Assoc. Var. Star Obs., 22, 145). Very unfortunately, the mistake is still all too commonplace; the inappropriate analysis of O-C values is an ongoing problem for astronomers, and the work of Rogers et al. is yet another example. The period they claim is really nothing more than the manifestation of autocorrelated noise induced by taking O-C values.

The right way is to apply period analysis directly to the lengths of sunspot cycles, not to O-C values or to median values from wide bins. I’ll use cycle number rather than time as my “time variable”; it really makes no significant difference for the analysis. If there’s a period near 179 years, then we should note a response near a period of 16 cycles, or at a frequency of 0.0625 per solar cycle. Here’s the result:

There’s not even a hint of any significant response near frequency 0.0625 per solar cycle. In fact there’s no statistically significant response anywhere in this Fourier spectrum.
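For readers who want to try this kind of test, here is a hedged sketch of one way to do it; it is not the analysis behind the spectrum above. It computes a normalized periodogram of cycle length against cycle number and judges the tallest peak by shuffling the cycle order, a check loosely in the spirit of the random-permutation approach in the abstract quoted further down. The 35 cycle lengths are synthetic placeholders (assumed mean 11 yr, scatter 1.2 yr), and the function names and frequency grid are choices made for this example.

```python
import numpy as np

def periodogram(x, freqs):
    """Normalized power at each frequency (cycles per sample).
    For white noise the expected power is about 1 at every frequency."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    n = np.arange(d.size)
    basis = np.exp(-2j * np.pi * np.outer(freqs, n))
    return np.abs(basis @ d) ** 2 / (d.size * d.var())

def shuffle_pvalue(x, freqs, n_shuffles=500, seed=0):
    """Fraction of random re-orderings whose tallest peak matches or beats
    the tallest peak of the data as ordered."""
    rng = np.random.default_rng(seed)
    observed = periodogram(x, freqs).max()
    null_max = np.array([periodogram(rng.permutation(x), freqs).max()
                         for _ in range(n_shuffles)])
    return float(np.mean(null_max >= observed))

# Placeholder data: synthetic cycle lengths, NOT real sunspot cycle lengths.
rng = np.random.default_rng(1)
lengths = rng.normal(11.0, 1.2, 35)

freqs = np.linspace(0.01, 0.5, 491)             # cycles per solar cycle
power = periodogram(lengths, freqs)
i16 = int(np.argmin(np.abs(freqs - 1 / 16)))    # 16 cycles <-> 0.0625
p_value = shuffle_pvalue(lengths, freqs)

print(f"power near 0.0625 per cycle: {power[i16]:.2f}")
print(f"tallest peak: {power.max():.2f} at frequency {freqs[power.argmax()]:.3f}")
print(f"shuffle p-value of tallest peak: {p_value:.2f}")
```

With real cycle lengths substituted for the placeholder array, a peak near 0.0625 per cycle would only be worth attention if the shuffle p-value were small.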

Theories supposing a relationship between the sun’s barycentric motion and solar activity, particularly those which emphasize the 179 year period in barycentric motion, offer only unquantified speculation about a possible physical mechanism between the two phenomena. Therefore they depend critically on observing the same period in both barycentric motion and solar activity. I haven’t investigated the barycentric motion, but I accept that the given period is present. However, I have found no reliable evidence, either in my own analysis or that of others, to show even a hint of such a cycle in solar activity. All claims I’ve seen of such a causal relationship are founded on either the kind of faulty period analysis which is, alas, a plague on modern science, or on the most outlandish speculation with no evidence at all. The former type, faulty analysis, I’ve resigned myself to as an unavoidable pitfall of the exploration of new ideas; there are so many analyses by so many researchers that there are bound to be mistakes. The latter type, completely unfounded claims, I classify as the work of crackpots.

There may be other data used to seek this period in solar activity, but I haven’t yet found any hint of that either. Clearly the foundation of such claims is analysis of sunspot data; perhaps the most sensible analysis I’ve seen of the search for cycles in sunspot data is from Lenhorst (1982, Solar Physics, 80, 379):


We investigate the possible existence of periodicities other than that of 11 years in the mean annual sunspot number. The power spectrum for the sunspot numbers for the years 1711–1966 is compared with power spectra for the same period in which the solar cycles have been permuted randomly. Peaks in the power spectra at periodicities other than 11 years are found to appear and disappear in a random way. We conclude that the 11 year periodicity is the only statistically significant one present in the sunspot data.

Having looked for periods in sunspot data myself, I quite agree.

Categories: Global Warming · climate change

22 responses so far ↓

  • climatewonk // April 15, 2008 at 1:53 pm

    It amazes me the lengths denialists will go to find the “anything but CO2” “some yet-unknown solar effect” explanation for global warming. Unfortunately, the bunk they put out there for public consumption is rarely challenged in the same forums and so the faulty reasoning persists and the public remain confused, thinking there is serious disagreement among scientists about the cause of global warming. Too bad this kind of analysis only reaches those who know better and a few of the denialists and self-styled skeptics who come here. I can only hope the open-minded among them learn something.

    [Response: In this case, it's not just denialists pulling theories out of their ***es, it's also very common mistakes leading to incorrect identification of periods. And that's not unique to solar-climate connections, it happens in areas with no political or social implications.]

  • climatewonk // April 15, 2008 at 1:55 pm

    I should add a big “thank you” for the post.

  • BBP // April 15, 2008 at 2:10 pm

    From my days as an astrophysics gradual student (gradually finding out I didn’t want to be a student), I think one of the dangers for astronomers is that in so many cases they know (for good physical reasons) that there is a period in sparse data, so they are taught how to pull out the period that has to be there. Unfortunately, this mind set can then carry over to cases where there may not be a period.

    [Response: I used to think it was a problem in astronomy, but having investigated other sciences I've discovered that it's a ubiquitous problem throughout science.

    And you're quite right, that believing a period is present *before* one has been established is a "danger point" in analysis. I might call it the "wishful thinking fallacy" -- but I'll bet there's already a name for it, probably a better one.]

  • steven mosher // April 15, 2008 at 2:36 pm

    Nicely done.

    You said “barycentric”

    This could get ugly.

  • Aaron Lewis // April 15, 2008 at 3:53 pm

    I think this post provides absolute proof that I need to revise my “rant” from: “We need to improve science education!” to “We need to improve science and math education.”

  • John Mashey // April 15, 2008 at 4:10 pm

    Great post. Thanks.
    I’ve added to my short list of things to point at.

  • Bob G // April 15, 2008 at 4:17 pm

    I think you may have come to a different conclusion had you evaluated period variations using two- and three-dimensional return maps. You should also have explored applying a smoothing spline regression for a given period, then finding the periods which minimize generalized cross-validation. Just my two cents.

  • Christopher // April 15, 2008 at 4:25 pm

    I enjoyed that post a lot, many thanks, and also had a question: My understanding is that Fourier is not particularly suitable to non-stationary time series. The sunspot data, however, would seem to be non-stationary. Is there some rule of thumb involved in when traditional Fourier analysis still works? I’m aware of some of the enhancements to Fourier to get around this issue but my understanding was always that wavelet analysis was the better bet in such a case. I was just curious as to your thoughts.

    [Response: Your question actually goes to the heart of the issue of exactly what is meant by "periodic." For strictly periodic phenomena, Fourier is about as good as it gets, although other methods (PDM, ANOVA) do have advantages when the cycle shape is strongly non-sinusoidal. But most physical phenomena are *not* strictly periodic; the sunspot cycle is pseudoperiodic, each cycle has a slightly different period, amplitude, and cycle shape.

    When the variations in the parameters of the cycle (period, amplitude, phase, shape) are not too great and the time span is not *too* long, Fourier (and other strict period) analysis is still excellent. It has even been adapted to quantifying pseudoperiodic phenomena. It's also possible to study, not the "periodogram" (basically, a raw analysis for the presence of periods), but the "power spectrum density" (PSD), which can give important clues about what time scales are present even for random processes; if a phenomenon is pseudoperiodic then the range of possible periods will exhibit higher PSD.

    However, for characterizing pseudoperiodic behavior when the "pseudo" is strong, I'm a huge fan of wavelet analysis. Its strength is that it directly estimates the changes over time of the parameters of the fluctuation; its weakness is that it reduces statistical power by focusing on more brief time spans. This is part of the eternal trade-off between time resolution and frequency resolution (in fact the degree of trade-off is a "tuneable parameter" in wavelet analysis).

    For seeking "long" periods (like the 180-yr period in 300 years of sunspot data), straight Fourier is a clear choice; if such a period exists then there aren't enough cycles to make good use of time-frequency analyses like wavelets.]

  • Jim Arndt // April 15, 2008 at 4:37 pm

    Here is Landscheidt’s explanation of the 179-year cycle. It is interesting reading. It is, however, considered fringe thinking on this matter. Still, it gives some insight into this cycle.
    http://bourabai.narod.ru/landscheidt/extrema.htm

    [Response: This was already linked to in a previous thread. This paper is notable for making a lot of claims about statistical significance without giving sufficient information to confirm or deny them. Landscheidt is one of the people I had in mind when I used the word "crackpot."]

  • Ian // April 15, 2008 at 4:57 pm

    Thanks for the very clear post. As for the “wishful thinking fallacy” – in cognitive psychology, it’s called “expectancy confirmation.” Your prior knowledge and expectations guide your attention, and your interpretation of what you attend to, in ways that you take as unsolicited confirmation of your initial expectations. Often, instead of saying “I’ll believe it when I see it,” this is more like “I’ll see it when I believe it.”

  • dhogaza // April 15, 2008 at 5:20 pm

    My favorite line from the Wikipedia entry on Landscheidt …

    Landscheidt also received the Marc Edmund Jones Award which is considered one of the most prestigious awards in astrology

  • BBP // April 15, 2008 at 6:59 pm

    Another interesting case of periodicity in astronomy is for galaxy redshifts (see for example http://www.springerlink.com/content/qt7454133824p423/ ). To me it seems that for years the evidence has been a firm ‘maybe’. But if there really is a periodicity it would be interesting to nail down exactly why.

  • Andrew W // April 15, 2008 at 7:17 pm

    On a lighter note, I bet nobody can beat this as an entry in the “most bizarre claim of cycles”.

    http://www.nzcpr.com/guest92.htm

    “best-fitted spline curve represents longer term temperature trends”

  • Hank Roberts // April 15, 2008 at 7:57 pm

    I dunno, Andrew, I’ve always been fond of this one, though I can’t pretend to analyze the statistics. It just smells funny to me:
    http://ks.water.usgs.gov/Kansas/pubs/reports/paclim99.html

  • Andrew W // April 15, 2008 at 9:22 pm

    Wow! Stiff competition, but sorry Hank, admittedly your link is bizarre in that it looks flash and official but I’m prepared to go in and bat for Bob Carter on this one. If you look at Bob’s graph, it doesn’t cover even one complete cycle, he gives no reason for the claimed cycles to exist and what he shows doesn’t even come close to matching solar cycles or anything else that I know of that could be a driver.
    And of course (eyeballing it) what he claims as a best fit is a lousy fit.

    Besides all that, it can’t be denied that Bob has been trying so hard for so long, surely that counts for something?

  • steven mosher // April 16, 2008 at 12:45 am

    “It just goes to show that peer review is a necessary, but hardly a sufficient, condition for correct results.”

    Tammy, peer review is not a necessary condition for correct results. It’s a good practice to rule out junk. It is NOT a necessary condition for correct results. You can’t prove that. Don’t even try.

    2+2 =4 would be a correct result EVEN IF it were not peer reviewed. EVEN IF there were no humans to peer review it.

    Peer review is neither necessary nor sufficient to preserve the truth function.

    That said, I wouldn’t waste my time (too much) on stuff that hasn’t been peer reviewed.

    Peer review is pragmatic. It’s not necessary to preserve the truth function, not sufficient to preserve the truth function; it’s practical.

    In short. Peer review says : “read this shit”
    “Don’t read this other shit”

    And the guys telling you what to read and what not to read have a good track record.

    That’s it.

  • John Goetz // April 16, 2008 at 1:27 am

    I’ve lurked here, read your paper from the mid-1990s, and used your software (I have dozens of suggestions for ease-of-use improvements if you are open to them). All I can really say is this was a fine post and one I am sure I will refer to time and time again.

  • frankbi // April 20, 2008 at 4:48 am

    Hmm… while reading Zhen-Shan and Xian’s bogus “global cooling” paper, I was wondering, has anyone tried doing short-time Fourier transforms, with sliding windows, on climate data, to obtain some sort of spectrogram?

    – bi, International Journal of Inactivism

  • george // April 20, 2008 at 7:52 pm

    HB:

    solar cycle lengths are not perfectly constant, not even close! They show quite a bit of cycle-to-cycle variations, some cycles have been recorded which are as brief as 7 years, or as long as 15 years.

    That seems like quite a bit of variation to me.

    What role does “chaotic” behavior play in the sunspot “cycle”?

    Might such behavior at least partly determine the time of maximum sunspot number?

    I have read that some scientists believe the so called Maunder minimum might have been related to chaotic behavior (which, if true means it might be very difficult, if not impossible to predict when the “next Maunder Min” is coming)

    I don’t know whether it has or not, but if the sun has exhibited even “apparent” periodic behavior (over a longer time span than 11 years) from time to time that has not persisted, it is at least possible that chaos might be the explanation. Chaotic systems are known to exhibit period doubling, for example, but if the system is sensitive to small changes, the “periodic” behavior may not last.

    but I don’t claim to know about any of this. Perhaps someone who knows about the “chaos” angle might care to comment?

  • Leif Svalgaard // April 20, 2008 at 10:31 pm

    george: solar cycle length being ‘chaotic’. The usual definition of ‘chaos’ involves the notion that even the slightest change can send the system into a new state with quite different properties. This is not really the case with the solar cycle [we think]. There are theorists that believe that it takes 20-40 years to ‘build’ a new solar cycle deep within the sun. Over that time a lot of ‘averaging’ takes place and very small changes don’t have much influence against the accumulated effects of 20-40 years. Others [incl. myself] believe the solar ‘memory’ is much shorter [~5-10 years], but still long enough to dampen out small changes. So, the system is probably not chaotic, just random and irregular. The outer ~30% of the sun is convective and overturns constantly [like boiling water] and makes it hard to maintain regular, ordered processes.

  • Hank Roberts // April 24, 2008 at 4:29 am

    http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6VHB-4S1C8J3-1&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=2d11f40acb08e99efab43d89e822302a

    doi:10.1016/j.jastp.2008.02.003
    Short-term changes in global cloud cover and in cosmic radiation

    “… This paper reports an analysis of changes in global cloud cover and GCR recorded at 3 hourly intervals over 22 years. There is a significant correlation between short-term changes in low cloud cover over northern and southern hemispheres, consistent with about 3% of the variation arising from common factors. However, GCR is not a major factor responsible for cloud cover changes. There is an association between short-term changes in low cloud cover and galactic cosmic radiation over a period of several days. This could arise if approximately 3% of the variations in cloud cover resulted from GCR.”

  • Hank Roberts // April 25, 2008 at 4:22 am

    http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml
