Sampling Rate, part 2

“Suppose we have a signal which is band-limited, say it’s limited to the frequency band from 0 to 0.5 cycles per day,” says the engineering professor to the class in digital signal processing. “If we observe this signal at regular intervals with a sampling rate which is at least twice the bandwidth — in this case, at least once per day — then we can use Fourier analysis to reconstruct the signal. We can even interpolate it to fill in the gaps. This is one of the most common applications of Fourier analysis in the real world — we observe a signal, then use its Fourier transform either to reconstruct the signal or simply to identify its Fourier components (and therefore its physical nature).”

“I’m afraid not,” replies the grad student in theoretical mathematics.

“Young man, I’ve been doing this for over 20 years,” retorts the professor. “It has tremendous practical application — you’d be surprised how much modern technology depends on this. Technology that wouldn’t work if we couldn’t do this. And that stuff works.”

“Well,” says the math student, “I have here some data for a physical signal which is band-limited, confined to the frequency range 0 to 0.4 cycles per day. I’ve observed it eleven times at a sampling rate of once per day — more than twice the bandwidth — from times -5 to +5. Here it is:”


“As you can see, all of the sample values are equal to zero. Can you reconstruct this signal?”

“That’s easy,” says the professor. “The signal itself is zero.”

“Not so,” declares the math student. “It’s actually given by the formula:”

x(t) = -0.03464155 + 0.07376986 \cos(2\pi t/11) - 0.09165243 \cos(4\pi t/11) + 0.1572167 \cos(6\pi t/11) + 0.9400574 \cos(8\pi t/11) - 0.0474995 \cos(10\pi t/11) - \cos(\pi t/\sqrt{2}).

“He’s right,” interjects the grad student in computer science, who is a super-fast typist as well as a skilled programmer and just happened to have her computer up and running at the time. “I just calculated that and the result at all the times of observation is zero, even though the signal is nonzero and is clearly band-limited to less than half the sampling frequency. Of course, if you observe it some more — say 301 times — you’d get nonzero data values that look like this:”


“The only values that are exactly zero are those from time -5 to +5. But for those eleven times, it’s zero.”

“In the real world,” states the professor, “we would sample a signal like that much more than eleven times. If you compute the Fourier transform of the sample of 301 values, you’ll clearly see the signal frequencies that are present.”

“I just did that,” says the comp-sci student. “It looks like this:”


“All the signal frequencies are there, but it’s a lot easier to see them when you plot it on a logarithmic axis:”


“As I said,” adds the professor, “in the real world we can reconstruct this signal given enough data, we can recover the physical signal frequencies and their amplitudes and phases, and that’s why the technology works.”

“I’m sure the technology works,” affirms the math student. “That’s because given enough data you’re able to reconstruct this particular signal with sufficient precision. But you can’t reconstruct the signal exactly — not even assuming it’s band-limited — unless you have an infinite amount of data. And in the real world we can’t get an infinite amount of data. In addition, I can construct a signal which is band-limited, is sampled 301 times at a sampling rate more than twice the band limit, but has the value zero at all those times.”

“Let’s be realistic,” replies the professor. In the real world, the chance of observing such a signal at exactly the times for which all its values are zero is vanishingly small. And I do mean vanishingly small. In the real world we can reconstruct signals by observing them for sufficient time at a sampling rate at least twice the bandwidth. We do it all the time, and that’s how our technology works. And from a theoretical perspective, an infinite data series will reconstruct a band-limited signal exactly.”

“But,” says the math student, “even with infinite data you still have to know a priori that the signal is band-limited, and you have to know what band it’s limited to. After all, the Fourier transform extended beyond the Nyquist frequency looks like this:”


“Is the signal band-limited from 0 to 0.5 cycles per day, or from 2.5 to 3 cycles per day? Or is it band-limited from 0 to 3 cycles per day? Unless you know the band limit ahead of time you can’t reconstruct the signal. Besides, in time series analysis we almost never encounter signals which are exactly band-limited and even when we do we don’t know what the band limits are.”

“But in digital signal processing,” says the professor, “we do encounter band-limited signals and we do know what the band limits are. So we can reconstruct the signal, we can even do so with finite data to sufficient precision.”

“And in case you hadn’t noticed, this is not a class in time series analysis, it’s digital signal processing.”

… … …

“When you observe at regular time intervals, you can’t reliably identify high-frequency oscillations unless your sampling rate is at least twice as high as the maximum frequency which represents the signal,” says the professor to the class in time series analysis in astronomy.. “That’s because the window function for regular time sampling (in this case, once per day) looks like this:”


“Since it repeats over and over again, the Fourier spectrum itself gets repeated over and over again. A real signal frequency — say, at frequency 3.073812 cycles per day — will be replicated exactly at intervals equal to the sampling frequency. Those replicates are called “aliases” of the real signal frequency. Without some other information we can’t tell which alias is the real signal frequency.”

“But with uneven time sampling, you can identify and quantify such high-frequency signals in spite of an average sampling rate that’s much too low. That’s because the window function might look like this, for a set of unevenly spaced observation times of the variable star OGLE 13398 (from the Optical Gravitational Lensing Experiment):”


“A real signal frequency still gets replicated, but each replicate is weaker than the physical component. We can see this in the discrete Fourier transform of the actual data (here’s the amplitude spectrum rather than the power spectrum):”


“This tells us that the main signal component is at frequency 3.073812 cycles per day. But all by itself it doesn’t really tell us enough about the signal. The impressive part is that because each alias is weaker than the real signal component, we are able to identify which is real and we can then apply other methods which we’ll learn about soon to remove its aliases from the spectrum. When we carry this procedure to its logical conclusion, we’re left with this:”


“This does tell us about the real physical signal. It oscillates with a primary frequency 3.073812 cycles per day (period 0.32533 days) but with harmonics of the fundamental frequency so the waveform is not perfectly sinusoidal. More interesting, the two nearby frequencies which flank the main signal frequency modulate the main oscillation, with the modulation itself being periodic. The modulation period is 13.17 days. This is common behavior for variable stars of RR Lyrae type which happen to exhibit the Blazhko phenomenon, that their oscillations are modulated and the modulation is periodic with the Blazhko period.”

“But,” interposes the grad student in electrical engineering, “you still need to have your mean sampling rate be at least twice the bandwidth. And of course you need to know what the band limits are.”

“Not so,” the professor responded. For these data the mean sampling rate is only 0.0955 samples per day, much smaller than the bandwith which happens to be over 6 cycles per day. In fact the maximum sampling rate is 0.9908 samples per day, again far smaller than the bandwidth. And we’re able to eliminate the aliases so that even if the signal is band-limited we wouldn’t need to know ahead of time where the band is.”

“I don’t believe it,” the EE student stated flatly. “This is just a contrived example. This method might work sometimes but it will also fail, maybe even most of the time. You just can’t do that.”

“But I’ve just done it,” replied the professor. “And this isn’t a contrived example. It’s real data for a real variable star in the bulge of our own galaxy. And it’s far from the only example, there are thousands of such cases and in every one we’re able to recover the real signal because of the uneven time sampling, within the limits imposed by the size of the noise.”

“I’ll bet I could construct an artificial signal which would be nonzero, and even band-limited, but would give all zeros on this particular set of observation times,” quoth the EE student.

“I’m sure you could,” said the professor. “So could I. But in the real world, the chance of that happening is vanishingly small. And I do mean vanishingly. In the real world, we really can quantify signals with frequencies, and bandwidths, that are vastly larger than the maximum sampling rate let alone the mean sampling rate. It’s not exceptional or contrived, it’s routine. We’ll be doing it a lot in this class.”

“Yeah, well, I still think you shouldn’t look to time series analysis to prepare yourself for digital signal processing in the real world,” replied the EE student.

“Perhaps so,” said the professor. “But I’m certain of this, that you shouldn’t look to digital signal processing to prepare yourself for time series analysis.”


32 responses to “Sampling Rate, part 2

  1. -Look! A flock of black swans.
    -Sorry. You just missed them.
    -(suspiciously) What? Very fast for swans.
    -(wearily) You have no idea.

  2. I think the difficult part for some people is this:

    “I’m sure you could,” said the professor. “So could I. But in the real world, the chance of that happening is vanishingly small. And I do mean vanishingly.

    The odds that your irregular / quasi-random sampling will exactly line up with the roots of a periodic function are, indeed, vanishingly small. But why is this the case?

    ‘Cause the sampling is quasi-random. I can’t explain it any better than that, short of drawing a lot of graphs and equations. (The actual probability will depend on the shape of the underlying function and the irregularity and frequency of your sampling).

    While this point seems trivially obvious to me (a mathematician), I’m sure some people will get hung up on it, and they’ll probably still assert that the sampling is insufficient, no matter what you do.

    PS – Another way of explaining this: for any finite amount of sampling, you can always find a function that isn’t captured by the data. I can prove this mathematically.
    However, as the sampling increases, the amount of functions you can capture will also approach infinity – so, we’ll capture a larger and larger % of all realistic functions, even if we never capture *all* of them. Even if we capture 99.999999% of them, you could still devise a function that fails – it’s just extremely unlikely to actually represent a real function in the real world.

    In other words, since you can always data mine to find a “hole” in our data, the fact that you found one doesn’t really mean much. It’s just data mining.
    Likewise, it highlights the importance of providing physical mechanisms that could produce that function. If there’s no plausible physical mechanism, you’re probably just mathturbating.

    • Perhaps an easy way to put some numbers to it is to think about the simplest non-zero wave pattern you could create, just a single sinusoid; there is an infinite number of frequencies that could all give the same values (say, zero) for the times you observe, if you’re sampling regularly. These will all be multiples of the lowest frequency that can fit, which will have a wavelength twice the distance between any two observation points.

      This multiplicity is due to the fact that you only have one number you’re trying to factorize into the frequency – half the distance between the samples. There are MANY numbers that are divisible by, say, 3.

      Now sample irregularly, perhaps the distance between your samples is 13, 18.71, 14.5, 10.98, and 18.14. These share no common factors large enough to be of relevance.

      The lowest frequency that can give the same value at each of these is going to be proportional to the product of these numbers (I think, half of the product?). The number of numbers that are a multiple of the least common multiple of these numbers?

      Vanishingly small. And thus outrageously high frequencies can be resolved.

      • A correction I think? And if someone could please help verify that would be great –

        Sample semi-regularly, perhaps switching between 1 seconds and 3 seconds. You can resolve frequencies lower than 1/2 Hertz, but 1/2 and multiples of that will display aliasing.

        Sample at 1 and 4, and the same thing happens; so what I said before was incorrect.

        I think what you need to look for is what the highest common factor of your sampling spacings is. For each of the above examples it is 1; and the wavelength that starts the aliasing is 1/2 Hertz. The equation to relate the sampling spacings to the frequency that starts to alias is, I think,

        F = 1 / (2*fact)

        where “fact” is the largest common factor of the spacing between the sample points. If you sample continuously every 3 seconds (1/3 Hertz), the aliasing starts at 1/6 Hertz (f = 1/(2*3) = 1/6); if you sample continuously at 1 second intervals, 1/2 Hertz displays aliasing (f = 1/(2*1) = 1).

        Now let’s sample every 8 and 9 seconds, switching; the highest shared factor is 1, so the frequency that starts to alias is again 0.5 Hertz; but, you are sampling at a obviously lower rate, a mean 0.118 Hertz. So you can resolve frequencies a bit over 4 times the frequency at which you’re observing.

        If you go back to the spacing in my previous comment, like I said they share no large factors at all; but that’s the important part, not their least common multiple. At least 0.01 factors into all of those, so you can perhaps get frequencies that are smaller than 1/(2*0.01) = 50Hz. And from a sampling rate that has a mean of about 1/15 Hertz (somewhere around that).

        [Response: For random sampling there won’t be any common factors or multiples at all. You’d have to have all the time intervals in rational-number ratios to each other, which for a random sample has probability zero. If any single interval is irrational (like the square root of 2) then exact aliasing cannot occur.]

    • Susan Anderson

      It seems trivially obvious to me too, and I don’t have much math training. However, the difference between none and not much, or professional training consumed without curiosity and with an agenda, is huge. It would help if people could regard that kind of learning as being as important as watching Dancing with the Stars or learning Gangman Style; things like that. The cleverness with which hatred of learning is promulgated to the unwary is scary and revelatory. If people could understand that this stuff is the most important knowledge to be had here and now, they’d do the necessary.

    • Susan Anderson

      The technical discussions are no doubt enlightening to a very small group of people, but my point is people like me don’t need to understand that. It might help to understand the level of expertise that makes it possible for you all to do that. What they do need to know is that random sampling is random, and the stuff they’re hoping might be there would have to be manufactured to specification to escape the net of time, and stop assuming there’s a conspiracy of undetectable magnitude and cleverness to deceive them around every corner.

    • “However, as the sampling increases, the amount of functions you can capture will also approach infinity – so, we’ll capture a larger and larger % of all realistic functions, even if we never capture *all* of them. Even if we capture 99.999999% of them, you could still devise a function that fails – it’s just extremely unlikely to actually represent a real function in the real world.”

      I’m not sure how the number of functions you can reconstruct varies as a function of the number of samples captured, but the space of all possible functions is C, while the number of samples you can possibly capture will only ever approach aleph-0, so my gut tells me that the percentage of functions you can reconstruct will always be exactly 0% no matter how many samples you take. (Guts are poor substitutes for brains, however. :)

      But you did qualify it to say “realistic functions”, so the point is well-made — in the real world, processes will tend to be periodic, and therefore randomly varying observation times will give you more bang-for-the-buck; intuitively, it seems less likely that a real-world physical process will just happen to have all the right frequencies to be missed every time by random sampling times than it would by a fixed interval.

      BTW, excellent post, Tamino.

  3. Jeff- ha ha. Tamino- super DUPER interesting, will have to dig in. I am glad you posted this- I know it relates to the previous comment thread, and I am glad you took the time to extend the discussion, though I know you were probably a little peeved before.

  4. Nice post. I especially liked:
    ‘“Perhaps so,” said the professor. “But I’m certain of this, that you shouldn’t look to digital signal processing to prepare yourself for time series analysis.”’

    See Bell Labs single-chip DSP, an effort led by Dan Stanzione, my Director for a few years, and later BTL President. Our lab used a lot of DSPs.

  5. so, ah, this proves that climate variations are most likely being driven by hostile aliens, since only malignant intelligent design could be responsible for the mysteriously hidden-in-the-gaps spikes that escape detection by our best analysis, because we know They Must Be There and caused by Anything But CO2 ….?

  6. Unless, as is often the case, I’m misunderstanding something the cos(2*pi*t/sqrt(2)) term isn’t band limited to 0.4/t.

    I think you meant cos(2*pi*t/(2*sqrt(2))).

    [Response: You’re quite right. It was a typo in the blog post, not the computer scripts. It’s fixed.]

  7. I read this a few times and didn’t quite get your point. I guess I still disagree with original concept of your post that

    1. Sampling requires a rate at 2x the highest frequency for uniform sampling.

    This is absolutely not true. Sampling only requires 2x the bandwidth of the signal defined as the distance between the lowest non-zero frequency and the highest. For a tone this bandwidth is 0. This is the mathematical definition behind the sampling theorem and easy to verify on Wikipedia.

    [Response: That’s just not so. You can only restrict to the bandwidth if you know where the band is. Otherwise the aliases (with even sampling) are perfect and you don’t know which one is real.]

    I stumbled onto this post, and responded mainly as a curiosity and have no idea what the background or reason for the post was. I am not arguing that non-uniform sampling is invalid just that it too must follow the nyquist criteria and in the example you are showing which is effectively a tone it does.

    [Response: You’re mistaken again. Even if we ignore the harmonic, the bandwidth for the main oscillation is 0.3 cycles/day while the mean sampling rate is less than 0.1 observations/day. But the method works (not just in this real-world case but in thousands) despite your insistence that it can’t.]

    • I’ve never once said your method will not work. In fact, I absolutely agree that it will work with a sinusoid as the input as it’s bandwidth is effectively 0. If you were to put a random bandlimited signal rather than a sinusoid with a maximum frequency of .3 cycles per second it would definitely not work.

      I more than agree that there are real world examples of sampling far less than the highest frequency value of the signal, and even designed a lot of IF sampling systems using a technique exploiting this. This again is because the nyquist rate is 2x the bandwidth not 2x the highest frequency.

      Bandwidth is defined in the link below as “Bandwidth is the difference between the upper and lower frequencies in a continuous set of frequencies”

  8. My window into this is electronic music–particularly the digital sort, where Nyquist is a real consideration. (And by the way, aliasing can be highly, er, ‘auditorally dramatic’ when you inadvertently exceed the Nyquist frequency in coding some ‘cool’ computer piece! Of course, back then the thing was actually run on the University mainframe and you had to wait a business day or so to discover you’d blown it.)

    Anyway–just how strictly periodic does the signal have to be in relation to the sampling? It’s one thing to reconstruct 70 minutes of 60 Hz sine wave, quite another to reconstruct, say, Beethoven’s 9th.

  9. So are any of the protagonists in your entertaining dialogue likely to bring you before the inquisition?

  10. David B. Benson


  11. Philippe Chantreau

    Very interesting. The true depth of it escapes me but the elegance doesn’t. I guess that’s perhaps the most seductive, and treacherous, aspect of maths…

  12. Gavin's Pussycat

    Let me see if I understand this correctly. You have a function defined on the real (time) axis, with a frequency content bounded from above. Sampling it periodically over a finite interval, you cannot reconstruct it, even if the sampling rate meets the Nyquist criterion. Right?

    But what if the function is defined on the circle — i.e., we know it to be periodically repeating — and it is periodically sampled around the full circle, at a rate meeting the Nyquist criterion. Then the function is exactly reconstructible, right?

    [Response: In that case the Fourier transform becomes a Fourier *series*. Yes you can reconstruct it, so long as it is a finite Fourier series and you have enough data samples to determine all the parameters.]

    • Gavin's Pussycat

      To add to my comment, I computed the test function x(t) for [-5,5] at steps of 0.1 days. I found that over the whole interval it lies between -0.028 and 0.072. So for the actual sampling interval the statement by the good professor:

      “That’s easy … [T]he signal itself is zero”

      would appear to be practically correct ;-) and would presumably be exactly correct if periodic boundary conditions were imposed…

    • Gavin's Pussycat

      Ah yes, the octave script is here:

      t = [-5:0.1:5];
      x = -0.03464155+0.07376986*cos(2*pi*t/11)-0.09165243*cos(4*pi*t/11)+0.1572167*cos(6*pi*t/11)+0.9400574*cos(8*pi*t/11)-0.0474995*cos(10*pi*t/11)-cos(pi*t/sqrt(2));

      It appears to me that the variable-star technique works because, although the sampling density is well below Nyquist, the sampling period is long compared to the period of the process being sampled; if you ‘fold’ all the data on to one such process period, the resulting effective sampling density will be above Nyquist.

      It’s another application of the principle that, in order to determine the unknowns of a problem, you need at least as many independent conditions as there are unknowns. It’s the randomness that makes these samples independent.

      • if you ‘fold’ all the data on to one such process period, the resulting effective sampling density will be above Nyquist.

        This is very insightful observation. It clearly shows why it works.

      • Or, in my example, you aren’t going to be able to reconstruct Beethoven’s 9th, not unless you sample at least to the Nyquist frequency.

        But–and correct me if I’m wrong–even the (IIRC) very regular cycles of variable stars aren’t precisely the same. So there’s some ability, presumably, to capture or at least tolerate deviations from cycle to cycle using the irregular sampling method we’re discussing. (My earlier question was prompted by curiosity about this–that is, about the parameters governing analysis of signals with changing wave shapes over various time scales.)

      • Horatio Algeranon

        you aren’t going to be able to reconstruct Beethoven’s 9th, not unless you sample at least to the Nyquist frequency.’

        Would that be the “Ninthquist frequency”?

  13. Excellent. This is a Socratic dialogue is it not, unless my non-existent classic education escapes me? There should be more of it. Such clarity.

    • Yes–though we probably should credit Plato, really. He wrote the earliest examples, IIRC–but he put the ‘best parts’ in the mouth of Socrates, his mentor (hence the ‘Socratic’ tag.)

      These kinds of dialogs remained a popular and effective strategy for exposition of all kinds of theoretical (or polemical!) matters well into the 16th or 17th century, and were notably revived in Douglas Hofstadter’s “Goedel, Escher, Bach.”ödel,_Escher,_Bach

  14. A question: For all of this you need a t=0 marker, yes? Obviously if it jitters you broaden the peaks, but if it moves periodically you are going to get some misleading results.

  15. Now…. as one of those electrical engineers that have been well and truly humbled by this post, what books/materials do you suggest as a first course into non-uniform sampling?

    As we speak, I have several problems where such a technique might be useful and I have limitations on the sample rate I can use.

    [Response: I don’t know of much in the ways of books that cover this. But very soon (by the end of this month or soon after) I’ll finish my next book, “Introduction to Fourier Analysis and its application to Time Series” and I’ll be sure to post when it’s available. I guess it’s what I would recommend.

    I’m confident there are many situations in which the “time series guys” would be well and truly humbled by the EE guys. Maybe humility is a good thing in general …]

  16. graphicconception

    Here is an interesting link that also has pointers to some documentation:

    My background is also Electrical Engineering and once, a long time ago, I gave some lectures on the basics of time series analysis as it related to the processing of signals from rotating machines. I came across the idea of random sampling during my research.

    “How could it work?” I asked myself. I knew something about information theory so looked to see where the information could come from. The trick is in the word “random”. Assuming baseband transforms, the highest frequency is a function of the sample spacing. This is just what Nyquist said. The same applies for random sampling too, but if the samples really are random then some of them will be very, very close together.

    The reciprocal of that “closeness” will give you the high frequency that random sampling can resolve.

    Obviously, there will still be tradeoffs and I do not know what those are. Also, the properties of the spectrum will depend on just how “random” the sample spacings are.