55

MIT has been making a bit of noise lately about a new algorithm, touted as a faster Fourier transform, that works on particular kinds of signals; for instance: "Faster Fourier transform named one of world’s most important emerging technologies". The MIT Technology Review magazine says:

With the new algorithm, called the sparse Fourier transform (SFT), streams of data can be processed 10 to 100 times faster than was possible with the FFT. The speedup can occur because the information we care about most has a great deal of structure: music is not random noise. These meaningful signals typically have only a fraction of the possible values that a signal could take; the technical term for this is that the information is "sparse." Because the SFT algorithm isn't intended to work with all possible streams of data, it can take certain shortcuts not otherwise available. In theory, an algorithm that can handle only sparse signals is much more limited than the FFT. But "sparsity is everywhere," points out coinventor Katabi, a professor of electrical engineering and computer science. "It's in nature; it's in video signals; it's in audio signals."

Could someone here provide a more technical explanation of what the algorithm actually is, and where it might be applicable?

EDIT: Some links:

lennon310
nibot

4 Answers

43

The idea of the algorithm is this: assume you have a length $N$ signal that is sparse in the frequency domain. This means that if you were to calculate its discrete Fourier transform, there would be a small number of outputs $k \ll N$ that are nonzero; the other $N-k$ are negligible. One way of getting at the $k$ outputs that you want is to use the FFT on the entire sequence, then select the $k$ nonzero values.

The sparse Fourier transform algorithm presented here is a technique for calculating those $k$ outputs with lower complexity than the FFT-based method. Essentially, because $N-k$ outputs are zero, you can save some effort by taking shortcuts inside the algorithm to not even generate those result values. While the FFT has a complexity of $O(N \log N)$, the sparse algorithm has a potentially-lower complexity of $O(k \log N)$ for the sparse-spectrum case.
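
To make the baseline concrete, here is a minimal NumPy sketch (my own illustration, not code from the paper) of the "compute the full FFT, then keep the $k$ largest outputs" approach that the sparse algorithm aims to beat:

```python
import numpy as np

# Build a length-N signal that is k-sparse in the frequency domain.
N, k = 4096, 4
rng = np.random.default_rng(0)
true_bins = rng.choice(N, size=k, replace=False)          # the k occupied frequency bins
spectrum = np.zeros(N, dtype=complex)
spectrum[true_bins] = rng.standard_normal(k) + 1j * rng.standard_normal(k)
x = np.fft.ifft(spectrum)                                  # time-domain signal

# Baseline approach: compute all N outputs with the FFT, then keep the k largest.
X = np.fft.fft(x)                                          # O(N log N) work
top_k = np.argsort(np.abs(X))[-k:]                         # indices of the k largest coefficients

print(sorted(top_k), sorted(true_bins))                    # the same k bins should be recovered
```

The sFFT's contribution is to locate and estimate those $k$ bins without ever computing the full length-$N$ transform, which is where the $O(k \log N)$ figure above comes from.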

For the more general case, where the spectrum is "kind of sparse" but there are more than $k$ nonzero values (e.g. for a number of tones embedded in noise), they present a variation of the algorithm that estimates the $k$ largest outputs, with a time complexity of $O(k \log N \log \frac{N}{k})$, which could also be less complex than the FFT.

According to one graph of their results (reproduced in the image below), the crossover point for improved performance with respect to FFTW (an optimized FFT library, made by some other guys at MIT) is around the point where only $\frac{1}{2^{11}}$-th to $\frac{1}{2^{10}}$-th of the output transform coefficients are nonzero. Also, in this presentation they indicate that the sparse algorithm provides better performance when $\frac{N}{k} \in [2000, 10^6]$.

[Figure: measured run time of the sparse Fourier transform versus FFTW as a function of the number of nonzero spectral coefficients]

These conditions do limit the applicability of the algorithm to cases where you know there are likely to be few significantly-large peaks in a signal's spectrum. One example that they cite on their Web site is that, on average, the 8-by-8 blocks of pixels often used in image and video compression are almost 90% sparse in the frequency domain and thus could benefit from an algorithm that exploited that property. However, that level of sparsity corresponds to $\frac{N}{k} \approx 10$, which doesn't seem to square with the regime quoted above for this particular algorithm, so it may just be an illustrative example.
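
As a rough illustration of the kind of sparsity they mean, here is a small sketch of my own (assuming SciPy is available; the smooth test block and the "negligible" threshold are arbitrary choices) that counts how many 2-D DCT coefficients of an 8-by-8 block are effectively zero, with the DCT standing in for the frequency-domain transform used in image and video codecs:

```python
import numpy as np
from scipy.fft import dctn

# A smooth synthetic 8x8 block (a gentle gradient), standing in for typical image content.
block = np.outer(np.linspace(100, 120, 8), np.linspace(1.0, 1.1, 8))

coeffs = dctn(block, norm='ortho')             # 2-D DCT of the block
threshold = 1.0                                # arbitrary cutoff for "negligible"
fraction_small = np.mean(np.abs(coeffs) < threshold)
print(f"{fraction_small:.0%} of the 64 coefficients are below the threshold")
```

Real image blocks will of course give different numbers; the point is only that most of the transform coefficients of a smooth block are tiny.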

I need to read through the literature a bit more to get a better feel for how practical such a technique is for use on real-world problems, but for certain classes of applications, it could be a fit.

Jason R
  • So it's basically a lossy FFT? Like an MP3 encoder? – endolith May 08 '12 at 14:21
  • @endolith: I'm not sure that I would put it that way. Maybe more analogous to a pruned FFT algorithm that only calculates a subset of the outputs. The claim is that if the input signal is $k$-sparse, then the $k$ outputs are calculated exactly. – Jason R May 08 '12 at 16:22
  • I wonder how it goes up against the Goertzel algorithm (or a family of them). It seems like the only difference there is that in Goertzel you know what you are looking for to begin with. – Spacey May 08 '12 at 16:25
  • @endolith: MP3 compression is lossy because the coefficients are quantized; not because only the top k coefficients are kept. Sparse FFT = "what is the k-coefficient representation minimizing the difference with the input signal". Coding of an MP3 frame = "what are the quantized coefficients and quantization levels that minimize the (perceptual) error given a budget of N bits for storing the coefficients & scale factors". – pichenettes May 09 '12 at 08:51
  • @pichenettes: Oh. I thought that quiet spectral components were completely thrown away by MP3. – endolith May 09 '12 at 13:41
  • When they are thrown away, this is a side-effect of quantization (the value is rounded to 0). – pichenettes May 09 '12 at 17:08
  • Does it determine which coefficients are significant and their count $k$ during the transform process, or do you need to tell it beforehand? – Benjohn Jan 26 '16 at 10:45
  • The abstract of Simple and Practical Algorithm for Sparse Fourier Transform states that the significant coefficients are discovered by the algorithm. The value of $k$ is a parameter to the algorithm. – Benjohn Jan 26 '16 at 10:57
6

I have looked through the paper and I think I have got the general idea of the method. The "secret sauce" of the method is how it obtains a sparse representation of the input signal in the frequency domain. Previous algorithms used a kind of brute force to locate the dominant sparse coefficients. This method instead uses a technique called "sparse recovery" or "compressed sensing" (see the Wikipedia article on compressed sensing). The exact sparse-recovery method used here looks similar to "hard thresholding", one of the dominant sparse recovery methods.
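
To give a concrete picture of what hard-thresholding-style sparse recovery looks like, here is a small iterative hard thresholding sketch of my own (a generic textbook version, not the specific estimator used in the sFFT paper; the problem sizes are arbitrary):

```python
import numpy as np

def iterative_hard_thresholding(A, y, k, iters=200):
    """Recover a k-sparse x from y = A @ x by gradient steps plus hard thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # conservative step size from the spectral norm
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + step * A.T @ (y - A @ x)        # gradient step on 0.5 * ||y - A x||^2
        small = np.argsort(np.abs(x))[:-k]      # every index except the k largest entries
        x[small] = 0.0                          # hard threshold: keep only k nonzeros
    return x

# Demo: a 5-sparse vector observed through a random Gaussian matrix.
rng = np.random.default_rng(1)
n, m, k = 256, 80, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true

x_hat = iterative_hard_thresholding(A, y, k)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))  # relative error; should be small
```

The connection to the sFFT is loose: the point is only to show the "keep the $k$ largest entries, zero the rest" operation referred to above.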

PS: The technique of sparse recovery/compressed sensing, and the $L_1$ minimization connected to it, is used a lot in modern signal processing, especially in connection with the Fourier transform. In fact, it is a must-know for modern signal processing. Previously, the Fourier transform was used as one of the tools for solving the sparse recovery problem; here we see the opposite: sparse recovery applied to the Fourier transform itself.

A good site for an overview of compressed sensing: nuit-blanche.blogspot.com/

PPS: In answer to the previous comment: if the input signal is not exactly sparse, it's lossy.

Feel free to correct me if I got the method wrong.

mirror2image
  • The FFT paper isn't compressed sensing. The idea behind compressed sensing is that you can recover sparse data from sparse random samples drawn from a different domain (e.g. recover sparse images from random sparse frequency data (i.e. MRI)). While this can reduce acquisition time, it increases computational cost. The FFT paper is different in that you have all your data in both domains and the goal is to make a computation happen quickly. – dranxo May 18 '12 at 07:43
  • You are wrong about compressed sensing. – mirror2image Jun 12 '12 at 10:30
  • Can you elaborate? – dranxo Jun 13 '12 at 08:39
  • Compressed sensing is a huge area with blurry edges, which includes/connects to not only recovery per se but also similar areas such as $L_p$ regularization, minimum complexity pursuits, etc. Originally it was a sparsity-constrained problem $Ax = y$, $x \in \mathbb{R}^m$, $y \in \mathbb{R}^n$, $m \gg n$, with the constraint $\|x\|_0 < k$, but later it became a lot broader. Start with reading the wiki. – mirror2image Jul 15 '12 at 05:56
  • No. Compressed sensing means you are solving $\min \|x\|_1$ subject to $Ax = y$. There are many far reaching applications, but if you don't invoke the Candes-Romberg-Tao theorem at some point you are confusing people if you label your work with "compressed sensing". Here is a reference: http://www-stat.stanford.edu/~candes/papers/spm-robustcs-v05.pdf – dranxo Jul 22 '12 at 05:16
  • "compressing sensing is minimization of L1 norm." Yes, see these seminal papers: http://www.math.ucla.edu/~tao/preprints/sparse.html – dranxo Jul 25 '12 at 06:40
  • Oh, maybe I see the confusion. I am not saying that any time someone solves an l1 minimization problem then they are working in "compressed sensing". I am saying that people who label their work with compressed sensing base their results on the Candes-Romberg-Tao theorem, which involves l1 minimization. Labeling your work with "compressed sensing" and then never solving an l1 minimization problem is not properly using the theory. If you have some examples of significant papers that do otherwise I would be interested to see some links. – dranxo Jul 25 '12 at 06:48
  • In any event, nowhere in that "sparse Fourier Transform" paper do they claim that they are using compressed sensing to achieve their speed up. – dranxo Jul 25 '12 at 06:53
6

I haven't read the paper on sFFT, but my feeling is that the idea behind speeding up the FFT is to exploit the prior of $k$-sparsity. Hence, one does not have to compute all of the FFT coefficients; instead, one computes only $k$ of them. That is why, for a $k$-sparse signal, the complexity is $O(k \log n)$ instead of $O(n \log n)$ for the conventional FFT.

Anyway, regarding @rcmpton's comment that "The idea behind compressed sensing is that you can recover sparse data from sparse random samples drawn from a different domain (e.g. recover sparse images from random sparse frequency data (i.e. MRI))": the question is, what are "sparse random samples"? I think they might be samples collected by randomly projecting the sparse data onto some lower-dimensional (measurement) subspace.

And as I understand it, the theoretical framework of compressive sensing mainly comprises three issues: sparsity, measurement, and recovery. Sparsity pertains to seeking sparse representations for certain classes of signals, which is the task of dictionary learning. Measurement pertains to seeking an efficient way (computationally efficient while still allowing recovery) to measure the data, or to project it onto a lower-dimensional measurement space, which is the task of measurement matrix design, such as random Gaussian matrices, structured random matrices, and so on. Recovery is the sparse regularized linear inversion problem (l0, l1, l1-l2, lp, group sparsity, ...), and the resulting algorithms are various: matching pursuit, soft thresholding, hard thresholding, basis pursuit, Bayesian methods, and so on.
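
As a rough illustration of the measurement-plus-recovery pipeline described above, here is a small soft-thresholding (ISTA) sketch of my own for the l1-regularized recovery step; the matrix sizes and the regularization weight are arbitrary choices:

```python
import numpy as np

def soft_threshold(v, t):
    """Soft thresholding: the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, iters=1000):
    """Minimize 0.5 * ||y - A x||^2 + lam * ||x||_1 by iterative soft thresholding."""
    L = np.linalg.norm(A, 2) ** 2               # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x + A.T @ (y - A @ x) / L, lam / L)
    return x

# Measurement: project a sparse vector through a random Gaussian matrix (m << n).
rng = np.random.default_rng(2)
n, m, k = 400, 120, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = 3.0 * rng.standard_normal(k)
y = A @ x_true

# Recovery: l1-regularized inversion of the underdetermined system.
x_hat = ista(A, y, lam=0.05)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))  # relative error; well below 1
```

Here soft thresholding plays the role of the recovery stage; swapping in a different regularizer or algorithm (matching pursuit, hard thresholding, etc.) changes only that last step.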

It is true that "CS is minimization of the L1 norm", and the L1 norm is a basic principle of CS, but CS is not only L1-norm minimization. Besides the above three parts, there are also some extensions, such as structured (group, or model-based) compressive sensing, where structured sparsity is also exploited and has been shown to largely improve the recovery ability.

In conclusion, CS is a big step in sampling theory, providing an efficient way to sample signals, provided that these signals are sparse enough. So CS is a sampling theory; anyone who uses it as a technique for classification or recognition is misapplying the principle. Occasionally I find papers titled "compressive sensing based ...", and I think the principle of such papers is really exploiting l1-minimization rather than CS, so it would be better to say "l1-minimization based ...".

If I am wrong, correct me please.

Pierre
0

I've looked into the Sparse Fourier Transform. I found that it essentially groups data points before running them through a shorter FFT than you would normally use. Unfortunately, if your data is signal + noise, this has the side effect of raising the noise floor.
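
If I have understood the "grouping" correctly, one building block is the identity that summing the time-domain signal in blocks of length $B$ and then taking a length-$B$ FFT evaluates the full spectrum at every $(N/B)$-th bin. A quick NumPy check of that identity (my own sketch, not code from the paper; $N$ and $B$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
N, B = 1024, 16                       # full length and short-FFT length
x = rng.standard_normal(N)

# "Group" the data: stack the N/B blocks of length B and sum them.
y = x.reshape(N // B, B).sum(axis=0)

# The length-B FFT of the summed signal equals every (N/B)-th bin of the full FFT.
short = np.fft.fft(y)                 # cheap: length-B transform
full = np.fft.fft(x)[::N // B]        # full transform, subsampled with stride N/B
print(np.allclose(short, full))       # True
```

As I understand it, the sFFT applies this kind of aliasing step to a filtered and permuted copy of the signal, so that each of the $B$ short-FFT bins approximately collects the energy of one bucket of frequencies; noise from all the frequencies that land in a bucket piles up there, which is consistent with the raised noise floor mentioned above.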

ffts_rule
  • Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. – ZaellixA Jan 08 '23 at 10:57