Revision bb11e64f-df20-4c9e-9ede-cd9971ee92a3 - Signal Processing Stack Exchange

The DFT does a brute force N^2 matrix multiply.

FFTs does clever tricks, exploiting properties of the matrix (degeneralizing the matrix multiply) in order to reduce computational cost.

Let us first look at a small DFT:

W=fft(eye(4));

x = rand(4,1)+1j*rand(4,1);

X_ref = fft(x);

X = W*x;

assert(max(abs(X-X_ref)) < 1e-7)

Great so we are able to substitute MATLABs call to the FFTW library by a small 4x4 ( complex) matrix multiplication. So what does this matrix look like?

W =

 1.0000 + 0.0000i 1.0000 + 0.0000i 1.0000 + 0.0000i 1.0000 + 0.0000i

 1.0000 + 0.0000i 0.0000 - 1.0000i -1.0000 + 0.0000i 0.0000 + 1.0000i

 1.0000 + 0.0000i -1.0000 + 0.0000i 1.0000 + 0.0000i -1.0000 + 0.0000i

 1.0000 + 0.0000i 0.0000 + 1.0000i -1.0000 + 0.0000i 0.0000 - 1.0000i

Each element is either +1, -1, +1j or -1j. Obviously, this means that we can avoid full complex multiplications. Further, the first column is identical, meaning that we are multiplying the first element of x over and over by the same factor.

It turns out that Kronecker tensor products, "twiddle factors" and a permutation matrix where the index is according to the binary represantation flipped is both compact and gives insight into how FFTs are computed in practice.

C F Van Loan has a great book on this subject.