Knut Inge

The DFT, computed directly, is a brute-force N^2 matrix-vector multiply.

FFTs do clever tricks, exploiting structure in the matrix (specializing the general matrix multiply) in order to reduce computational cost.

Let us first look at a small DFT:

W=fft(eye(4));

x = rand(4,1)+1j*rand(4,1);

X_ref = fft(x);

X = W*x;

assert(max(abs(X-X_ref)) < 1e-7)
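The same check can be sketched in NumPy for readers without MATLAB (`np.fft` uses the same e^(-j2πnk/N) sign convention, so the matrix comes out identical):

```python
import numpy as np

# Fill the DFT matrix by transforming the columns of the identity,
# mirroring W = fft(eye(4)) in MATLAB.
N = 4
W = np.fft.fft(np.eye(N), axis=0)

rng = np.random.default_rng(0)
x = rng.random(N) + 1j * rng.random(N)

X_ref = np.fft.fft(x)
X = W @ x  # brute-force N^2 matrix-vector multiply

assert np.max(np.abs(X - X_ref)) < 1e-7
```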

Great, so we are able to substitute MATLAB's call to the FFTW library with a small 4x4 complex matrix multiplication, simply by filling a matrix from the FFT function. So what does this matrix look like?

N = 4,

Wn = exp(-1j*2*pi/N),

f = (0:N-1)'*(0:N-1)

f =

     0     0     0     0
     0     1     2     3
     0     2     4     6
     0     3     6     9

W=Wn.^f

W =

1 1 1 1

1 -i -1 i

1 -1 1 -1

1 i -1 -i

Each element is either +1, -1, +1j or -1j. Obviously, this means that we can avoid full complex multiplications. Further, the first column is all ones, meaning that the first element of x is being multiplied over and over by the same trivial factor.
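To make that concrete, here is a NumPy sketch (the function name `dft4_no_mult` is my own) of a 4-point DFT that uses no general complex multiplies at all: multiplying by -1j is just a swap of real and imaginary parts with a sign flip.

```python
import numpy as np

def dft4_no_mult(x):
    """4-point DFT using only additions, subtractions and re/im swaps."""
    x0, x1, x2, x3 = x
    # First stage of butterflies.
    a, b = x0 + x2, x0 - x2
    c, d = x1 + x3, x1 - x3
    # -1j*d without a multiply: swap parts, negate the new imaginary part.
    jd = complex(d.imag, -d.real)
    # Second stage.
    return np.array([a + c, b + jd, a - c, b - jd])
```

The same trick is what FFT libraries exploit for every +/-1 and +/-j entry of the matrix.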

It turns out that a factorization into Kronecker tensor products, "twiddle factors" and a permutation matrix (where each index is mapped to the index whose binary representation is bit-flipped) is both compact and gives an alternate perspective on how FFTs are computed as a set of sparse matrix operations.
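MATLAB's bitrevorder performs that index flip; a minimal Python equivalent (a sketch; the name `bitrev_indices` is my own) is:

```python
def bitrev_indices(n):
    """Return 0..n-1 in bit-reversed order; n must be a power of two."""
    bits = n.bit_length() - 1  # number of bits per index
    return [int(format(i, "b").zfill(bits)[::-1], 2) for i in range(n)]
```

For example, `bitrev_indices(8)` gives `[0, 4, 2, 6, 1, 5, 3, 7]`: index 001 swaps with 100, 011 with 110, and so on.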

The lines below implement a simple Decimation-in-Frequency (DIF) radix-2 forward FFT. While the steps may seem cumbersome, the structure is convenient to reuse for forward/inverse FFTs, radix-4/split-radix, or decimation-in-time, and I believe it is a fair representation of how in-place FFTs tend to be implemented in the real world.

N = 4;

x = randn(N, 1) +1j*randn(N, 1);

T1 = exp(-1j*2*pi*([zeros(1, N/2), 0:(N/2-1)]).'/N),

M0 = kron(eye(2), fft(eye(2))),

M1 = kron(fft(eye(2)), eye(2)),

X = bitrevorder(x.' * M1 * diag(T1) * M0),

X_ref=fft(x)

assert(max(abs(X(:)-X_ref(:)))<1e-6)
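The same DIF recursion generalizes to any power-of-two N. A NumPy sketch (recursive rather than the in-place matrix form above, but with the same butterflies and twiddle factors; the bit-reversal is absorbed by the recursion's interleaving of outputs):

```python
import numpy as np

def fft_dif_radix2(x):
    """Recursive radix-2 decimation-in-frequency FFT; len(x) must be a power of 2."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    half = N // 2
    top, bot = x[:half], x[half:]
    # DIF split: sums feed the even outputs, twiddled differences the odd ones.
    twiddle = np.exp(-2j * np.pi * np.arange(half) / N)
    even = fft_dif_radix2(top + bot)
    odd = fft_dif_radix2((top - bot) * twiddle)
    out = np.empty(N, dtype=complex)
    out[0::2], out[1::2] = even, odd
    return out
```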

Charles Van Loan's book, Computational Frameworks for the Fast Fourier Transform, covers this subject in depth.
