
For simplicity and concreteness, let's assume you are taking the DFT of a signal which is a pure sinusoid, but at a frequency that does not line up with a frequency bin of your DFT. I'm wondering why tapering the signal with a window function reduces leakage. Now, I'm familiar and comfortable with the usual explanation (which I'll rapid-fire fudge through now; better explanations abound on this site and elsewhere): different window functions have different DTFTs with different main-lobe/side-lobe characteristics, and in the inevitable windowing that happens when you take a finite snapshot of the infinite version of your signal, the window's DTFT is convolved with the "original" DTFT of your signal (and then sampled, since we're calculating the finite-binned DFT rather than the idealized continuous-frequency DTFT), so lower side lobes mean less leakage to far-away frequencies during this convolution, etc. And this is indeed a complete explanation!

My question is whether there's another explanation from the point of view of the DFT as calculating the DFS of your signal. Roughly (and anthropomorphically) speaking, the DFT is figuring out how to recreate your signal by setting magnitudes and phases on a selection of basis functions and summing them together. From this perspective, why does tapering your signal to zero (via some non-rectangular window function) make it possible to recreate your signal with a tighter group of frequencies? That is, why is there significantly more magnitude in the bins closest to the true frequency, and significantly less in the distant bins, compared with the DFS representation of the boxcar-windowed version of your signal, which does not taper to zero?
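To make the question concrete, here is a minimal numpy sketch of the phenomenon (all parameter choices, including the Hann window and the `far_bin_energy_fraction` helper, are illustrative assumptions of mine): a sinusoid exactly halfway between bins 10 and 11, analysed with and without a taper, and the fraction of spectral energy that lands far from the true frequency in each case.

```python
import numpy as np

N = 64
n = np.arange(N)
f = 10.5                                  # cycles per window, halfway between bins 10 and 11
x = np.cos(2 * np.pi * f * n / N)

X_rect = np.fft.rfft(x)                   # boxcar: no taper
X_hann = np.fft.rfft(x * np.hanning(N))   # tapered to zero at the edges

def far_bin_energy_fraction(X, k_true, halfwidth=2.5):
    """Fraction of spectral energy in bins more than `halfwidth` bins from k_true."""
    p = np.abs(X) ** 2
    k = np.arange(len(p))
    return p[np.abs(k - k_true) > halfwidth].sum() / p.sum()

print("energy far from the tone, boxcar:", far_bin_energy_fraction(X_rect, f))
print("energy far from the tone, Hann:  ", far_bin_energy_fraction(X_hann, f))
```

The tapered case puts a much smaller fraction of its energy in the distant bins, which is exactly the behaviour I'm asking about.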

EDIT: Everyone's answers are great as far as they go, but they don't address what I'm getting at, so I'll try to clarify. Again, the DFT is calculating the DFS representation of the signal, and it's this perspective I'd like to stay in, as I already understand the phenomenon from the other perspectives. Assume your signal is a pure sinusoid. If it falls directly on a bin, your DFS representation is simple: you just use the basis function in that bin, set its magnitude/phase to match, and set all other magnitudes to zero. If your signal is just barely off bin, then you can't use just one basis function; the DFS representation uses mostly the basis function in the nearest bin, but it has to use little bits of all the other basis functions, with their phases set so that they add constructively/destructively in exactly the right way to line up with your signal when everything is summed. The further your signal frequency is from the nearest bin, the more of the other bins you have to use to make things work out, and the more spread out your DFS magnitude distribution is.

Let's start with a signal whose frequency is directly between bins; this is the case where you need the most magnitude in bins far away from the true frequency to make things work out in the DFS representation. There are two ways you can tweak your signal so that the DFS can represent it with less magnitude in bins away from the true frequency:

  1. Tweak the frequency so that it's closer to a bin. Why does this work? Well, my interpretation (which could very well be flawed) is that you can then rely on the basis function in that nearest bin to do most of the work of recreating your signal, and you only need small amounts of the other basis functions for the final corrections.
  2. Taper the signal with a window function. Why does this work? This is the question at hand: why can a signal which is a pure sinusoid, but tapered by a window function, be represented with a DFS magnitude distribution that's more tightly concentrated around the true, pre-windowed sinusoidal frequency?

I'm looking for an answer similar to the one I gave for #1: for reason X, you need less magnitude in those far-away (in frequency) basis functions to make things work out when your signal tapers to zero and you've smoothed out the discontinuity at the boundary of the periodic extension. Of course, it's always possible that the answer is just that there's no nice intuitive explanation from this perspective. But let's see!
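As a numerical companion to the two tweaks above (the window choice, the lengths, and the `far_bin_magnitude` helper are illustrative assumptions, not anything canonical), here is a sketch that starts from the worst case of 10.5 cycles per window and applies each tweak in turn:

```python
import numpy as np

N = 64
n = np.arange(N)

def far_bin_magnitude(x, k_true, halfwidth=2.5):
    """Sum of |DFT| magnitudes in bins more than `halfwidth` bins from k_true."""
    mag = np.abs(np.fft.rfft(x))
    k = np.arange(len(mag))
    return mag[np.abs(k - k_true) > halfwidth].sum()

between = np.cos(2 * np.pi * 10.5 * n / N)        # worst case: exactly between bins
nudged  = np.cos(2 * np.pi * 10.1 * n / N)        # tweak 1: frequency closer to bin 10
tapered = between * np.hanning(N)                 # tweak 2: taper the worst case to zero

for name, sig, k in [("between bins (10.5)", between, 10.5),
                     ("nudged to 10.1     ", nudged, 10.1),
                     ("10.5 with Hann taper", tapered, 10.5)]:
    print(name, "-> far-bin magnitude:", round(far_bin_magnitude(sig, k), 2))
```

Both tweaks reduce the magnitude needed in the distant bins, and the taper reduces it by far more; the question is why, from the DFS point of view.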

  • I’m assuming the explanation you are looking for is the edge discontinuities that are induced by the DFT assuming your signal is $N$-periodic? Commented Mar 31 at 2:55
  • There are so many things that you're getting right. I don't understand how that leads you to this question: "why does tapering your signal to zero (via some non-rectangular window function) mean that it is possible to recreate your signal with a tighter group of frequencies?" - - - - - It's sorta nonsensical. So I dunno how to answer it. What do you mean by "a tighter group of frequencies"? Commented Mar 31 at 3:51
  • @Baddioes Yep, and I'm fully comfortable and on board with the fact that the DFT assumes your signal is $N$-periodic, but I'm looking for exactly what about this periodic assumption leads to this specific behavior. I usually see an intuitive but hand-wavey argument that roughly goes "discontinuity=bad=spectral leakage", but I'm looking for a reasonably rigorous argument (but from the DFS representation perspective, rather than the other one I mention in my post) for why this discontinuity leads to the specific phenomenon of leakage, and why smoothing out the discontinuity reduces the leakage. Commented Apr 5 at 21:57
  • @robert The DFT calculates the DFS representation of your signal. If your signal is a pure sinusoid w/ freq between bins, the magnitude distribution in the DFS rep. will be more tightly concentrated around the bins closest to the true frequency when you taper with a non-rectangular window function, and more spread out when you don't. The DFS representation has to use all frequencies to recreate your signal, but it uses less (magnitude) of those far away frequencies to recreate your signal when you use a tapering window. It seems like there should be a nice intuitive explanation for this. Commented Apr 5 at 22:04
  • I guess the answer to your original question is that tapering the signal to zero prevents a discontinuity when the signal is periodically extended. That discontinuity has more high-frequency content, which, when modulated up to the sinusoid's frequency (call that a "carrier frequency"), produces components further away from that carrier frequency. Commented Apr 5 at 23:33

4 Answers


Keep in mind that it follows directly from the definition of the DFT that the signals are periodic in the DFT length $N$ in both domains.

The time-domain signal is well defined at all times, and it's simply a periodic repetition of the DFT buffer. Let's look at a cosine wave of frequency 1.5 (halfway between bins 1 and 2) and repeat it three times.

[Figure: the DFT buffer, a cosine at 1.5 cycles per buffer, repeated three times, with and without a tapering window]

We see that at the buffer boundary (t = 1000, t = 2000), the unwindowed signal has a very sharp discontinuity. The discontinuity is very short in time, which means it's spread out in frequency.

The window smooths out the discontinuity and distributes it over a much longer time span, i.e. it's more compact in frequency.
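Here is a small sketch of that picture (the Hann window is just one choice of taper, and the parameter names are mine): tile the buffer three times, as the DFT implicitly does, and measure the jump across the buffer boundary with and without the taper.

```python
import numpy as np

N = 1000
n = np.arange(N)
x = np.cos(2 * np.pi * 1.5 * n / N)            # 1.5 cycles per DFT buffer
w = np.hanning(N)                              # one possible tapering window

for name, buf in [("unwindowed", x), ("Hann-windowed", x * w)]:
    extended = np.tile(buf, 3)                 # the periodic repetition the DFT assumes
    jump = abs(extended[N] - extended[N - 1])  # step across the boundary at t = 1000
    print(name, "jump at the buffer boundary:", round(jump, 4))
```

The unwindowed boundary jumps by nearly the full peak-to-peak amplitude; the tapered one does not jump at all.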

  • I might suggest showing this with several cycles of sinusoid between the discontinuities. It's only 1.5 cycles. It should be 10.5 or something like that. Commented Apr 1 at 2:33
  • @Hilmar I’m relatively comfortable with the whole idea of the time-frequency trade-off/uncertainty principle, but it’s not clear to me precisely why this applies to something like the “length” of a discontinuity. Aside from the principle being fundamental and ubiquitous in Fourier analysis, could you go into more detail on why and how it’s relevant here? Commented Apr 5 at 23:31

Yes, the DFT is calculating how to recreate the signal by setting magnitudes & phases of all the components. It's recreating the signal that it is given, including that hiccup where the phase of the wave undergoes a step change. To you that hiccup doesn't exist, because the start of the signal and the end of the signal are separate, unconnected moments -- but to the DFT, which is calculating the Fourier series of a periodic signal, the end wraps straight back around to the start.

It sounds like what you want to do is to find, experimentally, the DTFT: the Fourier transform of an infinitely long discrete-time signal. To do this using the DFT you have to approximate that infinitely long discrete-time signal by only considering a finite-length segment of it (this is wise: if you wait for an infinitely long signal to finish, your grant money will run out, or your company will go under).

When you take a finite-length sample of your infinitely-long signal, you introduce artifacts that exhibit themselves as spectral bleeding in a DFT. The DFT isn't doing anything it shouldn't do, but what it's doing is taking what you give it as one period of an infinitely long periodic signal.
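As a small sanity check of that point (the signal and length below are arbitrary choices of mine): the DFS, using all of its bins, reconstructs exactly the segment it was given, hiccup, leakage and all.

```python
import numpy as np

N = 256
n = np.arange(N)
segment = np.cos(2 * np.pi * 20.5 * n / N)   # off-bin tone: sharp "hiccup" under periodic extension

X = np.fft.fft(segment)                      # all N magnitudes and phases
recreated = np.fft.ifft(X).real              # sum of all N basis functions

print("max reconstruction error:", np.max(np.abs(recreated - segment)))  # round-off only
```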

You window your finite-length sample as a way of improving the fidelity of your approximation: with every period windowed, the resulting periodic signal has statistics that more closely resemble those of the infinite signal you are sampling.

  • Could you clarify in what way the windowing makes the periodic extension of your finite sample have statistics that more closely resemble the infinite signal you’re sampling? What statistics? Commented Apr 5 at 22:26
  • My answer is pretty much an alternate view of Hilmar's. By windowing, you suppress the artifact created by chopping off the infinite signal -- this makes the spectrum that's left more closely resemble the spectrum of the infinite signal -- assuming the infinite signal is stationary, and that you've captured enough of it. Commented Apr 6 at 0:36

I want to start by clarifying that the DFT is just a sampling of the DTFT of the time-limited sampled signal $s_n$: \begin{equation} DFT\{s_n\}(k) = DTFT\{s_n\}(2\pi k/N), \end{equation} where $s_n$ is the sampled time-domain signal, and $N$ is the number of samples. Therefore, we must first consider the DTFT in order to get the whole picture, and then we can see how the DFT will appear after computing the DTFT and sampling it at regular intervals.
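A quick numerical check of this relation (the test signal and length are arbitrary choices of mine): evaluating the DTFT sum of the finite sequence at $\omega = 2\pi k/N$ reproduces `np.fft.fft` bin for bin.

```python
import numpy as np

N = 32
rng = np.random.default_rng(0)
s = rng.standard_normal(N)                    # any time-limited sampled signal

def dtft(s, w):
    """DTFT of the finite sequence s at angular frequency w (radians per sample)."""
    n = np.arange(len(s))
    return np.sum(s * np.exp(-1j * w * n))

k = np.arange(N)
dtft_at_bins = np.array([dtft(s, 2 * np.pi * kk / N) for kk in k])
print(np.allclose(dtft_at_bins, np.fft.fft(s)))   # True
```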

After clarifying this, it should be easier to understand that the viewpoint of the "periodized signal" and the viewpoint of the window applied to the infinitely long signal are just two ways of describing the same result. The difference is that the DTFT, under the approximation of limited bandwidth, equals the actual Fourier transform of the continuous-time signal $s(t)$, so it can be seen as the more physically meaningful quantity. In fact, the core of your question is unaffected by the relationship between the sampling frequency and the frequency of the signal; as far as your question is concerned, the position of the peak in the DTFT doesn't matter.

Therefore, assuming that aliasing can be neglected (it will always be present for a time-limited signal), we can directly transpose your question to the continuous-time signal $s(t)$, which we assume to be windowed from $-T/2$ to $T/2$. (Note that I'm not making any assumption about the content of the signal, be it a simple sine or anything else.) Thus, the spectrum is the spectrum of the infinitely long signal convolved with the spectrum of the window. With this description, which in my view is more grounded in the physical signal, you can think of the window as just another signal with its own properties.

Let's start by considering a rectangular window. It has high spectral content (side-lobes that drop only as $1/f$) because it contains two infinitely steep transitions, which require high frequencies to produce. Thus, if you apply a smooth transition to zero by using any other window shape, you greatly reduce the high-frequency content of the window signal. The spectrum is then more concentrated around the origin (or around the sine's frequency in your case), but this comes at a cost: tapering reduces the effective duration of the signal, and thus the resolution. The part that is driven to zero contributes essentially nothing to the Fourier transform, which therefore has more "trouble" recognising the exact frequency of your signal.
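Here is a sketch of that comparison (the window length, zero-padding factor, and the 10.5-bin probe point are arbitrary choices of mine): the rectangular window's sidelobes come out tens of dB higher than the Hann window's at the same offset from the peak.

```python
import numpy as np

N, pad = 64, 64                          # window length and zero-padding factor
windows = {"rectangular": np.ones(N), "Hann": np.hanning(N)}

def db_spectrum(w):
    """Zero-padded magnitude spectrum of the window, normalised to 0 dB at the peak."""
    W = np.abs(np.fft.rfft(w, N * pad))
    return 20 * np.log10(W / W.max() + 1e-12)

for name, w in windows.items():
    S = db_spectrum(w)
    level = S[int(10.5 * pad)]           # sidelobe level 10.5 DFT bins away from the peak
    print(name, "level 10.5 bins from the peak:", round(level, 1), "dB")
```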

As a last note, which probably answers your question more concisely than everything written above, I want to pinpoint one aspect of your question. You ask why tapering makes it so that we can reconstruct the signal with a smaller interval of frequencies. However, the spectrum you're seeing is the spectrum of the windowed signal, hence you will reconstruct the windowed signal if you use this spectrum, not the original one.


Here's the closest answer I got. Let's say our signal is a pure sinusoid with frequency directly between bins. To make the analysis simpler, let's assume it is a complex sinusoid; two of these added together let us easily recover the case of the purely real sinusoid. First, let's look at the DFT output for the frequency bins on either side. We know that the magnitude will be higher in these two bins than in any other, and we'll leave it at that for now. The phase in both bins will be such that the corresponding DFS component is in phase with the signal in the middle of the window. Why? Well, the phase of a given bin is a (circular) average of the phase offsets between that bin's basis function and the signal. If our signal starts at phase $\phi$, then the phase offset in either bin starts at $\phi - 0$ (since the basis function starts at zero phase) and then linearly increases or decreases (depending on whether that bin has a slower or faster frequency than the signal). By the end of the window, the signal's phase will have advanced through an integer and a half cycles (since its frequency is directly between bins), so it ends at phase $\phi + \pi$. Thus the bin at the slower frequency has a phase which is the average of phase offsets varying linearly from $\phi$ to $\phi + \pi$, whereas the bin at the faster frequency has the average of phase offsets varying linearly from $\phi$ to $\phi - \pi$. Of course, $\phi + \pi = \phi - \pi$ once periodicity is taken into account, but the phase offsets travel in different directions around the unit circle to get there, so their averages differ: one is $\phi + \pi/2$ while the other is $\phi - \pi/2$.

Consider the bin with the faster frequency, whose corresponding DFS component has phase $\phi - \pi/2$. If you line it up against the signal, it lags the signal by $\pi/2$ at the beginning. By the end, it will have gone through an integer number of cycles (say $M$), whereas the signal will have gone through $M - 1/2$ cycles (again, by virtue of the signal's frequency being directly between bins). Thus it lines up with the signal in the middle of the window. The same is true of the slower frequency.
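Here is a quick numerical check of that phase claim (the parameters are arbitrary choices of mine; the result matches $\pm\pi/2$ only up to a small correction of order $\pi/(2N)$):

```python
import numpy as np

N, k0, phi = 64, 10, 0.7                  # arbitrary length, bin, and starting phase
n = np.arange(N)
x = np.exp(1j * (2 * np.pi * (k0 + 0.5) * n / N + phi))   # exactly between bins k0 and k0+1

X = np.fft.fft(x)
print("slower bin phase minus phi:", np.angle(X[k0]) - phi)      # ~ +pi/2 (up to ~pi/(2N))
print("faster bin phase minus phi:", np.angle(X[k0 + 1]) - phi)  # ~ -pi/2 (up to ~pi/(2N))
```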

Now, let's see what we've got so far for our DFS representation. We have two components at some magnitude whose phases line up in the middle of the time window. Since one completes $M$ cycles and the other completes $M-1$ cycles during the time window, they complete $M/2$ and $M/2 - 1/2$ cycles during half the time window. Since they are perfectly in phase in the middle of the time window, that means they are perfectly out of phase by the end of the window! So they are interfering destructively at the ends of the time window. If we just take these two components, our partial-DFS representation will be pretty close to our signal at the middle of the window. After all, the two components are not only in phase with each other, but also in phase with the original signal in the middle of the window. The faster frequency, which started out lagging the signal by $\pi/2$, undergoes $M/2$ cycles while the signal undergoes $(M-1/2)/2$ cycles between the beginning and the middle of the window, a quarter cycle fewer, perfectly making up for that $\pi/2$ lag. But at the ends of the window our partial-DFS representation will be essentially zero due to the destructive interference! Not even close to our original signal. We're going to need lots of magnitude in the other bins to make everything work out right.
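A sketch of that two-bin partial reconstruction (same illustrative parameters as above): keeping only the two straddling bins and inverting, the partial sum comes out phase-aligned with the signal in the middle of the window (somewhat larger in magnitude) and nearly cancels at the edges.

```python
import numpy as np

N, k0, phi = 64, 10, 0.7
n = np.arange(N)
x = np.exp(1j * (2 * np.pi * (k0 + 0.5) * n / N + phi))   # between-bins complex sinusoid

X = np.fft.fft(x)
X_two = np.zeros_like(X)
X_two[[k0, k0 + 1]] = X[[k0, k0 + 1]]     # keep only the two workhorse bins
partial = np.fft.ifft(X_two)              # their contribution to the DFS reconstruction

mid, edge = N // 2, 0
print("middle: signal", np.round(x[mid], 2), " two-bin partial", np.round(partial[mid], 2))
print("edge:   signal", np.round(x[edge], 2), " two-bin partial", np.round(partial[edge], 2))
```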

On the other hand, if our signal naturally tapers to zero, by virtue of being a sinusoid modulated by a window function, our work is almost done! We still need a bit of magnitude in the other bins to make everything perfect, but we don't have this glaring issue of destructive interference with our two workhorse bins. The previous analysis becomes a little more complicated, since the circular average of phase offsets is now a weighted circular average, with the phase offsets at the edges weighted less. But since the down-weighting is symmetric at the beginning and the end, the average phase offset still lands exactly in the middle of its linear variation from $\phi$ to $\phi \pm \pi$, and thus is $\phi \pm \pi/2$ as before.
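And a rough check of this last claim (the Hann window and the few-bins cutoff are my own arbitrary choices): the handful of bins nearest the true frequency reconstructs the tapered signal almost perfectly, while the same bins leave a much larger residual for the untapered signal.

```python
import numpy as np

N, k0, phi = 64, 10, 0.7
n = np.arange(N)
x = np.exp(1j * (2 * np.pi * (k0 + 0.5) * n / N + phi))   # between-bins complex sinusoid

def residual_with_few_bins(sig, keep):
    """Relative error when the DFS keeps only the bins listed in `keep`."""
    X = np.fft.fft(sig)
    X_few = np.zeros_like(X)
    X_few[keep] = X[keep]
    return np.linalg.norm(sig - np.fft.ifft(X_few)) / np.linalg.norm(sig)

keep = np.arange(k0 - 2, k0 + 4)          # the six bins nearest the true frequency
print("untapered residual:   ", round(residual_with_few_bins(x, keep), 3))
print("Hann-tapered residual:", round(residual_with_few_bins(x * np.hanning(N), keep), 3))
```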

There’s still a bit of handwaving, but this is otherwise the kind of explanation I was looking for in my original question. Any criticism is welcome!

