Revisions to STFT and DWT (Wavelets)

Added implementation advice.

edited Nov 19, 2013 at 16:17

145
3

$$\sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle \left\langle w_{x_0,k_0} \right| = \text{identity}$$$$\sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle \frac{1}{\sqrt{k_0}} \left\langle w_{x_0,k_0} \right| = \text{identity}$$

$$\text{my_filter} = \sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle f(x_0,k_0) \left\langle w_{x_0,k_0} \right|$$

Update 2013-11-19: Adding implementation details below as requested.

For some signal $f(x)$, we wish to compute coefficients:

$$c_{x_0,k_0} = \left\langle w_{x_0,k_0} | f \right\rangle$$

For a fixed $k_0$, $c{x_0,k_0}$ may be viewed as a function of $x_0$, and this function is simply a filtered version of $f$. Specifically, it is a convolution of $f$ with $w_{0,k_0}$, which we can compute efficiently using a Fourier method. Thus, we may efficiently compute all the $c_{x_0,k_0}$ as follows:

Apply a Fourier transform to $f$ to obtain $\hat f$. Probably you want to do this a window at a time, with enough overlap to be able to throw away the windowing artifacts, etc, but for simplicity let's suppose you do the whole signal at once, and that its length is a power of two.

For each $k_0$ in some geometric progression with spacing about $1/4$ the filter bandwidth (or finer if you want):
- Form the product of $\hat f$ with $\hat w_{0,k_0}$.
- Truncate the spectrum to some interval $[k_l,k_r)$ whose length is a power of two, and that contains the non-zero portion of $\hat w_{0,k_0}$.
- Apply an inverse Fourier transform to that.
- Multiply that by $exp(ix\frac{k_l+k_r}{2})$ to correct the phase. The result is $c_{x_0,k_0}$ viewed as a function of $x_0$.

This computes all the wavelet coefficients. You can choose the resolution in $k_0$ by tuning the ratio in the geometric progression. The resolution in $x_0$ is set by the length of the truncated spectrum, and will change depending on the bandwidth of $w_{0,k_0}$, which in turn depends on $k_0$. The computational effort is one Fourier transform at high time resolution, plus one inverse Fourier transform for each $k_0$ value at much lower time resolution. It works out about the same as STFT - maybe slower by some small factor that depends on the resolution you choose.

You can then modify the $c_{x_0,k_0}$ as you see fit, and you can reconstruct the signal by reversing the above process, summing the spectra over $k_0$ before finally doing an overall inverse Fourier transform.

Truncating spectra sometimes introduces normalisation problems, depending on precisely how your FFT is defined. I will not attempt to cover all the possibilities here. Normalisation is basically an easy problem. ;-)

The only part that remains is to choose a suitable wavelet envelope. It turns out it is easier to get $\hat w_{x_0,k_0}(k)$ right than to get $w_{x_0,k_0}(x)$ right. A suitable definition (from many possibilities) is:

$$\hat w_{x0,k0} = A exp(-i(k-k_0)x_0) exp(-(Q log(k/k_0))^2)$$

in which $Q$ is a dimensionless constant that selects the bandwidth of your filter, i.e. the frequency resolution of your wavelets, and $A$ is chosen as necessary for normalisation. With this definition and sufficiently high resolution for $k_0$, the overcompleteness condition holds, and the signal reconstruction will work.

$$\sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle \left\langle w_{x_0,k_0} \right| = \text{identity}$$

$$\text{my_filter} = \sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle f(x_0,k_0) \left\langle w_{x_0,k_0} \right|$$

$$\sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle \frac{1}{\sqrt{k_0}} \left\langle w_{x_0,k_0} \right| = \text{identity}$$

$$\text{my_filter} = \sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle f(x_0,k_0) \left\langle w_{x_0,k_0} \right|$$

Update 2013-11-19: Adding implementation details below as requested.

For some signal $f(x)$, we wish to compute coefficients:

$$c_{x_0,k_0} = \left\langle w_{x_0,k_0} | f \right\rangle$$

For a fixed $k_0$, $c{x_0,k_0}$ may be viewed as a function of $x_0$, and this function is simply a filtered version of $f$. Specifically, it is a convolution of $f$ with $w_{0,k_0}$, which we can compute efficiently using a Fourier method. Thus, we may efficiently compute all the $c_{x_0,k_0}$ as follows:

Apply a Fourier transform to $f$ to obtain $\hat f$. Probably you want to do this a window at a time, with enough overlap to be able to throw away the windowing artifacts, etc, but for simplicity let's suppose you do the whole signal at once, and that its length is a power of two.

For each $k_0$ in some geometric progression with spacing about $1/4$ the filter bandwidth (or finer if you want):
- Form the product of $\hat f$ with $\hat w_{0,k_0}$.
- Truncate the spectrum to some interval $[k_l,k_r)$ whose length is a power of two, and that contains the non-zero portion of $\hat w_{0,k_0}$.
- Apply an inverse Fourier transform to that.
- Multiply that by $exp(ix\frac{k_l+k_r}{2})$ to correct the phase. The result is $c_{x_0,k_0}$ viewed as a function of $x_0$.

This computes all the wavelet coefficients. You can choose the resolution in $k_0$ by tuning the ratio in the geometric progression. The resolution in $x_0$ is set by the length of the truncated spectrum, and will change depending on the bandwidth of $w_{0,k_0}$, which in turn depends on $k_0$. The computational effort is one Fourier transform at high time resolution, plus one inverse Fourier transform for each $k_0$ value at much lower time resolution. It works out about the same as STFT - maybe slower by some small factor that depends on the resolution you choose.

You can then modify the $c_{x_0,k_0}$ as you see fit, and you can reconstruct the signal by reversing the above process, summing the spectra over $k_0$ before finally doing an overall inverse Fourier transform.

Truncating spectra sometimes introduces normalisation problems, depending on precisely how your FFT is defined. I will not attempt to cover all the possibilities here. Normalisation is basically an easy problem. ;-)

The only part that remains is to choose a suitable wavelet envelope. It turns out it is easier to get $\hat w_{x_0,k_0}(k)$ right than to get $w_{x_0,k_0}(x)$ right. A suitable definition (from many possibilities) is:

$$\hat w_{x0,k0} = A exp(-i(k-k_0)x_0) exp(-(Q log(k/k_0))^2)$$

in which $Q$ is a dimensionless constant that selects the bandwidth of your filter, i.e. the frequency resolution of your wavelets, and $A$ is chosen as necessary for normalisation. With this definition and sufficiently high resolution for $k_0$, the overcompleteness condition holds, and the signal reconstruction will work.

added 2 characters in body; added 1 characters in body; added 4 characters in body; edited body

Source Link

edited Nov 18, 2013 at 23:25

apt1002

145
3

There are many ways of defining a wavelet basis. Typically a wavelet looks something like:

$$w_{x_0,k_0}(x) = A\exp(ik_0x)w(k_0(x-x_0))$$$$w_{x_0,k_0}(x) = A\exp(ik_0x)e(k_0(x-x_0))$$

In which $x_0$ is the centre in time, $k_0$ is the centre in frequency, and $w$$e$ is a window function. $A$ absorbs the phase and normalisation. The main way in which this differs from your STFT is that the width of the window depends on $k$.

Usually one uses discrete points $(x_0,k_0)$ to limit the number of wavelets. As in your STFT, it is often good to use a larger number of points than the dimensionality of the signal. The aim is to approximate the answer you'd get by using all possible $(x_0,k_0)$, but with reasonable computational resources.

Because the dimensionality of the transformed data exceeds that of the signal, the wavelet basis will not be orthonormal. I.e. the following will be false:

$$\left\langle w_{k_0, x_0} | w_{k_0',x_0'} \right\rangle = \delta(x_0,x_0')\delta(k_0,k_0')$$

However, for suitable $A$ and $w$, you can arrange for the basis to be overcomplete:

$$\sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle \left\langle w_{x_0,k_0} \right| = \mathbb1$$$$\sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle \left\langle w_{x_0,k_0} \right| = \text{identity}$$

In other words, you can reconstruct the signal perfectly just by adding up its constituent wavelets.

Your "modification" can simply be inserted in the above sum:

$$\text{my_filter} = \sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle f(x_0,k_0) \left\langle w_{x_0,k_0} \right|$$

There are many ways of defining a wavelet basis. Typically a wavelet looks something like:

$$w_{x_0,k_0}(x) = A\exp(ik_0x)w(k_0(x-x_0))$$

In which $x_0$ is the centre in time, $k_0$ is the centre in frequency, and $w$ is a window function. $A$ absorbs the phase and normalisation. The main way in which this differs from your STFT is that the width of the window depends on $k$.

Usually one uses discrete points $(x_0,k_0)$ to limit the number of wavelets. As in your STFT, it is often good to use a larger number of points than the dimensionality of the signal. The aim is to approximate the answer you'd get by using all possible $(x_0,k_0)$, but with reasonable computational resources.

Because the dimensionality of the transformed data exceeds that of the signal, the wavelet basis will not be orthonormal. I.e. the following will be false:

$$\left\langle w_{k_0, x_0} | w_{k_0',x_0'} \right\rangle = \delta(x_0,x_0')\delta(k_0,k_0')$$

However, for suitable $A$ and $w$, you can arrange for the basis to be overcomplete:

$$\sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle \left\langle w_{x_0,k_0} \right| = \mathbb1$$

In other words, you can reconstruct the signal perfectly just by adding up its constituent wavelets.

Your "modification" can simply be inserted in the above sum:

$$\text{my_filter} = \sum_{x_0,k_0} \left| w_{x_0,k_0} \right\rangle f(x_0,k_0) \left\langle w_{x_0,k_0} \right|$$

There are many ways of defining a wavelet basis. Typically a wavelet looks something like:

$$w_{x_0,k_0}(x) = A\exp(ik_0x)e(k_0(x-x_0))$$

In which $x_0$ is the centre in time, $k_0$ is the centre in frequency, and $e$ is a window function. $A$ absorbs the phase and normalisation. The main way in which this differs from your STFT is that the width of the window depends on $k$.