24
$\begingroup$

I have a signal that is measured at 100Hz and I need to apply the Savitzky-Golay smoothing filter on this signal. However, on closer inspection my signal is not measured at perfectly constant rate, the delta between measurements ranges between 9.7 and 10.3 ms.

Is there a way to use the Savitzky-Golay filter on not equally spaced data? Are there other methods that I could apply?

$\endgroup$
1
  • 1
    $\begingroup$ A 1991 paper by Gorry is on pretty much this exact subject datapdf.com/…. But tldr, datageist's answer is the right main idea (local least-squares). What Gorry observes is that the coefficients depend on the independent variables only and are linear in the dependent variables (like Savitzky-Golay). Then he gives a way to compute them, but if you're not writing an optimized library, any old least-squares fitter could be used. $\endgroup$ Commented Aug 24, 2019 at 7:09

6 Answers 6

5
$\begingroup$

One method would be to resample your data so that it is equally spaced, then you can do whatever processing you like. Bandlimited resampling using linear filtering isn't going to be a good option since the data isn't uniformly spaced, so you could use some sort of local polynomial interpolation (e.g. cubic splines) to estimate what the underlying signal's values are at "exact" 10-millisecond intervals.

$\endgroup$
3
  • $\begingroup$ I had this solution in mind as a last resort. I'm wondering if in the end this approach gives a better solution than just assuming that my signal is measured at a constant rate. $\endgroup$ Commented Mar 9, 2012 at 15:13
  • $\begingroup$ I think even if it's non-uniformly sampled you can still use sinc() interpolation (or a a different highly sampled low pass filter). This may give better results than a spline or a pchip $\endgroup$ Commented Mar 10, 2012 at 16:42
  • 1
    $\begingroup$ @Hilmar: You're correct. There are a number of ways you could resample the data; approximate sinc interpolation would be the "ideal" method for bandlimited resampling. $\endgroup$ Commented Mar 10, 2012 at 16:56
24
$\begingroup$

Because of the way the Savitzky-Golay filter is derived (i.e as local least-squares polynomial fits), there's a natural generalization to nonuniform sampling--it's just much more computationally expensive.

Savitzky-Golay Filters in General

For the standard filter, the idea is to fit a polynomial to a local set of samples [using least squares], then replace the center sample with the value of the polynomial at the center index (i.e. at 0). That means the standard SG filter coefficients can be generated by inverting a Vandermonde matrix of sample indicies. For example, to generate a local parabolic fit across five samples $y_0\dots y_4$ (with local indicies -2,-1,0,1,2), the system of design equations $Ac = y$ would be as follows:

$$ \begin{bmatrix}-2^0 & -2^1 & -2^2 \\ -1^0 & -1^1 & -1^2 \\ 0^0 & 0^1 & 0^2 \\ 1^0 & 1^1 & 1^2 \\ 2^0 & 2^1 & 2^2 \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}. $$

In the above, the $c_0 \dots c_2$ are the unknown coefficients of the least squares polynomial $c_0 + c_1x + c_2x^2$. Since the value of the polynomial at $x = 0$ is just $c_0$, computing the pseudoinverse of the design matrix (i.e. $c = (A^TA)^{-1}A^T y\space $) will yield the SG filter coefficients in the top row. In this case, they would be

$$ \begin{bmatrix}c_0 \\ c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} -3 & 12 & 17 & 12 & -3 \\ -7 & -4 & 0 & 4 & 7 \\ 5 & -3 & -5 & -3 & 5 \\ \end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}. $$

Note that since the derivative of $c_0 + c_1x + c_2x^2$ is $c_1 + 2c_2x$, the second row of the matrix (which evaluates $c_1$) will be a smoothed derivative filter. The same argument applies for successive rows--they give smoothed higher-order derivatives. Note that I scaled the matrix by 35 so the first row would match the smoothing coefficients given on Wikipedia (above). The derivative filters each differ by other scaling factors.

Nonuniform Sampling

When the samples are evenly spaced, the filter coefficients are translation-invariant, so the result is just an FIR filter. For nonuniform samples, the coefficients will differ based on the local sample spacing, so the design matrix will need to be constructed and inverted at each sample. If the nonuniform sample times are $x_n$, and we construct local coordinates $t_n$ with each center sample time fixed at $0$, i.e.

$$ \begin{align} t_{-2} & = x_{-2} - x_0 \\ t_{-1} & = x_{-1} - x_0 \\ t_0 & = x_0 - x_0 \\ t_1 & = x_1 - x_0 \\ t_2 & = x_2 - x_0 \end{align} $$

then each design matrix will be of the following form:

$$ A = \begin{bmatrix} t_{-2}^0 & t_{-2}^1 & t_{-2}^2 \\ t_{-1}^0 & t_{-1}^1 & t_{-1}^2 \\ t_0^0 & t_0^1 & t_0^2 \\ t_1^0 & t_1^1 & t_1^2 \\ t_2^0 & t_2^1 & t_2^2 \end{bmatrix} = \begin{bmatrix} 1 & t_{-2} & t_{-2}^2 \\ 1 & t_{-1} & t_{-1}^2 \\ 1 & 0 & 0 \\ 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \end{bmatrix}. $$

The first row of the pseudoinverse of $A$ dotted with the local sample values will yield $c_0$, the smoothed value at that sample.

$\endgroup$
3
  • $\begingroup$ sounds like it moves from O(log(n)) to O(n^2). $\endgroup$ Commented Feb 12, 2019 at 21:29
  • $\begingroup$ Here's an implementation of Scala described by datageist upwards. $\endgroup$ Commented May 9, 2019 at 2:12
  • 1
    $\begingroup$ @Mediumcore You didn't add a link to your original post. Also, I deleted it because it didn't provide an answer to the question. Please try to edit datageist's post to add a link; it'll be moderated in after review. $\endgroup$ Commented May 9, 2019 at 13:41
10
$\begingroup$

As techwinder did in C++, I used datageist's algorithm and implemented it in Python. Maybe this will help somebody in the future.

import numpy as np def non_uniform_savgol(x, y, window, polynom): """ Applies a Savitzky-Golay filter to y with non-uniform spacing as defined in x This is based on https://dsp.stackexchange.com/questions/1676/savitzky-golay-smoothing-filter-for-not-equally-spaced-data The borders are interpolated like scipy.signal.savgol_filter would do Parameters ---------- x : array_like List of floats representing the x values of the data y : array_like List of floats representing the y values. Must have same length as x window : int (odd) Window length of datapoints. Must be odd and smaller than x polynom : int The order of polynom used. Must be smaller than the window size Returns ------- np.array of float The smoothed y values """ if len(x) != len(y): raise ValueError('"x" and "y" must be of the same size') if len(x) < window: raise ValueError('The data size must be larger than the window size') if type(window) is not int: raise TypeError('"window" must be an integer') if window % 2 == 0: raise ValueError('The "window" must be an odd integer') if type(polynom) is not int: raise TypeError('"polynom" must be an integer') if polynom >= window: raise ValueError('"polynom" must be less than "window"') half_window = window // 2 polynom += 1 # Initialize variables A = np.empty((window, polynom)) # Matrix tA = np.empty((polynom, window)) # Transposed matrix t = np.empty(window) # Local x variables y_smoothed = np.full(len(y), np.nan) # Start smoothing for i in range(half_window, len(x) - half_window, 1): # Center a window of x values on x[i] for j in range(0, window, 1): t[j] = x[i + j - half_window] - x[i] # Create the initial matrix A and its transposed form tA for j in range(0, window, 1): r = 1.0 for k in range(0, polynom, 1): A[j, k] = r tA[k, j] = r r *= t[j] # Multiply the two matrices tAA = np.matmul(tA, A) # Invert the product of the matrices tAA = np.linalg.inv(tAA) # Calculate the pseudoinverse of the design matrix coeffs = np.matmul(tAA, tA) # Calculate c0 which is also the y value for y[i] y_smoothed[i] = 0 for j in range(0, window, 1): y_smoothed[i] += coeffs[0, j] * y[i + j - half_window] # If at the end or beginning, store all coefficients for the polynom if i == half_window: first_coeffs = np.zeros(polynom) for j in range(0, window, 1): for k in range(polynom): first_coeffs[k] += coeffs[k, j] * y[j] elif i == len(x) - half_window - 1: last_coeffs = np.zeros(polynom) for j in range(0, window, 1): for k in range(polynom): last_coeffs[k] += coeffs[k, j] * y[len(y) - window + j] # Interpolate the result at the left border for i in range(0, half_window, 1): y_smoothed[i] = 0 x_i = 1 for j in range(0, polynom, 1): y_smoothed[i] += first_coeffs[j] * x_i x_i *= x[i] - x[half_window] # Interpolate the result at the right border for i in range(len(x) - half_window, len(x), 1): y_smoothed[i] = 0 x_i = 1 for j in range(0, polynom, 1): y_smoothed[i] += last_coeffs[j] * x_i x_i *= x[i] - x[-half_window - 1] return y_smoothed 

Non-uniform Savitsky Golay filter in Python

$\endgroup$
5
$\begingroup$

"As a cheap alternative, one can simply pretend that the data points are equally spaced ...
if the change in $f$ across the full width of the $N$ point window is less than $\sqrt{N/2}$ times the measurement noise on a single point, then the cheap method can be used."
$\qquad -$ Numerical Recipes pp. 771-772

(derivation anyone ?)

("Pretend equally spaced" means:
take the nearest $\pm N/2$ points around each $t$ where you want SavGol($t$),
not snap all $t_i \to i$ . That may be obvious, but got me for a while.)

$\endgroup$
2
$\begingroup$

I found out, that there are two ways to use the savitzky-golay algorithm in Matlab. Once as a filter, and once as a smoothing function, but basically they should do the same.

  1. yy = sgolayfilt(y,k,f): Here, the values y=y(x) are assumed to be equally spaced in x.
  2. yy = smooth(x,y,span,'sgolay',degree): Here you can have x as an extra input and referring to the Matlab help x does not have to be equally spaced!
$\endgroup$
2
$\begingroup$

If it's of any help, I've made a C implementation of the method described by datageist. Free to use at your own risk.

/** * @brief smooth_nonuniform * Implements the method described in https://dsp.stackexchange.com/questions/1676/savitzky-golay-smoothing-filter-for-not-equally-spaced-data * free to use at the user's risk * @param n the half size of the smoothing sample, e.g. n=2 for smoothing over 5 points * @param the degree of the local polynomial fit, e.g. deg=2 for a parabolic fit */ bool smooth_nonuniform(uint deg, uint n, std::vector<double>const &x, std::vector<double> const &y, std::vector<double>&ysm) { if(x.size()!=y.size()) return false; // don't even try if(x.size()<=2*n) return false; // not enough data to start the smoothing process // if(2*n+1<=deg+1) return false; // need at least deg+1 points to make the polynomial int m = 2*n+1; // the size of the filter window int o = deg+1; // the smoothing order std::vector<double> A(m*o); memset(A.data(), 0, m*o*sizeof(double)); std::vector<double> tA(m*o); memset(tA.data(), 0, m*o*sizeof(double)); std::vector<double> tAA(o*o); memset(tAA.data(), 0, o*o*sizeof(double)); std::vector<double> t(m); memset(t.data(), 0, m* sizeof(double)); std::vector<double> c(o); memset(c.data(), 0, o* sizeof(double)); // do not smooth start and end data int sz = y.size(); ysm.resize(sz); memset(ysm.data(), 0,sz*sizeof(double)); for(uint i=0; i<n; i++) { ysm[i]=y[i]; ysm[sz-i-1] = y[sz-i-1]; } // start smoothing for(uint i=n; i<x.size()-n; i++) { // make A and tA for(int j=0; j<m; j++) { t[j] = x[i+j-n] - x[i]; } for(int j=0; j<m; j++) { double r = 1.0; for(int k=0; k<o; k++) { A[j*o+k] = r; tA[k*m+j] = r; r *= t[j]; } } // make tA.A matMult(tA.data(), A.data(), tAA.data(), o, m, o); // make (tA.A)-¹ in place if (o==3) { if(!invert33(tAA.data())) return false; } else if(o==4) { if(!invert44(tAA.data())) return false; } else { if(!inverseMatrixLapack(o, tAA.data())) return false; } // make (tA.A)-¹.tA matMult(tAA.data(), tA.data(), A.data(), o, o, m); // re-uses memory allocated for matrix A // compute the polynomial's value at the center of the sample ysm[i] = 0.0; for(int j=0; j<m; j++) { ysm[i] += A[j]*y[i+j-n]; } } std::cout << " x y y_smoothed" << std::endl; for(uint i=0; i<x.size(); i++) std::cout << " " << x[i] << " " << y[i] << " "<< ysm[i] << std::endl; return true; } 

smoothing

$\endgroup$

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.