2
$\begingroup$

I wrote C++ phase vocoder to change pitch based on MATLAB code from DAFX book. MATLAB code and sample audio file are here. I tested both on a pure sine wave and MATLAB code outputs good result, but C++ adds some buzzing. I spent days trying to understand why and I would appreciate your thoughts.

Here are relevant C++ methods. They use other classes, but the code excerpts below should be clear enough. The first method pitchShift is called for every FFTWindowLength = 1024 frames read from a file. The second method synthesizeHop is called from pitchShift for every hop.

vector<double> PitchShiftResampling::pitchShift(long numSampsToProcess, AudioBuffer audioBuffer){ static double fftProcessData[MAX_FRAME_LENGTH]; long i, s, k; long gRover = PitchShiftResampling::inFifoLatency; vector<double> output(numSampsToProcess); // TODO: Make it work for stereo. For now, work with only one channel vector<double> indata = audioBuffer.getSampleData(0); FFT* fft = new FFT(fftFrameSize, audioBuffer.getNumChannels()); /* main processing loop */ for (i = 0; i < numSampsToProcess; i++){ output[i] = gOutFIFO[gRover-inFifoLatency]; /* As long as we have not yet collected enough data just read in */ gInFIFO[gRover] = indata[i]; gRover++; /* now we have enough data for processing */ if (gRover >= fftFrameSize) { gRover = inFifoLatency; /* do windowing */ for (s = 0; s < fftFrameSize; s++) { window = 0.5 * (1.0 - cos (2.0*M_PI*(double)s/(double)(fftFrameSize))); fftProcessData[s] = window * gInFIFO[s]; } /* do FFT centering */ Util::fftshift(fftProcessData, MAX_FRAME_LENGTH); /* do FFT */ fft->computeFFTFrameData(fftProcessData); synthesizeHop(fft); /* move input FIFO */ for (k = 0; k < inFifoLatency; k++) { gInFIFO[k] = gInFIFO[k+stepSize]; } } } delete fft; return output; 

}

void PitchShiftResampling::synthesizeHop(FFT *fft) { int k; vector<double> omega; for (k = 0; k < fftFrameSize; k++) { omega.push_back(2 * M_PI * stepSize * k / fftFrameSize); } for (k = 0; k <= fftFrameSize2; k++) { double currentAnaPhase = fft->phaseMagn->phas[0][k]; double deltaPhi = omega[k] + Util::princarg(currentAnaPhase - prevAnaPhase[k] - omega[k]); double currentSynPhase = Util::princarg(prevSynPhase[k] + deltaPhi * pitchShift); double magn = fft->phaseMagn->magn[0][k]; double real = magn * cos(currentSynPhase); double imag = magn * sin(currentSynPhase); // fftw library uses following format for spectrum data // r0, r1, r2, ...., r_(n/2), i_(n+1)/2-1, ...., i2, i1 // where (r0,0) is f_0, (r1,i1) is f_1, and so on (f_0 does not have imaginary part because the input array is real). if (k == 0) { fft->fftFrameSpectrum->currentChannelData->spectrum[k] = real; } if (k > 0) { fft->fftFrameSpectrum->currentChannelData->spectrum[k] = real; fft->fftFrameSpectrum->currentChannelData->spectrum[fftFrameSize - k] = imag; } prevAnaPhase[k] = currentAnaPhase; prevSynPhase[k] = currentSynPhase; } /* IFFT */ fft->invertFFT(1); Util::fftshift(fft->fftFrameSpectrum->currentChannelData->output, fftFrameSize); vector<double> grain; /* do windowing and add to output accumulator */ for(k=0; k < fftFrameSize; k++) { window = 0.5 * (1.0 - cos (2.0*M_PI*(double)k/(double)(fftFrameSize))); grain.push_back(window * fft->fftFrameSpectrum->currentChannelData->output[k] / fftFrameSize); } vector<double> grain2; vector<double> grain3; for(k=0; k < fftFrameSize; k++) { grain2.push_back(grain[k]); } grain2.push_back(0); for(k=0; k < lx; k++) { double a = grain2[ix[k]]; double val = a * dx1[k] + grain2[ix1[k]] * dx[k]; grain3.push_back(val); } for(k=0; k < lx; k++) { gOutputAccum[k] += grain3[k]; } for (k = 0; k < stepSize; k++) { gOutFIFO[k] = gOutputAccum[k]; } /* shift accumulator */ memmove(gOutputAccum, gOutputAccum+stepSize, fftFrameSize*2*sizeof(float)); 

}

I tested princarg, fftshift functions, made sure that I correctly assign values to fftw library spectrum data structure. I also made sure that overlap add is implemented correctly and that interpolation part is equivalent to MATLAB code.

Here is a screenshot of a resulting sine wave, processed in C++ using 0.8 pitch shift ratio (input was 440Hz sine wave):

440Hz sine wave processed in C++ using 0.8 pitch shift ratio

There are glitches appearing in sine wave every $2.5$ cycles, even for different pitch shift ratios. Glitches become worse in C++ as the pitch ratio decreases below 1, but sine waves processed in MATLAB look almost perfect.

Any hints or ideas about where to look for error are greatly appreciated!

EDIT: Problem solved

Thanks guys! I suspected there is an issue related to the overlap and it turned out my fftshift function had a bug. Apparently I didn't test it carefully enough. Shifting was off by one which resulted in that glitch in every hop.

I replaced

std::rotate(&in[0], &in[n - n2], &in[n - 1]); 

with:

for (int i = 0; i < n2; i++) { double tmp = in[i]; in[i] = in[i + n2]; in[i + n2] = tmp; } 

where n is the length of the array and n2 = n / 2 and the buzzing is gone!

$\endgroup$
1
  • $\begingroup$ Hope is FFTsize/4 ? if the glitches appear always every 2.5 cycles, your problem seems to be in the overlap point position. $\endgroup$ Commented Dec 14, 2016 at 23:37

1 Answer 1

4
$\begingroup$

you might have a windowing offset error. that it's so skinny, might be an offset of one. this is sometimes called the "fence-post error" or "off-by-one" error because MATLAB and C++ might look at indexing slightly differently.

$\endgroup$

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.