I wrote C++ phase vocoder to change pitch based on MATLAB code from DAFX book. MATLAB code and sample audio file are here. I tested both on a pure sine wave and MATLAB code outputs good result, but C++ adds some buzzing. I spent days trying to understand why and I would appreciate your thoughts.
Here are relevant C++ methods. They use other classes, but the code excerpts below should be clear enough. The first method pitchShift is called for every FFTWindowLength = 1024 frames read from a file. The second method synthesizeHop is called from pitchShift for every hop.
vector<double> PitchShiftResampling::pitchShift(long numSampsToProcess, AudioBuffer audioBuffer){ static double fftProcessData[MAX_FRAME_LENGTH]; long i, s, k; long gRover = PitchShiftResampling::inFifoLatency; vector<double> output(numSampsToProcess); // TODO: Make it work for stereo. For now, work with only one channel vector<double> indata = audioBuffer.getSampleData(0); FFT* fft = new FFT(fftFrameSize, audioBuffer.getNumChannels()); /* main processing loop */ for (i = 0; i < numSampsToProcess; i++){ output[i] = gOutFIFO[gRover-inFifoLatency]; /* As long as we have not yet collected enough data just read in */ gInFIFO[gRover] = indata[i]; gRover++; /* now we have enough data for processing */ if (gRover >= fftFrameSize) { gRover = inFifoLatency; /* do windowing */ for (s = 0; s < fftFrameSize; s++) { window = 0.5 * (1.0 - cos (2.0*M_PI*(double)s/(double)(fftFrameSize))); fftProcessData[s] = window * gInFIFO[s]; } /* do FFT centering */ Util::fftshift(fftProcessData, MAX_FRAME_LENGTH); /* do FFT */ fft->computeFFTFrameData(fftProcessData); synthesizeHop(fft); /* move input FIFO */ for (k = 0; k < inFifoLatency; k++) { gInFIFO[k] = gInFIFO[k+stepSize]; } } } delete fft; return output; }
void PitchShiftResampling::synthesizeHop(FFT *fft) { int k; vector<double> omega; for (k = 0; k < fftFrameSize; k++) { omega.push_back(2 * M_PI * stepSize * k / fftFrameSize); } for (k = 0; k <= fftFrameSize2; k++) { double currentAnaPhase = fft->phaseMagn->phas[0][k]; double deltaPhi = omega[k] + Util::princarg(currentAnaPhase - prevAnaPhase[k] - omega[k]); double currentSynPhase = Util::princarg(prevSynPhase[k] + deltaPhi * pitchShift); double magn = fft->phaseMagn->magn[0][k]; double real = magn * cos(currentSynPhase); double imag = magn * sin(currentSynPhase); // fftw library uses following format for spectrum data // r0, r1, r2, ...., r_(n/2), i_(n+1)/2-1, ...., i2, i1 // where (r0,0) is f_0, (r1,i1) is f_1, and so on (f_0 does not have imaginary part because the input array is real). if (k == 0) { fft->fftFrameSpectrum->currentChannelData->spectrum[k] = real; } if (k > 0) { fft->fftFrameSpectrum->currentChannelData->spectrum[k] = real; fft->fftFrameSpectrum->currentChannelData->spectrum[fftFrameSize - k] = imag; } prevAnaPhase[k] = currentAnaPhase; prevSynPhase[k] = currentSynPhase; } /* IFFT */ fft->invertFFT(1); Util::fftshift(fft->fftFrameSpectrum->currentChannelData->output, fftFrameSize); vector<double> grain; /* do windowing and add to output accumulator */ for(k=0; k < fftFrameSize; k++) { window = 0.5 * (1.0 - cos (2.0*M_PI*(double)k/(double)(fftFrameSize))); grain.push_back(window * fft->fftFrameSpectrum->currentChannelData->output[k] / fftFrameSize); } vector<double> grain2; vector<double> grain3; for(k=0; k < fftFrameSize; k++) { grain2.push_back(grain[k]); } grain2.push_back(0); for(k=0; k < lx; k++) { double a = grain2[ix[k]]; double val = a * dx1[k] + grain2[ix1[k]] * dx[k]; grain3.push_back(val); } for(k=0; k < lx; k++) { gOutputAccum[k] += grain3[k]; } for (k = 0; k < stepSize; k++) { gOutFIFO[k] = gOutputAccum[k]; } /* shift accumulator */ memmove(gOutputAccum, gOutputAccum+stepSize, fftFrameSize*2*sizeof(float)); }
I tested princarg, fftshift functions, made sure that I correctly assign values to fftw library spectrum data structure. I also made sure that overlap add is implemented correctly and that interpolation part is equivalent to MATLAB code.
Here is a screenshot of a resulting sine wave, processed in C++ using 0.8 pitch shift ratio (input was 440Hz sine wave):

There are glitches appearing in sine wave every $2.5$ cycles, even for different pitch shift ratios. Glitches become worse in C++ as the pitch ratio decreases below 1, but sine waves processed in MATLAB look almost perfect.
Any hints or ideas about where to look for error are greatly appreciated!
EDIT: Problem solved
Thanks guys! I suspected there is an issue related to the overlap and it turned out my fftshift function had a bug. Apparently I didn't test it carefully enough. Shifting was off by one which resulted in that glitch in every hop.
I replaced
std::rotate(&in[0], &in[n - n2], &in[n - 1]); with:
for (int i = 0; i < n2; i++) { double tmp = in[i]; in[i] = in[i + n2]; in[i + n2] = tmp; } where n is the length of the array and n2 = n / 2 and the buzzing is gone!