Questions tagged [speech-processing]

Question 1

I have been trying to implement the complex cepstrum from scratch as part of a speech signal processing course. The hidden catch is the removal of the linear phase term, which my professor ...

Question 2

Consider a 1st order high-pass filter like this: $$y[n] = x[n] - \alpha x[n-1]$$ I found that in Praat's manual the relationship of its cutoff frequency $f_c$ and $\alpha$ is illustrated as: $$\alpha=...
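For illustration, the difference equation quoted in this excerpt can be sketched in a few lines of Python. This is only a minimal sketch: the function name `pre_emphasis` is hypothetical, the sample before the start of the signal is assumed to be zero, and the value of `alpha` is left to the caller because the excerpt's relationship between $\alpha$ and $f_c$ is truncated.

```python
def pre_emphasis(x, alpha):
    """Apply y[n] = x[n] - alpha * x[n-1] to a sequence of samples.

    Assumption: x[-1] is treated as 0, so y[0] == x[0].
    """
    y = []
    prev = 0.0  # assumed initial condition x[-1] = 0
    for sample in x:
        y.append(sample - alpha * prev)
        prev = sample
    return y


if __name__ == "__main__":
    # A constant (DC) input is attenuated after the first sample,
    # which is the high-pass behaviour the question describes.
    print(pre_emphasis([1.0, 1.0, 1.0], 0.5))
```

A constant input is reduced to `1 - alpha` after the first sample, while rapidly changing inputs pass with more gain.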

Question 3

I am implementing my HMM-GMM speech recognition model. Right now I am facing a problem described below. Given phone-level HMMs A and B, build word-level HMM C. In this question, let's assume that ...

Question 4

Using Praat to extract the bandwidth of formants, I noticed there is no option to extract the bandwidth of the pitch. Since the F0 values do not match the pitch values, I cannot apply the same method ...

Question 5

I am reading Jurafsky and Martin's Speech and Language Processing Chapter 28 on Phonetics (pages 15,16) and they introduce waveforms and spectrum. What I don't understand is how they came from a ...

Question 6

I have data on which I have performed Voice Activity Detection (VAD), which returns a file containing columns of data in the following order: Segment Id, Audio file name, Start time, End time. For ...
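One possible way to cut a segment given its start and end times, sketched with Python's standard-library `wave` module. The function name `extract_segment` and its parameters are hypothetical, the input is assumed to be an uncompressed PCM `.wav` file, and times are assumed to be in seconds.

```python
import wave


def extract_segment(src_path, dst_path, start_s, end_s):
    """Copy the frames between start_s and end_s (seconds) into a new .wav file.

    Returns the number of frames written. Times are clamped to the file length.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Convert times to frame indices and clamp them to the valid range.
        start_frame = max(0, min(int(start_s * rate), total))
        end_frame = max(start_frame, min(int(end_s * rate), total))
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)

    with wave.open(dst_path, "wb") as dst:
        # Reuse the source format; the frame count in the header is
        # corrected automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return len(frames) // (params.sampwidth * params.nchannels)
```

Calling this once per row of the VAD output, with the row's audio file name, start time, and end time, would produce one `.wav` file per segment.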

Question 7

Context: $\bar{\Theta}$ is the room regression filter coefficients (RRC); $$X_{t} = \bar{\Theta}^{H}\bar{X}_{t-1} + s_{t}$$ means in words: the filter that defines how the room causes reverberation to ...

Question 8

I'm studying the perception of vowel formants (resonances of the vocal tract) and need to create stimuli where the signal below the first (lowest) formant is removed. I have some synthesised vowels ...

Question 9

I am going through Fundamentals of Speech Recognition (Rabiner). I stumbled upon the concept of Two Level Dynamic Programming. Can you suggest any online resources for studying it?

Question 10

"The digitised speech signal $s(n)$ is put through a low order digital system (usually first order FIR filter) to spectrally flatten the signal and make it less susceptible to finite precision ...

Question 11

Dynamic time warping is applied for time normalization. As shown in the diagram, two different signals with $T_x$ and $T_y$ time instants are time-normalized to have $T$ time instants. $\phi$ is ...

Question 12

I am studying HMMs from "Fundamentals of Speech Recognition" by Rabiner. Regarding the problem of how to adjust the parameters of an HMM, the proposed method was the Baum-Welch method (Expectation-...

Question 13

I want to ask a question about waveform synthesis, or more specifically speech synthesis. Most state-of-the-art papers use mel-spectrograms as their inputs, because they mimic the human ...

Question 14

In "Mel spectrogram" or "Mel filterbanks", what does Mel mean and why is it capitalized? It doesn't seem to be the name of a person.

Question 15

After I use a deep learning algorithm to enhance the speech, the speech still has weak background noise. The background noise of this audio has little effect on the calculation of SNR, but it ...

Implementing computation of complex cepstrum from scratch

Cutoff frequency of 1st-order high-pass pre-emphasis filter?

Speech recognition. Building word-level HMM from phone-level HMMs. Transition matrix

How to Extract Pitch Bandwidth?

Interpreting spectrum from waveforms from simple and complex examples

How do I extract a part of an audio clip whose start and end times are given into a .wav file?

Deriving the posterior distribution parameters of a normal distribution in the context of dereverberation

A filter to remove f0 and lower harmonics from the signal

Two level Dynamic Programming

How does Pre-emphasis mitigate finite precision effects?

Constraints in Dynamic Time Warping for Speech

Understanding Baum's auxiliary function used in Hidden Markov Model

Advantage and Disadvantage using Mel Spectrograms over STFT in speech/waveform synthesis

In "Mel spectrogram" or "Mel filterbanks", what does Mel mean and why is it often capitalized? [closed]

How do I eliminate background noise?