
Questions tagged [speech-to-text]

4 votes
2 answers
82 views

I'm training a speech recognition model using the NVIDIA NeMo framework. Results with just the small FastConformer model and two dozen iterations are pretty good; for my data I would say they are ...
comodoro
  • 143
1 vote
1 answer
2k views

I want to use OpenAI's Whisper to transcribe some speech files in English. I only care about minimizing the word error rate. How do medium.en, ...
Franck Dernoncourt
0 votes
1 answer
158 views

I want to try the DeepSpeech model. I found only an English pre-trained model. Are there any other pre-trained non-English models of ...
user3668129
1 vote
0 answers
38 views

I am attempting to analyze transcribed text from an audio file to group bullet points based on known key phrases in the text. Example: I have verbally stated the following keywords in the text, which ...
Ryan Watts
1 vote
1 answer
140 views

I've explored text-to-speech evaluation metrics, and they seem to use Mean Opinion Score (MOS) to evaluate a particular model. This metric requires humans to judge the model based on a scale ...
Nontawat Wutticome
1 vote
0 answers
60 views

I have a personal dataset of 10,000 audio files, each consisting of a single spoken sentence. Each file comes with a transcribed text label that I can use for supervised HMM training. Now ...
Zander
  • 11
2 votes
2 answers
414 views

I am dealing with a data set of transcribed call center data, where customers are recorded while interacting with an agent. The recordings are then automatically transcribed by an external transcription ...
miri_h_ds
2 votes
1 answer
174 views

I am curious how it is done, as I am interested in doing something similar. I have some manually transcribed data that contains tags for multiple speakers. I want to compare how well the out-of-the-box ...
Samarth
  • 359
