How to programmatically train the SpeechRecognitionEngine and convert audio file to text in C# or vb.net

Question

Is it possible to programmatically train the recognizer by giving it .wav files instead of talking into a microphone?

If so, how can it be done? Currently I have code that performs recognition on the audio in a 0.wav file and writes the recognized text to the console:

    Imports System.IO
    Imports System.Speech.Recognition
    Imports System.Speech.AudioFormat

    Namespace SampleRecognition
        Class Program
            Shared completed As Boolean

            Public Shared Sub Main(ByVal args As String())
                Using recognizer As New SpeechRecognitionEngine()
                    Dim dictation As Grammar = New DictationGrammar()
                    dictation.Name = "Dictation Grammar"
                    recognizer.LoadGrammar(dictation)

                    ' Configure the input to the recognizer.
                    recognizer.SetInputToWaveFile("C:\Users\ME\v02\0.wav")

                    ' Attach event handlers for the results of recognition.
                    AddHandler recognizer.SpeechRecognized, AddressOf recognizer_SpeechRecognized
                    AddHandler recognizer.RecognizeCompleted, AddressOf recognizer_RecognizeCompleted

                    ' Perform recognition on the entire file.
                    Console.WriteLine("Starting asynchronous recognition...")
                    completed = False
                    recognizer.RecognizeAsync()

                    ' Keep the console window open.
                    While Not completed
                        Console.ReadLine()
                    End While
                    Console.WriteLine("Done.")
                End Using

                Console.WriteLine()
                Console.WriteLine("Press any key to exit...")
                Console.ReadKey()
            End Sub

            ' Handle the SpeechRecognized event.
            Private Shared Sub recognizer_SpeechRecognized(ByVal sender As Object, ByVal e As SpeechRecognizedEventArgs)
                If e.Result IsNot Nothing AndAlso e.Result.Text IsNot Nothing Then
                    Console.WriteLine(" Recognized text = {0}", e.Result.Text)
                Else
                    Console.WriteLine(" Recognized text not available.")
                End If
            End Sub

            ' Handle the RecognizeCompleted event.
            Private Shared Sub recognizer_RecognizeCompleted(ByVal sender As Object, ByVal e As RecognizeCompletedEventArgs)
                If e.[Error] IsNot Nothing Then
                    Console.WriteLine(" Error encountered, {0}: {1}", e.[Error].[GetType]().Name, e.[Error].Message)
                End If
                If e.Cancelled Then
                    Console.WriteLine(" Operation cancelled.")
                End If
                If e.InputStreamEnded Then
                    Console.WriteLine(" End of stream encountered.")
                End If
                completed = True
            End Sub
        End Class
    End Namespace

EDIT

I understand that the Training wizard can be used for this. It is opened via Start button -> Control Panel -> Ease of Access -> Speech Recognition.

How can I custom-train the speech recognizer with my own .wav or even .mp3 files?

When the Training wizard (the Control Panel training UI) is used, the training files are stored in {AppData}\Local\Microsoft\Speech\Files\TrainingAudio.

How can I build and use custom training instead of going through the Training wizard?

The Speech Control Panel creates registry entries for the training audio files in the key HKCU\Software\Microsoft\Speech\RecoProfiles\Tokens\{ProfileGUID}\{00000000-0000-0000-0000-0000000000000000}\Files

Do the registry entries created by code have to be placed in there?

The reason for doing this is that I want to custom-train with my own .wav files and a list of words and phrases, then transfer everything to other systems.
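For reference, here is a minimal sketch for inspecting what the Training wizard has already stored, assuming the folder and registry locations quoted above. The module and procedure names (TrainingDataInspector, ListTrainingAudioFiles, ListTrainingRegistryEntries) are hypothetical. Rather than hard-coding either GUID, the code walks every subkey of each profile token and looks for a "Files" key beneath it; that layout is an assumption based on the key path above, not something confirmed by documentation.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.Win32

Module TrainingDataInspector

    ' Registry path quoted in the question, relative to HKEY_CURRENT_USER.
    Private Const ProfilesKeyPath As String = "Software\Microsoft\Speech\RecoProfiles\Tokens"

    Sub Main()
        ListTrainingAudioFiles()
        ListTrainingRegistryEntries()
    End Sub

    ' Lists the files in {AppData}\Local\Microsoft\Speech\Files\TrainingAudio.
    Private Sub ListTrainingAudioFiles()
        Dim localAppData As String =
            Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData)
        Dim folder As String =
            Path.Combine(localAppData, "Microsoft\Speech\Files\TrainingAudio")

        If Not Directory.Exists(folder) Then
            Console.WriteLine("No training audio folder at {0}", folder)
            Return
        End If

        For Each audioPath As String In Directory.GetFiles(folder)
            Console.WriteLine("Audio file: {0}", audioPath)
        Next
    End Sub

    ' Walks every profile token and prints the values under any "Files" subkey.
    ' Assumed layout: Tokens\{profile}\{subkey}\Files
    Private Sub ListTrainingRegistryEntries()
        Using profiles As RegistryKey = Registry.CurrentUser.OpenSubKey(ProfilesKeyPath)
            If profiles Is Nothing Then
                Console.WriteLine("No speech profiles found under HKCU\{0}", ProfilesKeyPath)
                Return
            End If

            For Each profileName As String In profiles.GetSubKeyNames()
                Using profileKey As RegistryKey = profiles.OpenSubKey(profileName)
                    If profileKey Is Nothing Then Continue For

                    For Each childName As String In profileKey.GetSubKeyNames()
                        Using filesKey As RegistryKey = profileKey.OpenSubKey(childName & "\Files")
                            If filesKey Is Nothing Then Continue For

                            For Each valueName As String In filesKey.GetValueNames()
                                Console.WriteLine("{0}\{1}\Files: {2} = {3}",
                                                  profileName, childName, valueName,
                                                  filesKey.GetValue(valueName))
                            Next
                        End Using
                    Next
                End Using
            Next
        End Using
    End Sub

End Module
```

The sketch only reads; it opens every key without write access, so running it should not alter an existing profile. Comparing its output before and after a Training wizard session is one way to see which entries the wizard adds.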

Community · Accepted Answer · 2017-05-23 12:08:31Z

It's certainly possible to train SAPI using C#. You can use the SpeechLib wrappers around SAPI to access the training mode APIs from C#. @Eric Brown described the procedure as follows:

1. Create an inproc recognizer & bind the appropriate audio input.
2. Ensure that you're retaining the audio for your recognitions; you'll need it later.
3. Create a grammar containing the text to train.
4. Set the grammar's state to pause the recognizer when a recognition occurs. (This helps with training from an audio file, as well.)
5. When a recognition occurs:
   1. Get the recognized text and the retained audio.
   2. Create a stream object using CoCreateInstance(CLSID_SpStream).
   3. Create a training audio file using ISpRecognizer::GetObjectToken and ISpObjectToken::GetStorageFileName, and bind it to the stream (using ISpStream::BindToFile).
   4. Copy the retained audio into the stream object.
   5. QI the stream object for the ISpTranscript interface, and use ISpTranscript::AppendTranscript to add the recognized text to the stream.
   6. Update the grammar for the next utterance, resume the recognizer, and repeat until you're out of training text.

Another option is to train SAPI once to get the desired output, then retrieve the profiles in code and transport them to other systems. The GetProfiles method, described and demonstrated below, returns an ISpeechObjectTokens object:

The GetProfiles method returns a selection of the available user speech profiles. Profiles are stored in the speech configuration database as a series of tokens, with each token representing one profile. GetProfiles retrieves all available profile tokens. The returned list is an ISpeechObjectTokens object. Additional or more detailed information about the tokens is available in methods associated with ISpeechObjectTokens. The token search may be further refined using the RequiredAttributes and OptionalAttributes search attributes. Only tokens matching the specified RequiredAttributes search attributes are returned. Of those tokens matching the RequiredAttributes key, tokens are listed in the order in which they match OptionalAttributes. If no search attributes are offered, all tokens are returned. If no profiles match the criteria, GetProfiles returns an empty selection, that is, an ISpeechObjectTokens collection with an ISpeechObjectTokens::Count property of zero. See Object Tokens and Registry Settings White Paper for a list of SAPI 5-defined attributes.

    Public SharedRecognizer As SpSharedRecognizer
    Public theRecognizers As ISpeechObjectTokens

    Private Sub Command1_Click()
        On Error GoTo EH

        Dim currentProfile As SpObjectToken
        Dim i As Integer
        Dim T As String
        Dim TokenObject As ISpeechObjectToken

        Set currentProfile = SharedRecognizer.Profile

        For i = 0 To theRecognizers.Count - 1
            Set TokenObject = theRecognizers.Item(i)

            If TokenObject.Id <> currentProfile.Id Then
                Set SharedRecognizer.Profile = TokenObject
                T = "New Profile installed: "
                T = T & SharedRecognizer.Profile.GetDescription
                Exit For
            Else
                T = "No new profile has been installed."
            End If
        Next i

        MsgBox T, vbInformation

    EH:
        If Err.Number Then ShowErrMsg
    End Sub

    Private Sub Form_Load()
        On Error GoTo EH

        Const NL = vbNewLine
        Dim i, idPosition As Long
        Dim T As String
        Dim TokenObject As SpObjectToken

        Set SharedRecognizer = CreateObject("SAPI.SpSharedRecognizer")
        Set theRecognizers = SharedRecognizer.GetProfiles

        For i = 0 To theRecognizers.Count - 1
            Set TokenObject = theRecognizers.Item(i)
            T = T & TokenObject.GetDescription & "--" & NL & NL
            idPosition = InStrRev(TokenObject.Id, "\")
            T = T & Mid(TokenObject.Id, idPosition + 1) & NL
        Next i

        MsgBox T, vbInformation

    EH:
        If Err.Number Then ShowErrMsg
    End Sub

    Private Sub ShowErrMsg()
        ' Declare identifiers:
        Dim T As String
        T = "Desc: " & Err.Description & vbNewLine
        T = T & "Err #: " & Err.Number
        MsgBox T, vbExclamation, "Run-Time Error"
        End
    End Sub

Community · Accepted Answer · 2017-05-23 11:51:41Z

You can generate custom training data using the native SAPI engine (not the managed API).

Here's a link on how to do it (though a bit vague)
