Accuracy of MS System.Speech.Recognizer and the SpeechRecognitionEngine

Question

I am currently testing the SpeechRecognitionEngine by loading a pretty simple rule from an XML file. In fact it is a simple choice between ("decrypt the email", "remove encryption") or ("encrypt the email", "add encryption").

I have trained my Windows 7 PC and additionally added the words "encrypt" and "decrypt", as I realize they are very similar. The recognizer already has trouble telling these two apart.

The issue I am having is that it recognizes things too often. I have set the confidence threshold to 0.93 because my own voice, saying the exact words in a quiet room, sometimes only gets to 0.93. But if I turn on the radio, the voice of the announcer or a song can make the recognizer think it has heard the words "decrypt the email" with over 0.93 confidence.

Maybe Lady Gaga is backmasking Applause to secretly decrypt emails :-)

Can anyone help me work out how to make this recognizer workable?

In fact the recognizer is also picking up keyboard noise as "decrypt the email". I don't understand how this is possible.

As my editing buddy pointed out, there are at least two managed namespaces for MS Speech: Microsoft.Speech and System.Speech. It is important for this question to know that I am using System.Speech.

This is all rather normal. You didn't say anything about the microphone you used, it can be critical.
– Hans Passant, Commented Sep 16, 2013 at 12:14
I am using the mic from the Polycom cx100 polycom.com/products-services/products-for-microsoft/…. I trained the desktop engine and also did dictation on notepad of the words and my accuracy improved, but now it recognizes text when I am just typing.
– darbid, Commented Sep 16, 2013 at 14:37
Switch to a headset microphone. Speakerphones are notorious for picking up extraneous noise.
– Eric Brown, Commented Sep 17, 2013 at 5:31
OK, noted. This is a cool device, but I realize that whilst hands free is good for talking on the phone or communicator, it might not be so good for speech recognition.
– darbid, Commented Sep 17, 2013 at 12:32
@darbid - One of the fun things about SR is that engine confidence != accuracy. I.e., the engine can be very confident about a reco, but it will still be wrong. Conversely, the engine can have very low confidence in a reco, and it will still be correct. In practice, I never use the confidence values (aside from it being high enough to pass the rejection threshold).
– Eric Brown, Commented Sep 17, 2013 at 15:39

Eric Brown · Accepted Answer · 2013-09-17 05:30:45Z

If the only thing the System.Speech recognizer is listening for is "encrypt the email", then the recognizer will generate lots of false positives, particularly in a noisy environment. If you add a DictationGrammar (particularly a pronunciation grammar) in parallel, the DictationGrammar will pick up the noise, and you can check, for example, the name of the grammar in the event handler to discard the bogus recognitions.

A (subset) example:

    using System;
    using System.Globalization;
    using System.Speech.Recognition;

    static void Main(string[] args)
    {
        // The commands we actually want to hear.
        Choices gb = new Choices();
        gb.Add("encrypt the document");
        gb.Add("decrypt the document");
        Grammar commands = new Grammar(gb);
        commands.Name = "commands";

        // Pronunciation dictation grammar that soaks up noise and other speech.
        DictationGrammar dg = new DictationGrammar("grammar:dictation#pronunciation");
        dg.Name = "Random";

        using (SpeechRecognitionEngine recoEngine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            recoEngine.SetInputToDefaultAudioDevice();
            recoEngine.LoadGrammar(commands);
            recoEngine.LoadGrammar(dg);
            recoEngine.RecognizeCompleted += recoEngine_RecognizeCompleted;
            recoEngine.RecognizeAsync();
            Console.ReadKey(true);
            recoEngine.RecognizeAsyncStop();
        }
    }

    static void recoEngine_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
    {
        // e.Result is null if nothing was recognized; ignore matches from the noise grammar.
        if (e.Result != null && e.Result.Grammar.Name != "Random")
        {
            Console.WriteLine(e.Result.Text);
        }
    }
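For continuous listening, the same idea can be combined with the rejection threshold mentioned in the question and comments. What follows is only a minimal sketch, not a tested implementation: the `CommandFilter` class, the `ShouldAccept` helper and the 0.6 threshold are hypothetical names and values introduced for illustration, and the threshold would need tuning for a given microphone and room.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

static class CommandFilter
{
    // Hypothetical values for illustration; not part of System.Speech.
    public const string NoiseGrammarName = "Random";
    public const float MinConfidence = 0.6f;

    // Accept a result only if it did not come from the noise grammar
    // and its confidence clears the rejection threshold.
    public static bool ShouldAccept(string grammarName, float confidence)
    {
        return grammarName != NoiseGrammarName && confidence >= MinConfidence;
    }
}

class Program
{
    static void Main()
    {
        var commands = new Grammar(new GrammarBuilder(new Choices(
            "encrypt the document", "decrypt the document")));
        commands.Name = "commands";

        // Dictation grammar loaded in parallel to absorb noise and unrelated speech.
        var noise = new DictationGrammar("grammar:dictation#pronunciation");
        noise.Name = CommandFilter.NoiseGrammarName;

        using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            engine.SetInputToDefaultAudioDevice();
            engine.LoadGrammar(commands);
            engine.LoadGrammar(noise);

            // SpeechRecognized is raised once per recognized phrase.
            engine.SpeechRecognized += (sender, e) =>
            {
                if (CommandFilter.ShouldAccept(e.Result.Grammar.Name, e.Result.Confidence))
                {
                    Console.WriteLine("{0} ({1:F2})", e.Result.Text, e.Result.Confidence);
                }
            };

            // RecognizeMode.Multiple keeps listening until RecognizeAsyncStop is called.
            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.ReadKey(true);
            engine.RecognizeAsyncStop();
        }
    }
}
```

Keeping the accept/reject decision in a small pure function makes it easy to adjust without touching the audio code. A grammar loaded from an SRGS XML file (for example via the `Grammar(string path)` constructor) can be given a `Name` in the same way, so the grammar-name check should carry over to XML-based rules.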

Thank you so much for the suggestion. I am going to try this out and then get back to you, but it sounds like a great idea. I am using an XML file and a rule with many more words or phrases; "encrypt the document" was just one. But I still think your suggestion will work.
Yes, that vastly improves things. The recognizer is no longer guessing. To anyone coming here because of a similar situation: I think this is a must to get things working, or in my case, recognizing less. Thank you very much. I have come across your blog and will ask a new question on one of your articles.
I am using System.Speech.Recognition and followed your suggestion to reduce the false positives; it works great. However, now I am experiencing a TargetInvocationException after about 20 minutes of recognition. I want to try Microsoft.Speech.Recognition, but there is no DictationGrammar class. Is there an equivalent to DictationGrammar within Microsoft.Speech.Recognition?
@DiegoSahagun No. Microsoft.Speech.Recognition uses a different SR engine that does not support dictation.
Thanks @EricBrown, do you know if there is a way to reduce Microsoft.Speech.Recognition's false positives? Maybe I should create a question for my problem with the desktop version; I haven't been able to find anything related yet.
