technicianted/libmsspeech

libmsspeech

Library for Microsoft Cognitive Services speech recognition. For more details about usage, take a look at my blog post.

This is the very first version that works. Do not use it in any serious application yet!

Prerequisites

Building

autoreconf --force --install
./configure
make

Using

Start by running exampleProgram to learn how to use the library:

Usage: exampleProgram [OPTION...] <key> <language>
  -d        Produce debug output.
  -f FILE   Audio input file, stdin if omitted.
  -m MODE   Recognition mode: {interactive|dictation|conversation}. Default is interactive.
  -p MODE   Set profanity handling mode {raw|masked|removed}. Default is masked.
  -t        Request detailed recognition output.
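If you wrap exampleProgram in a script, it can help to check the -m and -p values before starting a recognition session. The following is a minimal sketch, not part of libmsspeech: the function names is_valid_mode, is_valid_profanity and build_args are hypothetical, and the accepted values and defaults are taken only from the usage text above.

```shell
# Hypothetical helpers, not part of libmsspeech. They validate -m and -p
# values against the sets listed in the usage text.
is_valid_mode() {
    case "$1" in
        interactive|dictation|conversation) return 0 ;;
        *) return 1 ;;
    esac
}

is_valid_profanity() {
    case "$1" in
        raw|masked|removed) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the option string, falling back to the documented defaults
# (interactive, masked) when an argument is omitted.
build_args() {
    mode="${1:-interactive}"
    profanity="${2:-masked}"
    is_valid_mode "$mode" || { echo "invalid mode: $mode" >&2; return 1; }
    is_valid_profanity "$profanity" || { echo "invalid profanity mode: $profanity" >&2; return 1; }
    echo "-m $mode -p $profanity"
}
```

A script could then call, for example, `./exampleProgram $(build_args dictation raw) <your subscription key> en-us`.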

To recognize a file:

exampleProgram -f <path to wav> -m interactive <your subscription key> en-us 
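The streaming examples in this README both feed 16 kHz, mono, signed 16-bit PCM. Assuming file input expects the same format (an assumption, not something this README states), a pre-flight check on the WAV header might look like the sketch below. The helper names wav_field and is_16k_mono_s16 are hypothetical, and the sketch assumes a canonical 44-byte PCM WAV header and a little-endian host.

```shell
# Hypothetical pre-flight check, not part of libmsspeech.
# Assumes a canonical 44-byte PCM WAV header and a little-endian host,
# because od reads multi-byte integers in host byte order.
wav_field() {
    # $1 = file, $2 = byte offset, $3 = field width in bytes (2 or 4)
    od -An -t "u$3" -j "$2" -N "$3" "$1" | tr -d '[:space:]'
}

is_16k_mono_s16() {
    [ "$(wav_field "$1" 22 2)" = "1" ] &&      # channel count
    [ "$(wav_field "$1" 24 4)" = "16000" ] &&  # sample rate in Hz
    [ "$(wav_field "$1" 34 2)" = "16" ]        # bits per sample
}
```

For example, `is_16k_mono_s16 speech.wav || echo "convert the file first" >&2` would flag a file that needs resampling before it is passed with -f.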

On Linux, you can stream audio directly from the microphone using arecord from the Debian alsa-utils package:

arecord -c 1 -r 16000 -f S16_LE | ./exampleProgram -m interactive <your subscription key> en-us 

or perform long dictation on Steve Jobs' Stanford University commencement speech:

curl -L -s https://archive.org/download/SteveJobsSpeechAtStanfordUniversity/SteveJobsSpeech_64kb.mp3 | \
  mpg123 -w - -m -r 16000 -e s16 - | \
  ./exampleProgram -m dictation <your subscription key> en-us

More explanation and details on how to use the library can be found in this blog post.