Library for Microsoft Cognitive Services speech recognition. For more details about usage, take a look at my blog post.
This is the very first version that works. Do not use it in any serious application yet!
The library depends on:
- libwebsockets, v2.1-stable or later (for example v2.2-stable).
- json-c.
- libuuid.
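
Before building, it can help to confirm that the three libraries are visible to the toolchain. The following is a minimal sketch, assuming pkg-config is installed and that the libraries register under the module names libwebsockets, json-c and uuid (typical for distribution packages, but an assumption here). check_dep is a hypothetical helper, not part of this library.

```shell
#!/bin/sh
# check_dep MODULE [MIN_VERSION]
# Hypothetical helper: asks pkg-config whether MODULE is installed,
# optionally requiring at least MIN_VERSION, and prints one status line.
check_dep() {
    if [ -n "${2:-}" ]; then
        if pkg-config --atleast-version="$2" "$1" 2>/dev/null; then
            echo "ok: $1 >= $2"
        else
            echo "missing or too old: $1 (need >= $2)"
        fi
    else
        if pkg-config --exists "$1" 2>/dev/null; then
            echo "ok: $1"
        else
            echo "missing: $1"
        fi
    fi
}

# Module names below are assumptions; adjust them if your distribution differs.
check_dep libwebsockets 2.1
check_dep json-c
check_dep uuid
```

If pkg-config itself is not installed, every line reports the module as missing, so a full set of failures may point at pkg-config rather than at the libraries.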

Build with the usual autotools sequence:

    autoreconf --force --install
    ./configure
    make

Start by running exampleProgram to learn how to use the library:

    Usage: exampleProgram [OPTION...] <key> <language>

      -d         Produce debug output.
      -f FILE    Audio input file, stdin if omitted.
      -m MODE    Recognition mode: {interactive|dictation|conversation}.
                 Default is interactive.
      -p MODE    Set profanity handling mode {raw|masked|removed}.
                 Default is masked.
      -t         Request detailed recognition output.

To recognize a file:

    ./exampleProgram -f <path to wav> -m interactive <your subscription key> en-us

On Linux, you can stream audio directly from the microphone using arecord from the Debian alsa-utils package:

    arecord -c 1 -r 16000 -f S16_LE | ./exampleProgram -m interactive <your subscription key> en-us

Or perform long dictation on Steve Jobs' Stanford University commencement speech:

    curl -L -s https://archive.org/download/SteveJobsSpeechAtStanfordUniversity/SteveJobsSpeech_64kb.mp3 | \
        mpg123 -w - -m -r 16000 -e s16 - | \
        ./exampleProgram -m dictation <your subscription key> en-us

More explanation and details on how to use the library can be found in this blog post.
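
Both streaming examples above feed the program 16 kHz, mono, signed 16-bit audio. Assuming a file passed with -f should use the same format (this README does not say so explicitly), a quick check of the WAV header can catch mismatches before a request is sent. The sketch below uses only od and shell arithmetic. read_le and check_wav are hypothetical helpers, and the byte offsets assume a canonical 44-byte PCM header with the fmt chunk directly after the RIFF header.

```shell
#!/bin/sh
# read_le FILE OFFSET COUNT
# Print the little-endian unsigned integer stored in COUNT bytes at OFFSET.
read_le() {
    value=0
    shift_bits=0
    for byte in $(od -An -v -t u1 -j "$2" -N "$3" "$1"); do
        value=$((value + (byte << shift_bits)))
        shift_bits=$((shift_bits + 8))
    done
    echo "$value"
}

# check_wav FILE
# Compare the header fields against mono, 16000 Hz, 16-bit.
# Offsets 22, 24 and 34 are only valid for a canonical PCM WAV header.
check_wav() {
    channels=$(read_le "$1" 22 2)
    rate=$(read_le "$1" 24 4)
    bits=$(read_le "$1" 34 2)
    if [ "$channels" -eq 1 ] && [ "$rate" -eq 16000 ] && [ "$bits" -eq 16 ]; then
        echo "ok: mono, 16000 Hz, 16-bit"
    else
        echo "unexpected format: channels=$channels rate=$rate bits=$bits"
    fi
}

# Run the check only when a file path is given on the command line.
if [ "$#" -ge 1 ]; then
    check_wav "$1"
fi
```

Files with extra chunks before the fmt chunk will report misleading values, so treat an "unexpected format" line as a prompt to inspect the file rather than as proof that it is unusable.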