Aurora is the enterprise end-to-end speech solution. This Python SDK will allow you to quickly and easily use the Aurora service to integrate voice capabilities into your application.
The SDK is currently in a pre-alpha release phase. Bugs and limited functionality should be expected.
The Recommended Python version is 3.0+
The Python SDK currently does not bundle the necessary system headers and binaries to interact with audio hardware in a cross-platform manner. For this reason, before using the SDK, you need to install PortAudio:
$ brew install portaudio $ pip install auroraapi Note that you may need to use pip2 if you are using python2 or pip3 if you are using python3.
$ sudo apt-get install python-pyaudio $ pip install auroraapi This will install PyAudio as well as its dependencies (PortAudio). Use yum if your distribution uses RPM-based packages. If your distribution does not have PortAudio in its repository, install PortAudio via source and then install pyaudio using pip.
First, make sure you have an account with Aurora and have created an Application.
# Import the package import auroraapi as aurora from auroraapi.text import Text # Set your application settings aurora.config.app_id = "YOUR_APP_ID" # put your app ID here aurora.config.app_token = "YOUR_APP_TOKEN" # put your app token here # query the TTS service speech = Text("Hello world").speech() # play the resulting audio speech.audio.play() # or save it to a file speech.audio.write_to_file("test.wav")# Import the package import auroraapi as aurora from auroraapi.audio import AudioFile from auroraapi.speech import Speech # Set your application settings aurora.config.app_id = "YOUR_APP_ID" # put your app ID here aurora.config.app_token = "YOUR_APP_TOKEN" # put your app token here # open an existing WAV file (16-bit, mono, 16KHz WAV PCM) with open("test.wav", "rb") as f: a = AudioFile(f.read()) p = Speech(a).text() print(p.text) # 'hello world'from auroraapi.text import Text from auroraapi.speech import Speech # Call the TTS API to convert "Hello world" to speech speech = Text("Hello world").speech() # Previous API returned a Speech object, so we can just call # the text() method to get a prediction p = speech.text() print(p.text) # 'hello world'from auroraapi.speech import listen # Listen for 3 seconds (silence is automatically trimmed) speech = listen(length=3) # Convert to text p = speech.text() print(p.text) # prints the predictionCalling this API will start listening and will automatically stop listening after a certain amount of silence (default is 1.0 seconds).
from auroraapi.speech import listen # Start listening until 1.0s of silence speech = listen() # Or specify your own silence timeout (0.5 seconds shown here) speech = listen(silence_len=0.5) # Convert to text p = speech.text() print(p.text) # prints the predictionContinuously listen and retrieve speech segments. Note: you can do anything with these speech segments, but here we'll convert them to text. Just like the previous example, these segments are demarcated by silence (1.0 second by default) and can be changed by passing the silence_len parameter. Additionally, you can make these segments fixed length (as in the example before the previous) by setting the length parameter.
from auroraapi.speech import continuously_listen # Continuously listen and convert to speech (blocking example) for speech in continuously_listen(): p = speech.text() print(p.text) # Reduce the amount of silence in between speech segments for speech in continuously_listen(silence_len=0.5): p = speech.text() print(p.text) # Fixed-length speech segments of 3 seconds for speech in continuously_listen(length=3.0): p = speech.text() print(p.text)If you already know that you wanted the recorded speech to be converted to text, you can do it in one step, reducing the amount of code you need to write and also reducing latency. Using the listen_and_transcribe method, the audio that is recorded automatically starts uploading as soon as you call the method and transcription begins. When the audio recording ends, you get back the final transcription.
from auroraapi.speech import listen_and_transcribe, continuously_listen_and_transcribe text = listen_and_transcribe(silence_len=0.5): print("You said: {}".format(text.text)) # You can also use this in the same way as `continuously_listen` for text in continuously_listen_and_transcribe(silence_len=0.5): print("You said: {}".format(text.text))from auroraapi.speech import continuously_listen_and_transcribe for text in continuously_listen_and_transcribe(): text.speech().audio.play()The interpret service allows you to take any Aurora Text object and understand the user's intent and extract additional query information. Interpret can only be called on Text objects and return Interpret objects after completion. To convert a user's speech into and Interpret object, it must be converted to text first.
from auroraapi.text import Text # create a Text object text = Text("what is the time in los angeles") # call the interpret service. This returns an `Interpret` object. i = text.interpret() # get the user's intent print(i.intent) # time # get any additional information print(i.entities) # { "location": "los angeles" }from auroraapi.text import Text while True: # Repeatedly ask the user to enter a command user_text = raw_input("Enter a command:") if user_text == "quit": break # Interpret and print the results i = Text(user_text).interpret() print(i.intent, i.entities)This example shows how easy it is to voice-enable a smart lamp. It responds to queries in the form of "turn on the lights" or "turn off the lamp". You define what object you're listening for (so that you can ignore queries like "turn on the music").
from auroraapi.speech import continuously_listen valid_words = ["light", "lights", "lamp"] valid_entities = lambda d: "object" in d and d["object"] in valid_words for speech in continuously_listen(silence_len=0.5): i = speech.text().interpret() if i.intent == "turn_on" and valid_entities(i.entities): # do something to actually turn on the lamp print("Turning on the lamp") elif i.intent == "turn_off" and valid_entities(i.entities): # do something to actually turn off the lamp print("Turning off the lamp")You can create a high-level outline of a conversation using the Dialog Builder, available in the Aurora Dashboard. Once you have create the dialog, its ID will be available to you in the "Conversation Details" sidebar (under "Conversation ID"). You can use this ID to create and run the entire conversation in just a few lines of code:
# Import the package import auroraapi as aurora from auroraapi.dialog import Dialog # Set your application settings aurora.config.app_id = "YOUR_APP_ID" # put your app ID here aurora.config.app_token = "YOUR_APP_TOKEN" # put your app token here # Create the Dialog with the ID from the Dialog Builder dialog = Dialog("DIALOG_ID") # Run the dialog dialog.run()If you've used a UDF (User-Defined Function) in the Dialog Builder, you can write the corresponding function and register it to the dialog using the set_function function. The UDF must take one argument, which is the dialog context. You can use it to retrieve data from other steps in the dialog and set custom data for future use (both in UDFs and in the builder).
If the UDF in the dialog builder has branching enabled, then you can return True or False to control which branch is taken.
from auroraapi.dialog import Dialog def udf(context): # get data for a particular step data = context.get_step_data("step_id") # set some custom data context.set_user_data("id", "some data value") # return True to take the upward branch in the dialog builder return True dialog = Dialog("DIALOG_ID") dialog.set_function("udf_id", udf) dialog.run()Here, step_id is the ID of a step in the dialog builder and udf_id is the ID of the UDF you want to register a function for. id is an arbitrary string you can use to identify the data you are setting.
You can also set a function when creating the Dialog that lets you handle whenever the current step changes or context is changed.
from auroraapi.dialog import Dialog def handle_update(context): # this function is called whenever the current step is changed or # whenever the data in the context is updated # you can get the current dialog step like this step = context.get_current_step() print(step, context) dialog = Dialog("DIALOG_ID", on_context_update=handle_update) dialog.run()