$\begingroup$

I want to mix a loopback stream with microphone audio in real time and use the mixed result for speech recognition.

The project targets Windows, so I am using WASAPI, and I would like to know what the best option is in my case. I currently have two streams that I need to feed to a mixer, sending the result to the ASR model.

Both streams are in the same format. How does professional audio software handle this real-time synchronization internally? I'd appreciate any answers!

$\endgroup$
  • $\begingroup$ Hi and welcome. Can you elaborate on what exactly needs to be synced and why? If this is about clock management, the Windows kernel mixer can handle that. $\endgroup$ Commented Jul 19, 2024 at 13:17
  • $\begingroup$ Hi, thanks! I want to mix two streams in real time: one from the microphone (IAudioCaptureClient) and one from a specific process (ActivateAudioInterfaceAsync). I need to do this because they are not aligned (data from one stream arrives faster than from the other). I also want to know what "pattern" is used in these situations. I have two circular buffers, one for each stream, and two threads running (mic capture and loopback capture); how should I trigger a third thread to mix those streams so that there is no noticeable delay? What is the common approach in general? $\endgroup$ Commented Jul 19, 2024 at 18:08
  • $\begingroup$ Sorry, I still don't get it. A block diagram with signal flow and clock domains would help here. Are you saying the speech is on both the microphone and the loopback channel and needs to be time-aligned? $\endgroup$ Commented Jul 20, 2024 at 12:27
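
For illustration, the two-buffer pattern described in the comments might be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not how any particular audio application does it: `SampleFifo` and `mix_available` are hypothetical names, the samples are assumed to be 16-bit mono PCM at the same sample rate, and no WASAPI calls are shown. Each capture thread would push into its own FIFO, and the mixing thread would pull only as many samples as both FIFOs currently hold, so the faster stream simply waits in its buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical fixed-capacity FIFO of 16-bit mono samples, guarded by a mutex.
// Assumption: one producer (a capture thread) and one consumer (the mixer).
class SampleFifo {
public:
    explicit SampleFifo(std::size_t capacity) : buf_(capacity) {}

    // Producer side: append up to n samples; returns how many actually fit.
    std::size_t push(const int16_t* src, std::size_t n) {
        std::lock_guard<std::mutex> lock(m_);
        n = std::min(n, buf_.size() - count_);
        for (std::size_t i = 0; i < n; ++i)
            buf_[(head_ + count_ + i) % buf_.size()] = src[i];
        count_ += n;
        return n;
    }

    // Consumer side: remove up to n samples; returns how many were read.
    std::size_t pop(int16_t* dst, std::size_t n) {
        std::lock_guard<std::mutex> lock(m_);
        n = std::min(n, count_);
        if (n == 0) return 0;
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = buf_[(head_ + i) % buf_.size()];
        head_ = (head_ + n) % buf_.size();
        count_ -= n;
        return n;
    }

    // Number of samples currently buffered.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_);
        return count_;
    }

private:
    mutable std::mutex m_;
    std::vector<int16_t> buf_;
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

// Mix as many samples as BOTH FIFOs currently hold. Samples are summed in
// 32-bit and clamped to the int16_t range so that loud passages saturate
// instead of wrapping around.
std::vector<int16_t> mix_available(SampleFifo& mic, SampleFifo& loopback) {
    // Safe with a single consumer: producers can only increase the sizes.
    const std::size_t n = std::min(mic.size(), loopback.size());
    std::vector<int16_t> a(n), b(n), out(n);
    mic.pop(a.data(), n);
    loopback.pop(b.data(), n);
    for (std::size_t i = 0; i < n; ++i) {
        const int32_t sum = int32_t(a[i]) + int32_t(b[i]);
        out[i] = static_cast<int16_t>(
            std::clamp<int32_t>(sum, INT16_MIN, INT16_MAX));
    }
    return out;
}
```

In this sketch the mixing thread could call `mix_available` whenever either capture thread signals new data (for example through a condition variable, which is not shown) and pass the returned block to the ASR model. The sketch does not compensate for clock drift between the two capture sources; if the streams drift apart over time, one of them would additionally need resampling or occasional padding/dropping.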
