I want to mix a loopback stream with microphone audio in real time and use the mixed result for speech recognition.
The project targets Windows, so I use WASAPI, and I would like to know what the best option is in my case. I currently have two streams that I need to feed to a mixer, and then send the result to the ASR model.
Both streams are in the same format. How is this real-time synchronization done internally in professional audio software? I appreciate any answers!
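
To make the mixing step concrete, here is a minimal sketch of what I have in mind. It assumes both streams have already been captured into buffers of 32-bit float samples in the range [-1, 1]; the WASAPI capture itself is left out, and `mix_buffers` is a hypothetical helper name, not a WASAPI call.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of WASAPI): mixes two float PCM buffers
// that share the same format, sample by sample.
std::vector<float> mix_buffers(const std::vector<float>& loopback,
                               const std::vector<float>& mic) {
    // Mix only as many samples as both buffers contain.
    const std::size_t n = std::min(loopback.size(), mic.size());
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Sum the two samples, then clamp to [-1, 1] so the result stays in range.
        const float sum = loopback[i] + mic[i];
        out[i] = std::max(-1.0f, std::min(1.0f, sum));
    }
    return out;
}
```

This only covers the summing itself. It does not align the two capture streams in time, which is the part I am asking about.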