Desktop audio falls behind when recording microphone + desktop audio + screen using ffmpeg

Question

I have put together this script for recording the microphone, the desktop audio and the screen using ffmpeg:

DATE=`which date`
RESO=2560x1440
FPS=30
PRESET=ultrafast
DIRECTORY=$HOME/Video/
FILENAME=videocast`$DATE +%d%m%Y_%H.%M.%S`.mkv
ffmpeg -y -vsync 1 \
-f pulse -ac 2 -i alsa_output.pci-0000_00_1b.0.analog-stereo.monitor \
-f pulse -ac 1 -ar 25000 -i alsa_input.usb-0d8c_C-Media_USB_Headphone_Set-00-Set.analog-mono \
-filter_complex aresample=async=1,amix=duration=shortest,apad \
-f x11grab -r $FPS -s $RESO -i :0.0 \
-acodec libvorbis \
-vcodec libx264 -pix_fmt yuv420p -preset $PRESET -threads 0 \
$DIRECTORY$FILENAME

Everything is recorded, and there are no issues whatsoever between the screen and the microphone sound. However, the desktop audio falls behind badly.

It begins in sync but gets worse over time during playback, in ffplay as well. It does not matter which application is playing sound: YouTube videos in the browser, desktop sounds and Rhythmbox (playing a couple of seconds of a song, then stopping, waiting and repeating) all get out of sync.

The terminal output complains about

"ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred22.73 bitrate=10384.5kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred"

and similar messages, but I do not know what they mean.

Full terminal output here:

ffmpeg version 2.0.1 Copyright (c) 2000-2013 the FFmpeg developers
  built on Aug 11 2013 14:52:28 with gcc 4.8.1 (GCC) 20130725 (prerelease)
  configuration: --prefix=/usr --disable-debug --disable-static --enable-avresample --enable-dxva2 --enable-fontconfig --enable-gpl --enable-libass --enable-libbluray --enable-libfreetype --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libv4l2 --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxvid --enable-pic --enable-postproc --enable-runtime-cpudetect --enable-shared --enable-swresample --enable-vdpau --enable-version3 --enable-x11grab
  libavutil 52. 38.100 / 52. 38.100
  libavcodec 55. 18.102 / 55. 18.102
  libavformat 55. 12.100 / 55. 12.100
  libavdevice 55. 3.100 / 55. 3.100
  libavfilter 3. 79.101 / 3. 79.101
  libavresample 1. 1. 0 / 1. 1. 0
  libswscale 2. 3.100 / 2. 3.100
  libswresample 0. 17.102 / 0. 17.102
  libpostproc 52. 3.100 / 52. 3.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, pulse, from 'alsa_output.pci-0000_00_1b.0.analog-stereo.monitor':
  Duration: N/A, start: 0.014093, bitrate: 1536 kb/s
    Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Guessed Channel Layout for Input Stream #1.0 : mono
Input #1, pulse, from 'alsa_input.usb-0d8c_C-Media_USB_Headphone_Set-00-Set.analog-mono':
  Duration: N/A, start: 0.006172, bitrate: 400 kb/s
    Stream #1:0: Audio: pcm_s16le, 25000 Hz, mono, s16, 400 kb/s
[x11grab @ 0x218a6e0] device: :0.0 -> display: :0.0 x: 0 y: 0 width: 2560 height: 1440
[x11grab @ 0x218a6e0] shared memory extension found
Input #2, x11grab, from ':0.0':
  Duration: N/A, start: 1379021580.184321, bitrate: N/A
    Stream #2:0: Video: rawvideo (BGR[0] / 0x524742), bgr0, 2560x1440, -2147483 kb/s, 30 tbr, 1000k tbn, 30 tbc
[libx264 @ 0x21ae560] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 0x21ae560] profile Constrained Baseline, level 5.0
[libx264 @ 0x21ae560] 264 - core 133 r2339 585324f - H.264/MPEG-4 AVC codec - Copyleft 2003-2013 - http://www.videolan.org/x264.html - options: cabac=0 ref=1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=0 keyint=250 keyint_min=25 scenecut=0 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=0
Output #0, matroska, to '/home/anders/Video/videocast12092013_23.33.00.mkv':
  Metadata:
    encoder : Lavf55.12.100
    Stream #0:0: Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 25000 Hz, mono, fltp
    Stream #0:1: Video: h264 (libx264) (H264 / 0x34363248), yuv420p, 2560x1440, q=-1--1, 1k tbn, 30 tbc
Stream mapping:
  Stream #0:0 (pcm_s16le) -> aresample (graph 0)
  Stream #1:0 (pcm_s16le) -> amix:input1 (graph 0)
  amix (graph 0) -> Stream #0:0 (libvorbis)
  Stream #2:0 -> #0:1 (rawvideo -> libx264)
Press [q] to stop, [?] for help
ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred22.73 bitrate=10384.5kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred3.22 bitrate=10423.3kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred25.25 bitrate=11011.0kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred5.76 bitrate=11013.7kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred27.25 bitrate=11175.4kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred7.76 bitrate=11168.7kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred8.24 bitrate=11176.4kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred55.48 bitrate=11243.8kbits/s
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred
ALSA lib pcm.c:7843:(snd_pcm_recover) underrun occurred
frame=12871 fps= 30 q=-1.0 Lsize= 542369kB time=00:07:09.31 bitrate=10349.3kbits/s
video:539762kB audio:2363kB subtitle:0 global headers:3kB muxing overhead 0.044476%
[libx264 @ 0x21ae560] frame I:52 Avg QP:15.46 size:725888
[libx264 @ 0x21ae560] frame P:12819 Avg QP:18.26 size: 40172
[libx264 @ 0x21ae560] mb I I16..4: 100.0% 0.0% 0.0%
[libx264 @ 0x21ae560] mb P I16..4: 2.6% 0.0% 0.0% P16..4: 18.1% 0.0% 0.0% 0.0% 0.0% skip:79.3%
[libx264 @ 0x21ae560] coded y,uvDC,uvAC intra: 57.8% 49.8% 25.3% inter: 8.9% 8.7% 2.2%
[libx264 @ 0x21ae560] i16 v,h,dc,p: 23% 29% 32% 16%
[libx264 @ 0x21ae560] i8c dc,h,v,p: 45% 28% 18% 9%
[libx264 @ 0x21ae560] kb/s:10306.26

Please help me, I am really close to getting this working!

UPDATE: The desktop audio is also out of sync when I skip filter_complex and the microphone, but to a lesser degree. Using copy instead of libvorbis does not change anything either.

A buffer underrun means, for example, that the sound card wanted data but there wasn't a full buffer ready. An overrun means the sound card wanted to write captured data but the capture buffer was full. Both happen when the system isn't fast enough to supply or pick up audio data, and they are often remedied by increasing the buffer size or buffer count.
– artm, Commented Oct 18, 2014 at 7:46
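
To get a feel for what "increasing buffer size" means in time terms, here is a minimal sketch that converts a buffer size in bytes into milliseconds of audio. The function name `buffer_ms` and the 19200-byte figure are made up for illustration; the 48000 Hz, stereo, 16-bit format is the one the log above reports for the monitor input. Newer ffmpeg documentation describes a `fragment_size` option (in bytes) for the pulse input, but whether the 2.0.1 build in the question supports it is not established here.

```shell
#!/bin/sh
# Convert a buffer size in bytes to milliseconds of audio.
# Arguments: bytes, sample rate (Hz), channels, bytes per sample.
buffer_ms() {
    bytes=$1 rate=$2 channels=$3 bytes_per_sample=$4
    echo $(( bytes * 1000 / (rate * channels * bytes_per_sample) ))
}

# Hypothetical 19200-byte buffer at 48000 Hz, stereo, s16le (2 bytes per sample):
buffer_ms 19200 48000 2 2   # prints 100
```

A larger buffer gives the system more time to pick up data before an overrun, at the cost of added latency.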

Blackle Mori · Accepted Answer · 2014-10-18 06:49:00Z

Not sure if this will fix it for you, but I have a script that I haven't had problems with. Comparing our two scripts, the only differences I can see are:

my filter_complex is just amerge
I force the use of 4 threads
my audio codec is libmp3lame

I think the audio codec change is the most relevant difference. My guess is that some audio codecs get interleaved with the video somehow, so they can't get out of sync. Unfortunately I'm no video engineer, so I can't be sure.

Here is my script:

#!/usr/bin/bash

# video information
INRES="1920x1080"
OUTRES="1280x720"
FPS="24"
QUAL="fast"
FILE_OUT="$1"

#audio information
PULSE_IN="alsa_input.pci-0000_00_1b.0.analog-stereo"
PULSE_OUT="alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"

ffmpeg -f x11grab -s "$INRES" -r "$FPS" -i :0.0 \
    -f pulse -i "$PULSE_IN" -f pulse -i "$PULSE_OUT" \
    -filter_complex amerge \
    -vcodec libx264 -crf 30 -preset "$QUAL" -s "$OUTRES" \
    -acodec libmp3lame -ab 96k -ar 44100 -threads 4 -pix_fmt yuv420p \
    -f flv "$FILE_OUT"

I have been testing this script instead, and it seems to work just fine, with no delays. I too believe it has to do with the audio codec.
– madr, Commented Nov 8, 2014 at 12:52

Community · Accepted Answer · 2017-04-13 12:22:52Z

What could be happening is that the desktop sound is captured at the wrong sample rate. This may happen if PulseAudio reports the sample rate incorrectly.

After modifying the config file as suggested in the linked answer (uncomment the default-sample-rate setting in /etc/pulse/daemon.conf and set it to the correct value, probably 48000), you have to restart the user's PulseAudio daemon with:

pulseaudio -k
pulseaudio -D
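
To compare the configured rate with what the sources actually report, something like the following sketch may help. The helper names `configured_rate` and `source_rates` are made up for illustration, and the parsing assumes that `pactl list short sources` prints tab-separated fields with a sample spec such as `s16le 2ch 48000Hz` in the fourth field; check the output on your own system before relying on it.

```shell
#!/bin/sh
# Print the active (uncommented) default-sample-rate from a daemon.conf-style file.
# Lines starting with ';' or '#' are comments and do not match.
configured_rate() {
    sed -n 's/^[[:space:]]*default-sample-rate[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Read `pactl list short sources` output on stdin and print "<name> <rate>".
# Assumes tab-separated fields with the sample spec in field 4.
source_rates() {
    awk -F'\t' '{ n = split($4, spec, " "); rate = spec[n]; sub(/Hz$/, "", rate); print $2, rate }'
}

# On the recording machine (not run here):
#   configured_rate /etc/pulse/daemon.conf
#   pactl list short sources | source_rates

# Illustration with a made-up line in the assumed format:
printf '0\texample.monitor\tmodule-alsa-card.c\ts16le 2ch 48000Hz\tSUSPENDED\n' | source_rates
```

If `configured_rate` prints nothing, the setting is still commented out and the daemon falls back to its built-in default.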

The sample rate discrepancy is obvious if you play some music through the speakers so that it is captured both by the microphone and by the PulseAudio monitor. Not only would the monitor stream start later, but its pitch would drop too: a tenor would shift to baritone.
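
As a rough illustration of how such a mismatch accumulates, the sketch below computes how far audio falls behind when samples arriving at one rate are stamped with a lower one. The function name `drift_ms` and the 48000/44100 pair are assumptions chosen for the example, not values measured in this thread.

```shell
#!/bin/sh
# Milliseconds by which audio lags after `seconds` of recording when
# samples arrive at `actual` Hz but are treated as `assumed` Hz.
# Arguments: actual rate, assumed rate, recording length in seconds.
drift_ms() {
    actual=$1 assumed=$2 seconds=$3
    echo $(( seconds * 1000 * actual / assumed - seconds * 1000 ))
}

# Hypothetical case: 48000 Hz audio treated as 44100 Hz, one minute in:
drift_ms 48000 44100 60   # prints 5306
```

In that hypothetical case the audio is already more than five seconds late after a minute, and it plays back slower and therefore lower in pitch, which matches the "starts in sync, gets worse over time" symptom.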
