
I am using FFmpeg to stream from a webcam and a pulseaudio source to an RTMP server.

I know that argument order has an effect in FFmpeg.

But I have found that if I specify the audio input stream before the video input stream then the audio is delayed, about half a second behind the video.

Since these are just two input streams combined together for the output, why does the order have an effect?

I have stripped down the commands below to simplify this post, and I have tested them in this form. My real command uses hardware acceleration, AAC and various other codec options, but the effect of the input ordering is always the same.

FFmpeg command specifying video input first (no delay):

ffmpeg -f v4l2 -input_format mjpeg -framerate 30 -video_size 1280x720 -i /dev/video1 -f pulse -i default -c:v libx264 -preset veryfast -f flv rtmp://a.rtmp.youtube.com/live2/${STREAM_KEY} 

FFmpeg command specifying audio input first (audio 0.5 seconds behind video):

ffmpeg -f pulse -i default -f v4l2 -input_format mjpeg -framerate 30 -video_size 1280x720 -i /dev/video1 -c:v libx264 -preset veryfast -f flv rtmp://a.rtmp.youtube.com/live2/${STREAM_KEY} 

The console messages from FFmpeg appear to be the same in both cases, apart from the stream order.

Output when video input is first:

Input #0, video4linux2,v4l2, from '/dev/video1':
  Duration: N/A, start: 331644.817465, bitrate: N/A
    Stream #0:0: Video: mjpeg (Baseline), yuvj422p(pc, bt470bg/unknown/unknown), 1280x720, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
Guessed Channel Layout for Input Stream #1.0 : stereo
Input #1, pulse, from 'default':
  Duration: N/A, start: 1596371796.728130, bitrate: 1536 kb/s
    Stream #1:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (pcm_s16le (native) -> mp3 (libmp3lame))

Output when audio input is first:

Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, pulse, from 'default':
  Duration: N/A, start: 1596371788.496242, bitrate: 1536 kb/s
    Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Input #1, video4linux2,v4l2, from '/dev/video1':
  Duration: N/A, start: 331637.326454, bitrate: N/A
    Stream #1:0: Video: mjpeg (Baseline), yuvj422p(pc, bt470bg/unknown/unknown), 1280x720, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
Stream mapping:
  Stream #1:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
  Stream #0:0 -> #0:1 (pcm_s16le (native) -> mp3 (libmp3lame))

As you can see, the stream mapping is correct in each case.

What's going on? Any insights appreciated.

FFmpeg is version n4.3.1 compiled from git, on Ubuntu 20.04.

  • Very clear question. I recommend you take this to one of the ffmpeg mailing lists. Commented Aug 2, 2020 at 16:08
  • Thanks for the recommendation, I have done so: ffmpeg-archive.org/… Commented Aug 7, 2020 at 13:55

1 Answer

I just ran into this - couldn't figure out why my video was always 3s behind, no matter what options I used. Then I noticed the start times:

Input #0, x11grab, from ':10.0':
  Duration: N/A, start: 1627448930.994615, bitrate: N/A
    Stream #0:0: Video: rawvideo (BGR[0] / 0x524742), bgr0, 1920x1080, 25 fps, 25.08 tbr, 1000k tbn, 1000k tbc
Input #1, pulse, from 'grab.monitor':
  Duration: N/A, start: 1627448933.416430, bitrate: 1536 kb/s
    Stream #1:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s

and once I switched:

Input #0, pulse, from 'grab.monitor':
  Duration: N/A, start: 1627449055.508778, bitrate: 1536 kb/s
    Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Input #1, x11grab, from ':10.0':
  Duration: N/A, start: 1627449055.550277, bitrate: N/A
    Stream #1:0: Video: rawvideo (BGR[0] / 0x524742), bgr0, 1920x1080, 25 fps, 24.92 tbr, 1000k tbn, 1000k tbc

no more issues. I think that when you are grabbing from sources that don't carry their own timestamps, ffmpeg records input 0's start time, initializes input 0, then records input 1's start time and initializes input 1. In my case it then delayed my video by 3 seconds to try to align the two. I would just keep whichever one inits faster first.
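
To check this on your own setup, you can pull the start times out of the ffmpeg log and subtract them. The following is only a rough sketch: the function names `input_start_times` and `start_gap` are made up for illustration, the two log strings are the start lines from my runs above, and the subtraction is only meaningful if both inputs report their start on the same clock (in the question's logs the v4l2 and pulse values are clearly on different clocks, so there the result would tell you nothing).

```python
import re

# Matches "Input #N, ..." followed by the first "start: <seconds>" after it.
# This relies on the layout ffmpeg printed in the logs above; other versions
# may format the lines differently (assumption).
START_RE = re.compile(r"Input #(\d+),.*?start:\s*([0-9]+(?:\.[0-9]+)?)", re.DOTALL)


def input_start_times(log_text):
    """Return {input_index: start_time_in_seconds} parsed from an ffmpeg log."""
    return {int(idx): float(start) for idx, start in START_RE.findall(log_text)}


def start_gap(log_text, first=0, second=1):
    """Return start(second) - start(first) in seconds.

    Only meaningful when both inputs use the same clock for their timestamps.
    """
    starts = input_start_times(log_text)
    return starts[second] - starts[first]


# Start lines copied from the two runs shown in this answer.
VIDEO_FIRST_LOG = """\
Input #0, x11grab, from ':10.0':
  Duration: N/A, start: 1627448930.994615, bitrate: N/A
Input #1, pulse, from 'grab.monitor':
  Duration: N/A, start: 1627448933.416430, bitrate: 1536 kb/s
"""

AUDIO_FIRST_LOG = """\
Input #0, pulse, from 'grab.monitor':
  Duration: N/A, start: 1627449055.508778, bitrate: 1536 kb/s
Input #1, x11grab, from ':10.0':
  Duration: N/A, start: 1627449055.550277, bitrate: N/A
"""

if __name__ == "__main__":
    # Video first: roughly 2.42 s between the two start times.
    print(f"video first: {start_gap(VIDEO_FIRST_LOG):.2f} s")
    # Audio first: roughly 0.04 s between the two start times.
    print(f"audio first: {start_gap(AUDIO_FIRST_LOG):.2f} s")
```

With my logs the gap drops from about 2.4 s to about 0.04 s once the pulse input comes first, which lines up with the delay going away.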

  • This is very interesting, but in my case the start times of the two streams appear to be completely unrelated to each other. I'm not sure why. Commented Apr 18, 2022 at 13:32
