My initial thought was to upload audio files to YouTube along with video inspired by the audio. The visualization can take different forms, such as a spectrum, a spectrogram, or other visualizations that change with the audio. I'm not familiar with all the capabilities of ffmpeg or sox, but I wonder if I can do something like this out of the box, or as a series of scripts with other command-line utilities.
4 Answers
Audio visualization with ffmpeg

ffmpeg -i input.mp3 -filter_complex \
"[0:a]avectorscope=s=640x518,pad=1280:720[vs]; \
[0:a]showspectrum=mode=separate:color=intensity:scale=cbrt:s=640x518[ss]; \
[0:a]showwaves=s=1280x202:mode=line[sw]; \
[vs][ss]overlay=w[bg]; \
[bg][sw]overlay=0:H-h,drawtext=fontfile=/usr/share/fonts/TTF/Vera.ttf:fontcolor=white:x=10:y=10:text='\"Song Title\" by Artist'[out]" \
-map "[out]" -map 0:a -c:v libx264 -preset fast -crf 18 -c:a copy output.mkv

ffmpeg can use several filters to visualize audio: avectorscope, showspectrum, and showwaves. You can then place them where you want with overlay, and then add text with drawtext.
In the example above the audio is stream copied (re-muxed) instead of being re-encoded.
From FFmpeg Wiki: How to Encode Videos for YouTube and other Video Sharing Sites.
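The full command above needs a real MP3 and a font file. As a minimal, self-contained sanity check of the same three filters, you can run them against a tone generated with the lavfi sine source and drop the drawtext step so no font is needed. This is a sketch, assuming ffmpeg (with libx264) is on PATH; the filenames and tone parameters are my own, not from the answer:

```shell
# Generate a 3-second 440 Hz test tone (WAV, so no MP3 encoder is required)
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=3" tone.wav

# Same layout as the answer: vectorscope top-left, spectrum top-right,
# waveform along the bottom; drawtext omitted so no font file is needed
ffmpeg -y -i tone.wav -filter_complex \
"[0:a]avectorscope=s=640x518,pad=1280:720[vs]; \
[0:a]showspectrum=mode=separate:color=intensity:scale=cbrt:s=640x518[ss]; \
[0:a]showwaves=s=1280x202:mode=line[sw]; \
[vs][ss]overlay=w[bg]; \
[bg][sw]overlay=0:H-h[out]" \
-map "[out]" -map 0:a -c:v libx264 -preset fast -crf 18 -c:a copy viz.mkv
```

Matroska output is used here because the PCM audio in the test WAV can be stream copied into it; with an MP3 input, the original output.mkv command applies unchanged.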
- +1 for the link so I could search on ffmpeg showspectrum; the FFmpeg examples are too complicated for me. – Sun, Nov 27, 2014 at 22:09
- @sunk818 It just takes some practice. You can copy and paste the command and it will do as shown above. You may have to adjust the fontfile if you decide you want to add text too, or just remove the drawtext part. – llogan, Nov 27, 2014 at 23:06
- The fontfile gave me an error and I wasn't too interested in figuring out the syntax for Windows. – Sun, Nov 27, 2014 at 23:17
- Great answer, worked out of the box for me on Ubuntu 22. For those curious to see what it would look like in motion, I uploaded a video of this exact output: youtu.be/iXm3CKdDnd0 – rlittles, Jul 16, 2023 at 1:13
Here are some examples that take an audio file, run it through ffmpeg, and create a video based on some of the filters available in ffmpeg.
Examples:
spectrogram:
ffmpeg -i song.mp3 -filter_complex showspectrum=mode=separate:color=intensity:slide=1:scale=cbrt -y -acodec copy video.mp4

avectorscope:
ffmpeg -i song.mp3 -filter_complex avectorscope=s=320x240 -y -acodec copy video.mp4

zooming Mandelbrot:
ffmpeg -i song.mp3 -f lavfi -i mandelbrot=s=320x240 -y -acodec copy video.mp4
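The Mandelbrot example pairs a lavfi video source with the audio. A self-contained variant of the same pattern (a sketch; the generated tone and filenames are my own assumptions, and ffmpeg must be installed) adds -shortest, since the mandelbrot source is infinite and would otherwise never stop:

```shell
# Generate a short test tone to stand in for song.mp3
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=3" song.wav

# Pair the tone with the lavfi mandelbrot video source.
# -shortest stops encoding when the (shorter) audio ends.
ffmpeg -y -i song.wav -f lavfi -i mandelbrot=s=320x240 \
  -map 1:v -map 0:a -c:a copy -shortest mandelbrot.mkv
```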
- I get Codec not supported: VLC could not decode the format " " (No description for this codec) unless I change "mp4" to "mkv". But +1 anyway because these were helpful examples. – Ryan, Aug 7, 2018 at 19:18
- Another one is showwaves: ffmpeg -i input.mp3 -filter_complex showwaves=s=1280x202:mode=line -acodec copy video.mp4 – Flimm, Oct 27, 2020 at 12:11
- Likely missing a stream selector for the filter input. I get "Cannot find an unused audio input stream to feed the unlabeled input pad avectorscope:default". Fixed with -filter_complex "[0:a]avectorscope=s=320x240", same for the other examples. I don't know why the selector is required, since I think the automatic selection should work. – mins, Nov 23, 2025 at 17:10
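The last comment's fix (an explicit [0:a] input label on the filter) can be checked directly. A minimal sketch, assuming ffmpeg is installed; the generated tone, labels, and filenames are mine:

```shell
# Stand-in input
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=3" song.wav

# Explicit [0:a] input label and [v] output label, as the comment suggests;
# the labeled output is then selected with -map
ffmpeg -y -i song.wav -filter_complex "[0:a]avectorscope=s=320x240[v]" \
  -map "[v]" -map 0:a -c:a copy scope.mkv
```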
I use this:
ffmpeg -y -i audio.mp3 -loop 1 -i image.jpg -filter_complex "[1:v]crop=640:480:0:0,setsar=1[img]; [0:a]showwaves=mode=line:s=hd480:[email protected]|[email protected]:scale=sqrt,format=yuva420p[waves]; [img][waves]overlay=format=auto,drawtext=text='${NAME}':[email protected]:fontsize=30:font=Arvo:x=(w-text_w)/5:y=(h-text_h)/5[out]" -map "[out]" -map 0:a -pix_fmt yuv420p -b:a 360k -r:a 44100 -c:v libx264 -q:v 23 -preset ultrafast -c:a copy -shortest out.mkv

It's a "standing wave" effect on top of an image with overlaid text (e.g. the track name).
So I take a JPG image from Unsplash and put it in the folder as "image.jpg". Then I take audio.mp3 and combine it with the wave effect into a 480p video. I guess you can adjust 480p to HD.
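The image-plus-waveform overlay can be tried without downloading anything by generating both inputs with lavfi and dropping the drawtext step (so no font or ${NAME} variable is needed). This is a sketch under those assumptions; the sizes, colors, and filenames here are mine, not the answer's:

```shell
# Stand-in inputs: a 3-second tone and a single test-pattern frame as the image
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=3" audio.wav
ffmpeg -y -f lavfi -i "testsrc2=s=640x480:d=1" -frames:v 1 image.jpg

# Waveform rendered with an alpha channel, then overlaid on the looped image;
# -shortest ends the output with the audio, since the looped image is infinite
ffmpeg -y -i audio.wav -loop 1 -i image.jpg -filter_complex \
"[1:v]setsar=1[img]; \
 [0:a]showwaves=mode=line:s=640x480:colors=white:scale=sqrt,format=yuva420p[waves]; \
 [img][waves]overlay=format=auto[out]" \
-map "[out]" -map 0:a -pix_fmt yuv420p -c:v libx264 -preset ultrafast -c:a copy -shortest waves_on_image.mkv
```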
Spectrum of an audio stream shown as a video stream
To be typed as one line:
ffmpeg -i "a/sample-2.mp3" -filter_complex "[0:a]showspectrum=slide=scroll:mode=combined:color=channel:fscale=log:scale=sqrt:legend=1[v]" -y -acodec copy -map [v] -map 0:a "out/spectrum.mp4"
and adjust filenames.
Here is the meaning of each parameter, so that they can be customized as required:
- -i "a/sample-2.mp3": The input file name.
- -filter_complex: Create a filtergraph with more than one input stream and more than one output stream.
- [0:a]: Select audio stream(s) in the input file as input(s) to the filtergraph.
- showspectrum: Filter used to perform a Fourier transform and get the spectrum from the samples. Many of the Fourier transform parameters can be changed; here the defaults are used, in particular the window function is hann and the overlap is 0. All parameters are described here. The other parameters below are only related to the display of the Fourier transform.
- slide=scroll: The spectrum is shown as a graph sliding from right to left.
- mode=combined: L/R channels combined into a single row.
- color=channel: Red/green colors and white for the center.
- fscale=log: Frequency scale is logarithmic, as usual for musical scales.
- scale=sqrt: Scale for sound intensity (on the right) is quadratic, as usual for power (could be log for dB).
- legend=1: Enable legend display (scales).
- [v]: Give a name to the output of the filtergraph, for later reuse.
- -y: Overwrite the output file if it exists.
- -acodec copy: Copy audio stream(s) as-is into the output file (no processing).
- -map [v] -map 0:a: Add the filtergraph output [v] (spectrogram video) and the audio stream(s) in the input file as output file content.
- "out/spectrum.mp4": File for output. The muxer and the encoders to use are guessed from the file extension (in the present case audio streams are copied without processing, hence only video needs to be encoded).
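The command above can be wrapped in a small script with the input and output as variables, which makes the parameter list easier to experiment with. A sketch, assuming ffmpeg is installed; the generated test tone and filenames are my own (Matroska is used for the output here because the PCM test audio cannot be stream copied into MP4):

```shell
#!/bin/sh
set -e

IN="sample.wav"
OUT="spectrum.mkv"

# Stand-in for a/sample-2.mp3: a 3-second 440 Hz tone
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=3" "$IN"

# Scrolling spectrogram, exactly the filter options explained above
ffmpeg -y -i "$IN" -filter_complex \
  "[0:a]showspectrum=slide=scroll:mode=combined:color=channel:fscale=log:scale=sqrt:legend=1[v]" \
  -acodec copy -map "[v]" -map 0:a "$OUT"
```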
Similar filters
Similar to showspectrum but static: showspectrumpic.
showcqt: Convert input audio to a video output representing the frequency spectrum logarithmically, using the Brown–Puckette constant-Q transform algorithm. Similar to the Fourier transform. By default shows a sliding spectrum with the names of notes (here a Dm7 chord).
showfreqs: Convert input audio to video output representing the audio power spectrum. Audio amplitude is on Y-axis while frequency is on X-axis. Similar to showspectrum, but not sliding.
showspatial: Convert stereo input audio to a video output, representing the spatial relationship between two channels.
showvolume: Convert input audio volume to a video output (a VU meter).
ahistogram: Convert input audio to a video output, displaying the volume histogram.
showwaves: Convert input audio to a video output, representing the sample waves. Similar to showvolume. Also similar but static: showwavespic.
a3dscope: Convert input audio to a 3D scope video output. Similar to histogram. Also similar: abitscope. What the 32 bit bars represent is still a mystery.
avectorscope: Convert input audio to a video output, representing the audio vector scope.
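To compare the filters in this list side by side, a small loop can render one short clip per filter from the same generated tone, using each filter's default options. A sketch, assuming ffmpeg (with libx264) is installed; the filter subset and filenames are mine (showspatial is omitted because it needs stereo input, and the tone here is mono):

```shell
# One shared test input for all filters
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=2" tone.wav

# Render one clip per visualization filter, default options
for f in showspectrum showcqt showfreqs showvolume ahistogram showwaves avectorscope; do
  ffmpeg -y -i tone.wav -filter_complex "[0:a]${f}[v]" \
    -map "[v]" -map 0:a -c:a copy "${f}.mkv"
done
```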







