11

I've been following this answer to use ffmpeg to convert and play some of my Audible audiobooks in Linux Mint. Each book is a single source file, but I've noticed that ffmpeg lists all the chapters at the start of conversion.

Is there a way to get ffmpeg to split the book into chapters, i.e. convert each chapter into a separate file? Preferably with ffmpeg alone, but using other programs/scripts (together with ffmpeg) is also an option...

(I've seen a few other answers about splitting DVDs into chunks of even lengths or into chapters (using ffmpeg and a Python script), but that's not quite what I'm after, so I'm hoping there is a simpler way of doing it...)

5 Answers

12

I've been doing exactly this myself recently. As Nemo commented above, ffprobe easily gives you JSON output with the chapter start and end times using the command...

ffprobe -i fileName -print_format json -show_chapters

If you add -sexagesimal to the command, it produces slightly more human-readable output, IMO, and the output can be redirected to a file for later processing.

FFmpeg needs a little help, so I also used jq and AtomicParsley: the former to parse the JSON file, the latter to add images and metadata to the resultant m4b file.

The script also supports output as an m4a file, or conversion to mp3 as required. Simply call it with the parameters $1 (input file) and, optionally, $2 (output type, which defaults to m4b).

Using that as a basis I created the following script...

    #!/bin/bash
    # Script to convert m4b (audiobook) files with embedded chapters (e.g. converted
    # from Audible) into individual chapter files.
    # Required: ffmpeg; jq (JSON interpreter) and AtomicParsley (to embed pictures
    # and add additional metadata to m4a/m4b AAC files)

    # discover the file type (extension) of the input file
    ext=${1##*.}
    echo "extension: $ext"

    # all files / folders are named based on the "shortname" of the input file
    shortname=$(basename "$1" ".$ext")
    picture=$shortname.jpg
    chapterdata=$shortname.dat
    metadata=$shortname.tmp
    echo "shortname: $shortname"

    # if an output type has been given on the command line, set parameters
    # (used in the ffmpeg command later)
    if [[ $2 = "mp3" ]]; then
      outputtype="mp3"
      codec="libmp3lame"
    elif [[ $2 = "m4a" ]]; then
      outputtype="m4a"
      codec="copy"
    else
      outputtype="m4b"
      codec="copy"
    fi
    echo "outputtype: |$outputtype|"

    # if it doesn't already exist, create a JSON file containing the chapter breaks
    # (you can edit this file if you want chapters to be named rather than simply
    # "Chapter 1", etc. that Audible uses)
    [ ! -e "$chapterdata" ] && ffprobe -loglevel error \
      -i "$1" -print_format json -show_chapters -sexagesimal \
      >"$chapterdata"
    read -p "Now edit the file $chapterdata if required. Press ENTER to continue."
    # comment out above if you don't want the script to pause!

    # read the chapters into arrays for later processing
    # (process substitution keeps one array element per line in every bash version)
    readarray -t id < <(jq -r '.chapters[].id' "$chapterdata")
    readarray -t start < <(jq -r '.chapters[].start_time' "$chapterdata")
    readarray -t end < <(jq -r '.chapters[].end_time' "$chapterdata")
    readarray -t title < <(jq -r '.chapters[].tags.title' "$chapterdata")

    # create an ffmpeg metadata file to extract additional metadata lost in
    # splitting files - deleted afterwards
    ffmpeg -loglevel error -i "$1" -f ffmetadata "$metadata"
    artist_sort=$(grep -m 1 ^sort_artist "$metadata" | sed 's/.*=\(.*\)/\1/')
    album_sort=$(grep -m 1 ^sort_album "$metadata" | sed 's/.*=\(.*\)/\1/')
    rm "$metadata"

    # create directory for the output
    mkdir -p "$shortname"
    echo -e "\fID\tStart Time\tEnd Time\tTitle\t\tFilename"
    for i in "${!id[@]}"; do
      trackno=$((i + 1))
      # set the name for output - currently in format <bookname>/<track number>
      outname="$shortname/$(printf "%02d" $trackno). $shortname - ${title[$i]}.$outputtype"
      #outname=$(sed -e 's/[^A-Za-z0-9._- ]/_/g' <<< $outname)
      outname=$(sed 's/:/_/g' <<< "$outname")
      echo -e "${id[$i]}\t${start[$i]}\t${end[$i]}\t${title[$i]}\n\t\t$(basename "$outname")"
      ffmpeg -loglevel error -i "$1" -vn -c $codec \
        -ss "${start[$i]}" -to "${end[$i]}" \
        -metadata title="${title[$i]}" \
        -metadata track=$trackno \
        -map_metadata 0 -id3v2_version 3 \
        "$outname"
      [[ $outputtype == m4* ]] && AtomicParsley "$outname" \
        --artwork "$picture" --overWrite \
        --sortOrder artist "$artist_sort" \
        --sortOrder album "$album_sort" \
        > /dev/null
    done

If desired you can edit the JSON file (.dat file) as Audible files just name the chapters "Chapter 1", "Chapter 2", etc.

For example, initially the first part of the file might read...

    {
        "chapters": [
            {
                "id": 0,
                "time_base": "1/1000",
                "start": 0,
                "start_time": "0:00:00.000000",
                "end": 3206908,
                "end_time": "0:53:26.908000",
                "tags": {
                    "title": "Chapter 1"
                }
            },

Simply changing the relevant line to "title": "Introduction" changes the title (and file name) of the resulting split file.
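As a sanity check on the sample above: the raw "end" value (3206908, counted in units of the 1/1000 "time_base") and the -sexagesimal "end_time" (0:53:26.908000) describe the same position. A minimal sketch of that conversion, assuming awk is available; to_seconds is a hypothetical helper name, not part of ffmpeg or ffprobe:

```shell
#!/bin/bash
# to_seconds is a hypothetical helper (not an ffmpeg tool): it converts an
# H:MM:SS.ffffff timestamp, as printed with -sexagesimal, to plain seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, p, ":")
        printf "%.3f\n", p[1] * 3600 + p[2] * 60 + p[3]
    }'
}

# end_time of the sample chapter: 53 min 26.908 s
to_seconds "0:53:26.908000"   # prints 3206.908, i.e. 3206908 * 1/1000
```

ffmpeg accepts either notation for -ss and -to, so the script does not need this conversion; it is only useful if you want to do arithmetic on the times yourself.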

6

You can use this minimal approach; it depends on ffmpeg (along with ffprobe) as well as jq:

    #!/bin/bash
    # Description: Split an
    # Requires: ffmpeg, jq
    # Author: Hasan Arous
    # License: MIT

    in="$1"
    out="$2"
    splits=""
    while read start end title; do
      splits="$splits -c copy -ss $start -to $end $out/$title.m4b"
    done <<<$(ffprobe -i "$in" -print_format json -show_chapters \
      | jq -r '.chapters[] | .start_time + " " + .end_time + " " + (.tags.title | sub(" "; "_"))')

    ffmpeg -i "$in" $splits

https://gist.github.com/aularon/c48173f8246fa57e9c1ef7ff694ab06f

3

Note that Aularon's answer works fine, but there is a bug in it: the script will fail if a chapter title contains more than one space, because jq's sub only replaces the first match. An additional line with sed fixes this:

    #!/bin/bash
    # Description: Split an
    # Requires: ffmpeg, jq
    # Author: Hasan Arous
    # License: MIT
    # https://unix.stackexchange.com/questions/499179/using-ffmpeg-to-split-an-audible-audio-book-into-chapters

    in="$1"
    out="$2"
    splits=""
    while read start end title; do
      newTitle=$(echo "$title" | sed "s/ /_/g")
      splits="$splits -c copy -ss $start -to $end $out/$newTitle.m4b"
    done <<<$(ffprobe -i "$in" -print_format json -show_chapters \
      | jq -r '.chapters[] | .start_time + " " + .end_time + " " + (.tags.title | sub(" "; "_"))')

    ffmpeg -i "$in" $splits

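If you would rather keep the spaces in the file names, another option is to collect the ffmpeg arguments in a bash array instead of a string, so each file name stays a single argument. This is only a sketch of the idea, not part of the original script: add_chapter and the sample times and titles are made up for illustration.

```shell
#!/bin/bash
# add_chapter is a hypothetical helper: it appends the output options for one
# chapter to the global array "splits", keeping the file name as one argument.
add_chapter() {
    local start=$1 end=$2 outfile=$3
    splits+=(-c copy -ss "$start" -to "$end" "$outfile")
}

splits=()
# Made-up sample chapters; in the real script these come from ffprobe and jq.
add_chapter 0.000000 3206.908000 "out/Chapter 1.m4b"
add_chapter 3206.908000 6100.500000 "out/Chapter 2 - The Return.m4b"

# Print one argument per line to show that spaces stay inside a single argument.
printf '%s\n' "${splits[@]}"

# The actual call would then be:
#   ffmpeg -i "$in" "${splits[@]}"
```

Quoting "${splits[@]}" expands to exactly one word per array element, which is why no renaming with sed is needed in this variant.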
2

You can use ffprobe to get the chapter start and end times with the command...

ffprobe -i fileName -print_format json -show_chapters 

You can then use ffmpeg to split at the start and end times...

ffmpeg -i fileName -ss start -to end outFile 

Be sure not to use "-t"; that needs a duration to convert. "-ss" and "-to" are time positions in the file.
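To make the difference concrete: with both options placed after -i as in the command above, "-ss start -to end" should select the same range as "-ss start -t (end - start)". A small sketch that prints both forms; duration_between is a hypothetical helper and the times are made-up sample values:

```shell
#!/bin/bash
# duration_between is a hypothetical helper: given start and end positions in
# seconds, it prints the duration that -t would need to cover the same range.
duration_between() {
    awk -v s="$1" -v e="$2" 'BEGIN { printf "%.3f\n", e - s }'
}

# Made-up chapter boundaries, in seconds.
start=3206.908
end=6100.500

# The two commands below are only printed, not executed.
echo "ffmpeg -i fileName -ss $start -to $end outFile"
echo "ffmpeg -i fileName -ss $start -t $(duration_between "$start" "$end") outFile"
```

Since ffprobe already reports start and end positions, "-to" saves you the subtraction.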

You'll have to script it to have it done automagically.

0

I ran into a small bug. If I try to convert an MP3 audiobook (speech.mp3) with this script without parameters, I get a lot of empty m4b files (one for each chapter, each with a size of 0).

I made a few changes:

    #!/bin/bash
    # Script to convert m4b (audiobook) files with embedded chapters (e.g. converted
    # from Audible) into individual chapter files.
    # Required: ffmpeg; jq (JSON interpreter) and AtomicParsley (to embed pictures
    # and add additional metadata to m4a/m4b AAC files)

    # discover the file type (extension) of the input file
    ext=${1##*.}
    echo "extension: $ext"

    # all files / folders are named based on the "shortname" of the input file
    shortname=$(basename "$1" ".$ext")
    picture=$shortname.jpg
    chapterdata=$shortname.dat
    metadata=$shortname.tmp
    echo "shortname: $shortname"

    # force mp3 output when the input file is an mp3
    extension="${1##*.}"
    forcemp3=0
    if [ "$extension" == "mp3" ]; then
      forcemp3=1
    fi

    # if an output type has been given on the command line, set parameters
    # (used in the ffmpeg command later)
    if [[ $2 = "mp3" || $forcemp3 = 1 ]]; then
      outputtype="mp3"
      codec="libmp3lame"
      echo mp3
    elif [[ $2 = "m4a" ]]; then
      outputtype="m4a"
      codec="copy"
    else
      outputtype="m4b"
      codec="copy"
    fi
    echo "outputtype: |$outputtype|"

    # if it doesn't already exist, create a JSON file containing the chapter breaks
    # (you can edit this file if you want chapters to be named rather than simply
    # "Chapter 1", etc. that Audible uses)
    [ ! -e "$chapterdata" ] && ffprobe -loglevel error \
      -i "$1" -print_format json -show_chapters -sexagesimal \
      >"$chapterdata"
    read -p "Now edit the file $chapterdata if required. Press ENTER to continue."
    # comment out above if you don't want the script to pause!

    # read the chapters into arrays for later processing
    # (process substitution keeps one array element per line in every bash version)
    readarray -t id < <(jq -r '.chapters[].id' "$chapterdata")
    readarray -t start < <(jq -r '.chapters[].start_time' "$chapterdata")
    readarray -t end < <(jq -r '.chapters[].end_time' "$chapterdata")
    readarray -t title < <(jq -r '.chapters[].tags.title' "$chapterdata")

    # create an ffmpeg metadata file to extract additional metadata lost in
    # splitting files - deleted afterwards
    ffmpeg -loglevel error -i "$1" -f ffmetadata "$metadata"
    artist_sort=$(grep -m 1 ^sort_artist "$metadata" | sed 's/.*=\(.*\)/\1/')
    album_sort=$(grep -m 1 ^sort_album "$metadata" | sed 's/.*=\(.*\)/\1/')
    rm "$metadata"

    # create directory for the output
    mkdir -p "$shortname"
    echo -e "\fID\tStart Time\tEnd Time\tTitle\t\tFilename"
    for i in "${!id[@]}"; do
      trackno=$((i + 1))
      # set the name for output - currently in format <bookname>/<track number>
      outname="$shortname/$(printf "%02d" $trackno). $shortname - ${title[$i]}.$outputtype"
      #outname=$(sed -e 's/[^A-Za-z0-9._- ]/_/g' <<< $outname)
      outname=$(sed 's/:/_/g' <<< "$outname")
      echo -e "${id[$i]}\t${start[$i]}\t${end[$i]}\t${title[$i]}\n\t\t$(basename "$outname")"
      ffmpeg -loglevel error -i "$1" -vn -c $codec \
        -ss "${start[$i]}" -to "${end[$i]}" \
        -metadata title="${title[$i]}" \
        -metadata track=$trackno \
        -map_metadata 0 -id3v2_version 3 \
        "$outname"
      [[ $outputtype == m4* ]] && AtomicParsley "$outname" \
        --artwork "$picture" --overWrite \
        --sortOrder artist "$artist_sort" \
        --sortOrder album "$album_sort" \
        > /dev/null
    done

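
The only functional change is in how the output type and codec are chosen: an input file ending in .mp3 now forces mp3 output, even without a second parameter. Roughly, the selection logic can be sketched on its own like this (pick_output is a hypothetical helper name used only for this illustration; it does not call ffmpeg):

```shell
#!/bin/bash
# pick_output is a hypothetical helper that mirrors the selection logic of the
# script above: it prints "<outputtype> <codec>" for an input file name and an
# optional requested output type.
pick_output() {
    local input=$1 requested=$2
    local ext=${input##*.}
    if [[ $requested == mp3 || $ext == mp3 ]]; then
        echo "mp3 libmp3lame"   # mp3 requested, or forced by an mp3 input file
    elif [[ $requested == m4a ]]; then
        echo "m4a copy"
    else
        echo "m4b copy"         # default when nothing else applies
    fi
}

pick_output "speech.mp3"       # prints: mp3 libmp3lame
pick_output "book.m4b"         # prints: m4b copy
pick_output "book.m4b" m4a     # prints: m4a copy
```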
