
I am trying to concatenate MP3 files using ffmpeg, without much luck. I have 100 very short (0.05-2 second) MP3 samples of spoken letters that I want to combine into one file, so that I can jump to specific locations to play specific sounds rather than keeping an individual file for each sample. When I concatenate following those instructions, slight gaps appear to be inserted between the sounds: the further along the track I go, the further off the positions are.

For example, I took each individual MP3 file and calculated its duration using this. From the durations I compute the offsets at which to start and stop playing each sample in the combined MP3 file. But when I do that, later samples are off by more and more, which tells me a gap is being inserted between tracks. How can I make this process accurate? That is, how can I concatenate hundreds of MP3s without any gap, so that I can calculate exactly where in the track to start and stop playing?

For reference, I tried audio-joiner.com and there is even more of a gap placed between tracks.

Lance
  • 377

1 Answer


Concatenating MP3 files without gaps is not trivial, and it is impossible without re-encoding.

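The reason is that, like most lossy audio codecs, MP3 puts presamples and a setup gap at the start of every file: basically, the first milliseconds cannot contain audible content. Each fragment therefore contributes its own short gap, which is why the error grows the further you go along the combined track.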

Speech synthesis from prerecorded fragments usually uses either PCM or formats specially crafted to avoid the above problems. Since MP3 compression is very fast, I recommend keeping your fragments in raw PCM format: that way you can combine them by simply concatenating the files or streams, and then convert the combined result to whatever format you need.
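
To illustrate why raw PCM makes the offsets exact: a headerless PCM file has a fixed number of bytes per second, so each fragment's position follows directly from the file sizes. A minimal sketch, assuming 48 kHz, 16-bit, stereo fragments; the function name pcm_offsets and the file names in the usage note are made up for illustration:

```shell
#!/bin/sh
# Sketch: print "name start end" (in seconds) for raw PCM fragments that
# will be concatenated in the order given on the command line.
# Assumed format: 48 kHz, 16-bit signed, stereo => 48000 * 2 * 2 bytes/second.
RATE=48000
CHANNELS=2
BYTES_PER_SAMPLE=2

pcm_offsets() {
    bps=$((RATE * CHANNELS * BYTES_PER_SAMPLE))  # bytes per second of audio
    pos=0                                        # running position in bytes
    for f in "$@"; do
        size=$(($(wc -c < "$f")))                # fragment length in bytes
        end=$((pos + size))
        # The shell only does integer math, so awk performs the division.
        times=$(LC_ALL=C awk -v s="$pos" -v e="$end" -v bps="$bps" \
            'BEGIN { printf "%.6f %.6f", s / bps, e / bps }')
        printf '%s %s\n' "$f" "$times"
        pos=$end
    done
}
```

Calling pcm_offsets a.pcm b.pcm c.pcm prints one line per fragment with its start and end time in seconds, in the order the files would be concatenated.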

EDIT

As requested in the comments, here is a short how-to (assuming all WAV files are 48 kHz, 16-bit signed, stereo; if not, adapt accordingly):

Convert WAV to raw PCM (stream copy works here because the WAV already holds 16-bit signed PCM): ffmpeg -i input.wav -c:a copy -vn -dn -sn -f s16le output.pcm

Concatenate files (Linux shown, adapt if using Windows): cat first.pcm second.pcm third.pcm > temp.pcm

Convert result to MP3 (e.g. 192K, the codec name might vary with your build): ffmpeg -f s16le -ac 2 -ar 48000 -i temp.pcm -c:a mp3 -b:a 192K output.mp3
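
For 100 files the same three steps can be wrapped in a loop. This is only a sketch under some assumptions: the ffmpeg build includes the libmp3lame encoder, the target format is 48 kHz stereo, and build_combined_mp3 is a hypothetical name. It decodes with -c:a pcm_s16le instead of copy, so WAV files in other sample formats are converted as well:

```shell
#!/bin/sh
# Sketch: convert every WAV in a directory to raw PCM, join the raw data,
# and encode the result to MP3 exactly once.
# Assumptions: ffmpeg is on PATH and was built with libmp3lame.
build_combined_mp3() {
    src_dir=$1
    out_mp3=$2
    work=$(mktemp -d) || return 1
    : > "$work/combined.pcm"                     # start with an empty file
    for wav in "$src_dir"/*.wav; do
        [ -e "$wav" ] || continue                # no match: glob stays literal
        # Decode to headerless 16-bit PCM, forcing 48 kHz stereo.
        ffmpeg -nostdin -loglevel error -y -i "$wav" -vn \
            -c:a pcm_s16le -ar 48000 -ac 2 -f s16le "$work/part.pcm" \
            || { rm -rf "$work"; return 1; }
        cat "$work/part.pcm" >> "$work/combined.pcm"
    done
    # Single lossy encode of the joined audio.
    ffmpeg -nostdin -loglevel error -y -f s16le -ac 2 -ar 48000 \
        -i "$work/combined.pcm" -c:a libmp3lame -b:a 192k "$out_mp3"
    status=$?
    rm -rf "$work"
    return $status
}
```

The shell expands *.wav in sorted order, so name the files in the order you want them to appear, for example build_combined_mp3 ./samples letters.mp3. The one remaining encode still adds a single setup gap at the very start of the result, but unlike the per-file gaps it does not accumulate: a player that does not compensate for it sees every offset shifted by the same constant.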

Eugen Rieck
  • 20,271