Fixing out-of-sync audio and video with ffmpeg

I recently used gtk-recordmydesktop to record a short video to explain to one of our clients how to do something complicated in the Drupal CMS. ("A picture's worth a thousand words...", and all that.)

Unfortunately, when I viewed it, I saw that the video and audio were out of sync by about 4 seconds, just enough to be really annoying and make it impossible to follow along.

The file I recorded is "out.ogg", and "ffmpeg -i out.ogg" shows that it's "off":

Input #0, ogg, from 'out.ogg':
Duration: 00:04:47.3, start: 4.266667, bitrate: 725 kb/s
Stream #0.0: Video: theora, yuv420p, 960x576, 15.00 fps(r)
Stream #0.1: Audio: vorbis, 16000 Hz, mono, 99 kb/s

I have no idea what that "start: 4.266667" means or how I got it, but that's pretty much exactly how far the audio lags behind the video all the way through.
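If you'd rather not copy that number by hand, here is a rough sketch of pulling it out of ffmpeg's summary. The function name "extract_start" is made up for this example, and it assumes the "Duration: ..., start: N, bitrate: ..." line looks like the one above; other ffmpeg versions may print it differently.

```shell
# Print the first "start:" value found in ffmpeg's input summary (read on stdin).
# Assumption: the line is shaped like "Duration: ..., start: 4.266667, bitrate: ...".
extract_start() {
  sed -n 's/.*start: \([0-9.]*\).*/\1/p' | head -n 1
}

# Typical use (ffmpeg prints the summary on stderr, hence the 2>&1):
#   ffmpeg -i out.ogg 2>&1 | extract_start
```

For the output shown above, this would print 4.266667.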

So I was looking at going back and deleting everything I had done, recording the video again, and hoping that it would be in sync this time. Then I decided that I should google this on the off-chance that someone already knew how to fix it. Sure enough, Google turned up this fabulous page. Thank you Howard Pritchett. I owe you one!

It turns out ffmpeg has a switch for doing exactly this: "-itsoffset". Here's the full command to fix my file (and turn it into a flash movie).

ffmpeg -i out.ogg -itsoffset 4.267 -i out.ogg -map 1:0 -map 0:1 -ar 22050 video.flv

A couple of things to notice.

First, we have to specify the same input file twice: one "copy" supplies the audio and the other supplies the video. We list it twice because we're asking ffmpeg to delay only one part of the file, the video stream, while the audio stream runs normally. The "-itsoffset" switch applies to the input that comes AFTER it, so "-itsoffset 4.267" tells ffmpeg to "delay" the second input file by 4.267 seconds.

Then we need to tell ffmpeg how to recombine these streams - audio and video - back into one file. That's the "-map" switch.

Remember what our "ffmpeg -i out.ogg" command showed us about the streams in the file:

Stream #0.0: Video: theora, yuv420p, 960x576, 15.00 fps(r)
Stream #0.1: Audio: vorbis, 16000 Hz, mono, 99 kb/s

Those are the video and audio streams in this file. If I list the same file twice, you'll see we get a second input:

$ ffmpeg -i out.ogg -i out.ogg
Input #0, ogg, from 'out.ogg':
Stream #0.0: Video: theora, yuv420p, 960x576, 15.00 fps(r)
Stream #0.1: Audio: vorbis, 16000 Hz, mono, 99 kb/s
...
Input #1, ogg, from 'out.ogg':
Stream #1.0: Video: theora, yuv420p, 960x576, 15.00 fps(r)
Stream #1.1: Audio: vorbis, 16000 Hz, mono, 99 kb/s

The stream numbering system is INPUT_FILE.STREAM. Ffmpeg starts numbering from 0, so the first stream in the first input file (0.0) is the video, and the second stream in the first input file (0.1) is the audio. The second input file has the same two streams in the same order, of course, but they are listed as 1.0 and 1.1 since they come from the second input file.
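To see that numbering at a glance, here is a small sketch that boils ffmpeg's summary down to one line per stream. The function name "list_streams" is invented for this example, and it assumes "Stream #..." lines shaped like the ones above. It prints each stream as INPUT:STREAM, the form the "-map" switch takes in the command above.

```shell
# List "INPUT:STREAM type" for every stream in ffmpeg's summary (read on stdin).
# Assumption: lines look like "Stream #0.1: Audio: vorbis, ..." (a "." or ":"
# between the two numbers is accepted).
list_streams() {
  sed -n 's/.*Stream #\([0-9]*\)[.:]\([0-9]*\)[^:]*: \([A-Za-z]*\):.*/\1:\2 \3/p'
}

# Typical use (ffmpeg prints the summary on stderr, hence the 2>&1):
#   ffmpeg -i out.ogg -i out.ogg 2>&1 | list_streams
```

For the two-input listing above, this would print "0:0 Video", "0:1 Audio", "1:0 Video" and "1:1 Audio", one per line.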

So we want ffmpeg to take the audio from the first input file and the video from the second, since we need to delay the video and we've applied the delay to the second input file. The "-map" switches should be listed in the order you want the streams to appear in the output file. A flash movie has two streams, video first and then audio (you can see this using "ffmpeg -i movie.flv"), so the first map switch specifies the video and the second specifies the audio.

ffmpeg -i out.ogg -itsoffset 4.267 -i out.ogg -map 1:0 -map 0:1 -ar 22050 video.flv

Here "-map 1:0" tells ffmpeg to use the second input file (1) and take its first stream (0) as the video source. Since that's the input that follows the "-itsoffset" switch, that's the stream that will be delayed. Then "-map 0:1" tells it to use the second stream from the first input as the audio. Since there's no offset applied to the first input file, it will be processed normally. (Finally, we change the audio sample rate to something suitable for a flash movie and output a file called "video.flv". Ffmpeg will see the ".flv" and apply the proper codecs for the audio and video to make a ".flv" file.)
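If your file is off in the other direction, the same trick works with the maps swapped, so that the audio comes from the delayed input instead. Here is a rough sketch that prints (but does not run) the command for either case. The helper "sync_cmd" and its argument order are made up for this example, and it assumes stream 0 is the video and stream 1 is the audio, as in my file.

```shell
# Print (not run) an ffmpeg command that delays one stream of a file.
# Hypothetical helper; arguments: INPUT DELAY video|audio OUTPUT [extra options...]
# Assumption: stream 0 is video and stream 1 is audio, as in out.ogg.
sync_cmd() {
  if [ "$#" -lt 4 ]; then
    echo "usage: sync_cmd INPUT DELAY video|audio OUTPUT [options...]" >&2
    return 1
  fi
  src=$1 delay=$2 which=$3 out=$4
  shift 4
  case "$which" in
    video) maps="-map 1:0 -map 0:1" ;;  # video from the delayed (second) input
    audio) maps="-map 0:0 -map 1:1" ;;  # audio from the delayed (second) input
    *) echo "third argument must be video or audio" >&2; return 1 ;;
  esac
  cmd="ffmpeg -i $src -itsoffset $delay -i $src $maps"
  # Any remaining arguments (such as -ar 22050) go just before the output file.
  if [ "$#" -gt 0 ]; then
    cmd="$cmd $*"
  fi
  echo "$cmd $out"
}

# Prints the exact command used in this post:
sync_cmd out.ogg 4.267 video video.flv -ar 22050
```

It only builds a string, so file names with spaces would need quoting before you paste the result into a terminal.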

Sure enough, it works beautifully. The video and audio are now in sync, and I have a nice flash movie to upload to our website where I can direct the client to watch it.

Check out the full web page at http://howto-pages.org/ffmpeg/ for all sorts of other wonderful things that you can do with ffmpeg.