1. Overview

Merging videos can be helpful when presenting or editing. Depending on the context, a different type of merge can be used.

In this tutorial, we’ll learn how to join two videos side by side into a single video. For that purpose, we’ll use FFmpeg’s hstack and overlay filters inside a complex filtergraph. Then, we’ll merge the audio streams of the videos. Finally, we’ll add enhancements such as padding and borders to the video streams.

2. Stitching Videos

In FFmpeg, we can use the -filter_complex option to place videos side by side. Specifically, -filter_complex lets us define a filtergraph with multiple inputs and outputs, so we can process several audio and video streams in a single command.

We’ll look at two filters that can stitch videos together: hstack and overlay. Either one can produce the same side-by-side result.

For this example, we’ve chosen two small videos that are 10 seconds long and have the same resolution.

2.1. hstack

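hstack stands for horizontal stack and enables us to stitch two or more videos horizontally, so the output displays them side by side. The filter requires all inputs to have the same height (and pixel format), which is why we picked sample videos with the same resolution.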

As an example, let’s use the hstack filter to stitch two videos:

$ ffmpeg -i v1.mp4 -i v2.mp4 -filter_complex "[0:v][1:v]hstack[out]" -map "[out]" output.mp4

This command combines the video streams of two input files horizontally with the hstack filter, creating a new output file.

The -i option specifies the input files, in this case, v1.mp4 and v2.mp4. The -map option determines which streams should be included in the output file. Since we map only the [out] stream here, the output contains no audio.

Let’s break down the filters specified with the -filter_complex option:

  • [0:v] and [1:v] refer to the video streams of the first and second inputs, respectively (input indices start at 0)
  • hstack specifies the hstack filter that combines these video streams horizontally
  • [out] is the label we assign to the output video stream

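Running the command produces output.mp4, which shows both videos side by side.

Notably, FFmpeg re-encodes the resulting video: filters operate on decoded frames, so we can’t stream-copy (-c copy) a stream that passes through a complex filtergraph.

Alternatively, since hstack expects two inputs by default and FFmpeg connects unlabeled filter inputs and outputs automatically, we can make the command simpler for two videos: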

$ ffmpeg -i v1.mp4 -i v2.mp4 -filter_complex hstack output.mp4

Furthermore, for more than two videos, we can use hstack=inputs=N, where N is the number of inputs:

$ ffmpeg -i v1.mp4 -i v2.mp4 -i v3.mp4 -filter_complex hstack=inputs=3 output.mp4

This command stitches three input videos horizontally.
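When the number of inputs varies, typing the stream labels by hand gets tedious. As a minimal sketch, the hypothetical helper build_hstack_filter below (our own shell function, not part of FFmpeg) generates the labeled filtergraph for N inputs. It only prints the resulting command, so we can inspect it before running it:

```shell
# build_hstack_filter is a hypothetical helper, not an FFmpeg feature.
# It emits "[0:v][1:v]...hstack=inputs=N[out]" for the given N.
build_hstack_filter() {
  n=$1
  inputs=""
  i=0
  while [ "$i" -lt "$n" ]; do
    inputs="${inputs}[${i}:v]"   # video stream of input number i
    i=$((i + 1))
  done
  printf '%s' "${inputs}hstack=inputs=${n}[out]"
}

filter=$(build_hstack_filter 3)

# Print the full command instead of executing it.
echo "ffmpeg -i v1.mp4 -i v2.mp4 -i v3.mp4 -filter_complex \"$filter\" -map \"[out]\" output.mp4"
```

For three inputs, the generated filtergraph is [0:v][1:v][2:v]hstack=inputs=3[out], which is the explicitly labeled form of the shorter command above.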

2.2. overlay

The overlay filter places one video on top of another at a position we choose. Combined with other filters, such as pad or scale, it also lets us control the canvas size and the size of the overlaid stream.

First, let’s see the general syntax of the overlay filter:

[base][overlay]overlay=x:y[out]

To be specific, let’s see what each part means:

  • [base] is the base video stream
  • [overlay] is the video to be overlaid on the base stream
  • overlay=x:y specifies the position of the overlaid video relative to the top-left corner of the base video
  • [out] is the label for the output video stream

So, if we have two videos, we can widen the first input with padding and then overlay the second input on the empty half:

$ ffmpeg -i v1.mp4 -i v2.mp4 -filter_complex '[0:v]pad=iw*2:ih[res];[res][1:v]overlay=W/2:0[out]' -map '[out]' output.mp4

This command takes two input video files and uses filters to pad the first video to double its original width while maintaining the original height. Then, it overlays the second video on top of the padded area, positioning both videos side-by-side.

Let’s take a close look at the filter:

  • [0:v]pad=iw*2:ih[res] applies the pad filter to the first input, doubling the frame width (iw*2) while keeping the original height (ih)
  • [res] is the label that stores the intermediate, padded stream
  • [res][1:v]overlay=W/2:0 takes the [res] stream and overlays the second input at x=W/2, y=0, where W is the width of the padded base stream, so the second video starts exactly where the first one ends
  • [out] is the label that we assign to the output video stream
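To make the arithmetic concrete, let’s assume both inputs are 640x360. These dimensions are hypothetical and not taken from our sample videos. The sketch below computes the same canvas width and overlay offset that the iw*2 and W/2 expressions evaluate to, and prints the equivalent filtergraph with fixed numbers:

```shell
# Assumed dimensions of each input video (hypothetical values).
width=640
height=360

# pad=iw*2:ih doubles the canvas width and keeps the height.
canvas_width=$((width * 2))

# In overlay, W is the width of the base (padded) stream,
# so W/2 equals the width of the original first video.
overlay_x=$((canvas_width / 2))

filter="[0:v]pad=${canvas_width}:${height}[res];[res][1:v]overlay=${overlay_x}:0[out]"
echo "$filter"
```

With these values, the canvas is 1280 pixels wide and the second video is placed at x=640, directly to the right of the first one.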

Once we run the command, FFmpeg encodes the resulting video. The result is similar to the one produced by hstack.
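Both approaches above rely on our sample videos having the same resolution. When the heights differ, one option is to scale both inputs to a common height before stacking them. The following is a sketch under that assumption: the target height of 480 and the labels [left] and [right] are our own choices, and the command is only printed rather than executed:

```shell
# Target height shared by both inputs (an assumed value for illustration).
height=480

# scale=-2:H keeps the aspect ratio and rounds the width to an even number.
filter="[0:v]scale=-2:${height}[left];[1:v]scale=-2:${height}[right];[left][right]hstack[out]"

# Print the full command instead of executing it.
echo "ffmpeg -i v1.mp4 -i v2.mp4 -filter_complex \"$filter\" -map \"[out]\" output.mp4"
```

Scaling both inputs, rather than only one, keeps the sketch independent of the actual input sizes.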

3. Merging Audio

In addition to video streams, we can also combine audio streams from all inputs into a single, multi-channel audio stream using the amerge filter:

$ ffmpeg -i v1.mp4 -i v2.mp4 \
   -filter_complex "[0:v][1:v]hstack=inputs=2[v]; \
   [0:a][1:a]amerge[a]" \
   -map "[v]" -map "[a]" -ac 2 output.mp4

Let’s dive into this command:

  • [0:v][1:v]hstack=inputs=2[v] stacks the video streams and labels the result as v
  • [0:a][1:a]amerge[a] takes the audio streams from both inputs and merges them into a single multi-channel audio stream labeled a
  • -map "[v]" -map "[a]" selects the stacked video and the merged audio for the output file
  • -ac 2 sets the number of output audio channels to two, downmixing the merged audio to stereo

If we omit -ac 2, the merged audio stream keeps the channels from both inputs, so two stereo inputs result in four channels. Alternatively, we can add -an to completely remove the audio streams from the output.
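The same pattern extends to more than two inputs, since amerge also accepts an inputs option. As a rough sketch, the hypothetical helper build_av_filter below (our own function name) builds the combined video and audio filtergraph for N inputs, assuming every input file has both a video and an audio stream:

```shell
# build_av_filter is a hypothetical helper, not an FFmpeg feature.
# It assumes each input has one video and one audio stream.
build_av_filter() {
  n=$1
  video=""
  audio=""
  i=0
  while [ "$i" -lt "$n" ]; do
    video="${video}[${i}:v]"   # video stream of input number i
    audio="${audio}[${i}:a]"   # audio stream of input number i
    i=$((i + 1))
  done
  printf '%s' "${video}hstack=inputs=${n}[v];${audio}amerge=inputs=${n}[a]"
}

filter=$(build_av_filter 3)
echo "$filter"
```

We would pass the printed string to -filter_complex and keep -map "[v]" -map "[a]" -ac 2 as in the command above.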

4. Background Enhancements

In addition to stitching videos together, we can also add padding and borders to enhance the clarity and visual appeal of the video.

4.1. Padding

Padding adds margins around the frame of a video.

We add padding to the videos using the pad filter:

$ ffmpeg -i v1.mp4 -i v2.mp4 -filter_complex \
  "[0:v]pad=iw+10:ih+20:10:10[v1]; \
   [1:v]pad=iw+20:ih+20:10:10[v2]; \
   [v1][v2]hstack=inputs=2[v]" -map "[v]" output.mp4

Let’s see a breakdown of the pad filter applied to the first video stream ([0:v]):

  • iw+10 increases the original width of the video by 10 pixels
  • ih+20 increases the original height of the video by 20 pixels
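  • 10:10 places the original frame 10 pixels from the left and 10 pixels from the top of the padded frame, which leaves a 10-pixel margin on the left, top, and bottom, and none on the right

The second video stream ([1:v]) goes through a similar pad filter, but with iw+20, so it gets a 10-pixel margin on both its left and right. Its left margin doubles as the gap between the two videos. Finally, the hstack filter combines both padded streams horizontally.

After executing the command, we get a video with the given padding. Each video has a 10-pixel margin on all sides, with a single 10-pixel gap between the two. By default, the padded area is black.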
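To double-check the geometry, we can work through the numbers for a concrete case. The sketch below assumes 640x360 inputs, which are hypothetical dimensions, and computes the padded size of each stream and of the final stacked video:

```shell
# Assumed input dimensions and margin (hypothetical values).
width=640
height=360
border=10

# Left video: margin on the left only (iw+10).
left_w=$((width + border))
# Right video: margin on both sides (iw+20).
right_w=$((width + 2 * border))
# Both videos: margin on the top and bottom (ih+20).
padded_h=$((height + 2 * border))

# hstack places the padded streams next to each other.
total_w=$((left_w + right_w))

echo "left:   pad=${left_w}:${padded_h}:${border}:${border}"
echo "right:  pad=${right_w}:${padded_h}:${border}:${border}"
echo "output: ${total_w}x${padded_h}"
```

Under these assumptions, the stacked output is 1310x380: two 640-pixel videos plus three 10-pixel vertical margins, and 360 pixels of height plus a 10-pixel margin above and below.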

4.2. Borders

We can go further by adding color to the padded area, creating a border-like appearance:

$ ffmpeg -i v1.mp4 -i v2.mp4 -filter_complex \
  "[0:v]pad=iw+10:ih+20:10:10:yellow[v1]; \
   [1:v]pad=iw+20:ih+20:10:10:yellow[v2]; \
   [v1][v2]hstack=inputs=2[v]" -map "[v]" output.mp4

In the command, we pass yellow as the last argument of the pad filter to color the padded area. Besides color names, we can also specify the color as a hexadecimal RGB value, such as 0xRRGGBB or #RRGGBB.

As expected, the result looks like the simple padding case, except that the margins are yellow instead of black.
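If we want to experiment with different border widths and colors, we can generate the filtergraph instead of editing it by hand. The hypothetical helper build_border_filter below is a sketch of that idea: the function name and the sample color 0xFFD700 are our own choices, and the margin layout mirrors the command above:

```shell
# build_border_filter is a hypothetical helper, not an FFmpeg feature.
# Arguments: border width in pixels, then a color name or hex value.
build_border_filter() {
  border=$1
  color=$2
  double=$((border * 2))
  # Left video: margin on the left, top, and bottom.
  left="[0:v]pad=iw+${border}:ih+${double}:${border}:${border}:${color}[v1]"
  # Right video: margin on all four sides.
  right="[1:v]pad=iw+${double}:ih+${double}:${border}:${border}:${color}[v2]"
  printf '%s' "${left};${right};[v1][v2]hstack=inputs=2[v]"
}

filter=$(build_border_filter 10 0xFFD700)
echo "$filter"
```

Calling the helper with 10 and yellow reproduces the filtergraph from the command above, written on a single line.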

5. Conclusion

In this article, we discussed how to stack multiple videos horizontally with FFmpeg. We learned how to use the hstack and overlay filters to place videos side by side, and the amerge filter to merge their audio streams.

Lastly, we saw how to add simple enhancements to the stacked videos.