Synopsis

Here, we provide some tips and tricks that can be useful when working with video in visualization tasks and workflows.

Tools

A good tool for video processing is FFmpeg, which is installed on Snellius. It is a Swiss Army knife for video processing, but all that power can make it somewhat difficult to use: the hard part is mostly finding the right options and putting them in the right order (the order matters). In addition, working with video files involves quite a lot of technical detail.

Libav

Some Linux distributions only provide, for license reasons, a fork of FFmpeg called Libav. In principle Libav is compatible with FFmpeg, but the commands have different names: "ffmpeg" is called "avconv", "ffplay" is called "avplay", "ffprobe" is called "avprobe", etc.
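
Scripts that have to run on both kinds of systems can check which command names are present before using them. The following is only a minimal sketch: the helper name first_available and the variables FFMPEG and FFPROBE are made up for this example, and it assumes that the Libav tools accept the options you go on to use.

```shell
#!/bin/sh
# Print the first command from the argument list that exists on PATH.
# Returns non-zero if none of them is available.
# (first_available is a hypothetical helper, not part of FFmpeg or Libav.)
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Prefer the FFmpeg names and fall back to the Libav names.
FFMPEG=$(first_available ffmpeg avconv) || echo "no ffmpeg/avconv found" >&2
FFPROBE=$(first_available ffprobe avprobe) || echo "no ffprobe/avprobe found" >&2
echo "encoder: ${FFMPEG:-none}, probe: ${FFPROBE:-none}"
```

Later commands in the same script can then be written as "$FFMPEG" -i ... instead of hard-coding one of the two names.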

GUI application

If you're looking for a tool that's somewhat easier to use and has a GUI, then HandBrake could be an option. It provides most of the commonly used functionality of FFmpeg, but is easier to use once you understand the workflow in the GUI. It also allows multiple processing operations to be scheduled in the background, which is useful because video processing can be time-consuming.

Creating a video from a sequence of images

Converting a sequence of images frame0000.png, frame0001.png, ... into a video file can be done like this:

ffmpeg -y -r 30 -i frame%04d.png -c:v libx264 -crf 20 -profile:v main -pix_fmt yuv420p video.mp4

Some remarks on the different options used:

  • -y will automatically answer "yes" to any interactive prompt that FFmpeg would show. The most frequent situation where -y matters is overwriting an existing output file, which is handy when trying different values for the options above. If the -y option is not given then FFmpeg will ask before overwriting.
  • -r 30 sets the framerate of the image sequence, in frames per second (FPS), 30 in this example. As FFmpeg has no way to know from the set of images alone what FPS you want the video file to have, you need to specify this. Often-used values are 15, 30 or 60, with higher numbers leading to smoother (but shorter) videos. Note that the -r option needs to be specified before the -i option.
  • -i frame%04d.png specifies the file name pattern of the input images. Here, we used PNG image files as input, but FFmpeg supports a wide variety of formats and will auto-detect the format used. The pattern marker %04d is similar to the one used by the C printf() function and means FFmpeg will search for files named frame0000.png, frame0001.png, .... Of course, a different width for the number pattern can be specified, for example %02d (00, 01, 02, ...) or %d (0, 1, 2, ...).
  • The sequence numbers for the frames specified by -i  by default start at either 0 or 1 (depending on which file is found first). For setting an explicit start number, use -start_number <number>. Note that the -start_number  option needs to come before the -i  option.  An alternative pattern is to use globbing: -pattern_type glob -i 'frame*.png'
  • -c:v libx264  specifies the video codec to use to compress the video stream. There are a lot of possible choices, but libx264 is a good one and produces an H.264 video stream. Together with the profile and pixel format options (see below) this produces a video file that should play back on all modern devices and also directly in a browser (i.e. when hosting the video file on a web page). More information on encoding options for H.264 can be found here.
  • -crf 20 specifies that "constant rate factor" encoding is to be used. This means that FFmpeg will try to output a video file that has a more-or-less fixed quality for each frame, varying the bitrate used for each frame as needed. Using the -crf option provides a fairly simple way of producing videos in a desired quality, for cases where the resulting video file size is not very important.

    Lower CRF values result in a higher-quality video (at the expense of lower compression and thus a larger video file), while higher CRF values lead to a lower-quality video (and a smaller video file). The CRF range is 0-51, with 23 as the default. You can experiment with different CRF values to see which produces a video quality that you can live with.


    The alternative would be "variable rate encoding", which is mostly interesting when a specific video file size is being targeted. See here for more details.
  • -profile:v main sets the H.264 profile for the video stream. In general, main is a good choice that will play back on many devices, while high will provide somewhat better compression but at the cost of demanding more features of the device doing the playback (e.g. PC, tablet, mobile phone).
  • -pix_fmt yuv420p sets the pixel format of the compressed video stream, and in general needs to be set to yuv420p to allow playback on most devices.
  • video.mp4 sets the file name of the output video and, through its extension, also the container type, in this case MP4. The container format determines how video (and audio) streams are stored in the video file. In general, MP4 is a fine choice, but alternatives could be Matroska (video.mkv) or AVI (video.avi), and a few others.
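
To experiment with CRF values as suggested above, the same image sequence can be encoded several times in a loop. The sketch below is one possible way to do that, not a fixed recipe: the helper names run and crf_sweep, the DRY_RUN variable and the output names video_crfNN.mp4 are made up for this example. By default it only prints the commands it would run; set DRY_RUN=0 to actually encode.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Either print or execute the given command line (hypothetical helper).
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# crf_sweep PATTERN FPS CRF...
# Encodes the image sequence once per CRF value, using the options from
# the example above, and writes video_crf<value>.mp4 for each one.
crf_sweep() {
    pattern="$1"
    fps="$2"
    shift 2
    for crf in "$@"; do
        run ffmpeg -y -r "$fps" -i "$pattern" -c:v libx264 -crf "$crf" \
            -profile:v main -pix_fmt yuv420p "video_crf${crf}.mp4"
    done
}

crf_sweep 'frame%04d.png' 30 18 23 28
```

Comparing the resulting files side by side, together with their sizes (for example with ls -lh video_crf*.mp4), gives a quick impression of the trade-off between quality and file size.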

Turning a video file into a sequence of images

The opposite of the previous task is to take a video file and write a separate image file for each frame in the video. This is easily accomplished with:

ffmpeg -i video.mp4 image%04d.png

Note that this can generate a large number of image files! For example, one minute of video at 30 frames per second will result in 1,800 image files.

See the previous section for the meaning of the %04d pattern marker. Note that the extension of the image file determines in what format the images are written.
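
The number of files to expect is simply the duration times the frame rate, so it can be estimated up front. The sketch below does that arithmetic with a made-up helper called frame_count. As an addition that goes beyond the command above, it also shows FFmpeg's fps video filter, which can be used to write fewer images (here one per second); video.mp4 is a placeholder file name, and that part is skipped when ffmpeg or the file is not present.

```shell
#!/bin/sh
# Number of image files produced when every frame is written out:
# duration in whole seconds times frames per second.
# (frame_count is a hypothetical helper for this example.)
frame_count() {
    echo $(( $1 * $2 ))
}

frame_count 60 30    # one minute at 30 FPS, prints 1800

# Optional: write only one image per second of video instead of every frame.
# Skipped unless ffmpeg and the input file (placeholder name) are present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f video.mp4 ]; then
    ffmpeg -i video.mp4 -vf fps=1 image%04d.png
fi
```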

Getting details of the contents of a video file

A video file, in a format such as MP4 or AVI, is really a container that holds video and audio streams. Sometimes you want to check what exactly is present in a video file, or to check if the generation of a video file produced the correct output. FFmpeg can show this using either of these two commands:

ffmpeg -i video.mp4
ffprobe video.mp4

Here's the output for an MP4 file containing an H.264-encoded video stream (main profile, yuv420p pixel format, 1920x1080 pixels, 60 frames per second), plus an audio stream (AAC encoding, 48 kHz, stereo, 2 kbit/s):

melis@juggle 18:03:~$ ffprobe 20210303-slurm-view-cartesius.mp4
ffprobe version n5.1 Copyright (c) 2007-2022 the FFmpeg developers
  built with gcc 12.1.1 (GCC) 20220730
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-shared --enable-version3
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '20210303-slurm-view-cartesius.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.45.100
  Duration: 00:01:01.55, start: 0.000000, bitrate: 2165 kb/s
  Stream #0:0[0x1](und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 2150 kb/s, 60 fps, 60 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 2 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
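
When only a few properties are needed, for example to check in a script that an encode produced the intended codec and pixel format, ffprobe can be asked to print just those fields as key=value lines. The following is a minimal sketch: the helper names field and probe_video are made up for this example, video.mp4 is a placeholder file name, and the final part is skipped when ffprobe or the file is not present.

```shell
#!/bin/sh
# Extract the value for one key from "key=value" lines on stdin.
# (field is a hypothetical helper for this example.)
field() {
    sed -n "s/^$1=//p" | head -n 1
}

# Print selected properties of the first video stream, one key=value per line.
probe_video() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=codec_name,profile,pix_fmt,width,height,r_frame_rate \
        -of default=noprint_wrappers=1 "$1"
}

if command -v ffprobe >/dev/null 2>&1 && [ -f video.mp4 ]; then
    info=$(probe_video video.mp4)
    echo "codec:        $(printf '%s\n' "$info" | field codec_name)"
    echo "profile:      $(printf '%s\n' "$info" | field profile)"
    echo "pixel format: $(printf '%s\n' "$info" | field pix_fmt)"
fi
```

For a file like the one shown above, the three values would be expected to correspond to the h264, Main and yuv420p entries in the full ffprobe output.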