Risk assessment considerations: Using FFV1 for preservation

Started: September 27th, 2016
Last update: September 27th, 2016
Written by Peter B.

Introduction

The Austrian Mediathek began using FFV1 in production for video preservation in 2010. To do so, we had to address the possible risks of using a non-standardized, little-known format for this purpose.
This article is a brief statement of what we considered - and how we assessed the risks.

The starting point was to write down a list of key properties that a file format suitable for preservation must have, so that it fulfills archivists' standards and has a better chance of standing the test of time:

no digital loss (lossless)
non-proprietary
hardware independent
manageable amount of data
resolution independent (SD, HD, and beyond...)
aspect ratio
colorspace/subsampling as native as possible

The key factor in favor of FFV1 was that it (a) fulfills the above needs - and (b) is Free Software / Open Source (FOSS). On closer inspection, it is the fact that it's FOSS that plays the major role in its risk assessment.

We explained those points in greater detail in the "Project motivation" article in the DVA-Profession documentation.

Risks considered

This list is short, but the devil is in the details:

Are the files we produce valid?
What if not?
Who can open them?
How to get out of this format?
How to sustain playback/transcoding?
What about data errors?

Q: Are the files we produce valid?

There was no conformance checker for FFV1 files back then, but we knew we could verify the actual code that was used to create the archived files.
When we started, there was only one implementation (source code base) for creating FFV1 files worldwide: FFmpeg.
Therefore, any application that created FFV1 files was using the very same implementation. We ran framemd5 tests to verify the losslessness of the created files.
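
A check of this kind can be sketched in a few lines of Python. This is a minimal sketch, not the exact tooling we used: it assumes an "ffmpeg" binary on the PATH, the function names are hypothetical, and it looks at video only (-an), since audio packet sizes can differ between containers. FFmpeg's framemd5 output has one line per decoded frame, ending in the frame size and its MD5 hash. If source and FFV1 copy produce the same sequence of hashes, the decoded frames are identical. The comparison only makes sense if both files decode to the same pixel format.

```python
import subprocess


def framemd5_command(path):
    # Decode the video of `path` and write one MD5 hash per frame to stdout.
    return ["ffmpeg", "-v", "error", "-i", path, "-an", "-f", "framemd5", "-"]


def framemd5_report(path):
    # Runs ffmpeg; requires it to be installed.
    result = subprocess.run(
        framemd5_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout


def frame_hashes(report):
    # Data lines look like: stream, dts, pts, duration, size, hash
    # Header lines start with '#'. Timestamps are ignored on purpose,
    # because the time base may differ between containers.
    hashes = []
    for line in report.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split(",")]
        hashes.append((fields[0], fields[-2], fields[-1]))
    return hashes


def is_lossless(source_report, copy_report):
    # An empty report proves nothing, so it is treated as a failure.
    source = frame_hashes(source_report)
    return len(source) > 0 and source == frame_hashes(copy_report)
```

Typical use would be is_lossless(framemd5_report("source.avi"), framemd5_report("copy.mkv")), where both file names are placeholders.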

Today, there are still only 2 source code bases used for creating FFV1 files in the world: FFmpeg and LibAV.
Both are not only useful command-line tools in themselves, but also provide their capabilities as code libraries. This enables anyone else to build applications that handle all formats supported by FFmpeg/LibAV.
The major difference from proprietary implementations is that everyone is free to use the very same implementations, which are well known, stable and used by existing applications.

This basically reduces interoperability problems to zero.

Nowadays, the MediaConch tool can be used to validate that FFV1 is implemented correctly in video files.

Q: What if not?

Usually, when files don't conform to the specification or behave abnormally due to implementation differences, and a proprietary / closed-source implementation was used to create them, the only option is to start black-box debugging.

If one should encounter invalid FFV1 files, chances are almost 100% that they were created using FOSS tools.
So one can simply give the source code to a (hired or in-house) developer, to see where things went wrong - and possibly fix them.

For a proper digitization setup, this case should never arise, since it is good practice to evaluate and fix these issues before going into production.

Q: Who can open them?

By definition, every application that uses FFmpeg or LibAV libraries supports FFV1 out-of-the-box.

Since FFmpeg/LibAV are well-known projects, their libraries are used at one point or another by almost any software tool that deals with audiovisual files these days. Due to the complexity of digital video, even proprietary vendors make use of FFmpeg/LibAV libraries in their products.
Therefore, we found that in practice more applications can read/write FFV1 than JPEG2000-lossless, for example.
Additionally, the Wikipedia article on FFV1 contains a list of applications supporting FFV1.

This list is far from complete because, as already mentioned, it would have to contain all applications and hardware using FFmpeg/LibAV, of which there are literally hundreds.

Q: How to get out of this format?

FFmpeg/LibAV are mainly built for transcoding material between different formats.
Therefore, it's trivial to encode to any format supported by FFmpeg/LibAV.

If one needs to transcode to a format not supported by these 2 projects (which is very unlikely), the most common reason is that the target format is proprietary or encumbered by licensing/patent issues.
If one still must transcode to such a format, it can be done using e.g. uncompressed as an intermediate format.

Since such encumbered formats should not be used for long-term preservation anyway, this case should be negligible ;)

For automatic conversion, again, framemd5 can be used to verify lossless migration of the audiovisual contents to any new lossless target format. This also applies to rewrapping FFV1 into any new container.
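
The migration step itself can be sketched as a small command builder. This is an illustration under assumptions, not a description of our workflow: the function name and file names are hypothetical, and "rawvideo" in a NUT container is only one possible uncompressed target. Without a codec argument, the streams are copied into the new container without re-encoding (a rewrap); with one, only the video is re-encoded.

```python
def migration_command(source, target, video_codec=None):
    # -map 0 keeps all streams of the source file.
    # -c copy copies them into the new container without re-encoding.
    command = ["ffmpeg", "-i", source, "-map", "0", "-c", "copy"]
    if video_codec is not None:
        # Re-encode only the video streams; all other streams are still copied.
        command += ["-c:v", video_codec]
    return command + [target]


# Rewrap FFV1 from AVI into Matroska (hypothetical file names):
print(" ".join(migration_command("tape0001.avi", "tape0001.mkv")))
# ffmpeg -i tape0001.avi -map 0 -c copy tape0001.mkv

# Transcode the video to uncompressed as an intermediate format:
print(" ".join(migration_command("tape0001.mkv", "tape0001.nut", "rawvideo")))
# ffmpeg -i tape0001.mkv -map 0 -c copy -c:v rawvideo tape0001.nut
```

After either step, comparing framemd5 output of source and target tells you whether the audiovisual content survived unchanged.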

Q: How to sustain playback/transcoding?

Having access not only to the source code of the tools creating FFV1, but also the right to use it for any purpose - defined by its Free Software license - gives an unmatched option to archives:

This is equivalent to archiving not only your playback machines, but also their schematics and components - including spare parts! ;)
This makes a file format implemented in FOSS "virtually immortal". The same, by the way, applies to any tool that is FOSS-licensed.

Should FFV1 become obsolete and unsupported by all tools (including FFmpeg/LibAV) in the future, one can take the source code and give it to a developer to make it run again. The result can then be used to transcode the files to any format/technology then desired.

NOTE: It is highly unlikely that FFmpeg/LibAV will drop support for FFV1, considering that these projects also actively maintain code that can decode rare codecs used on the Amiga, the C64 - and other systems that are now found only in museums. This is another key benefit of a project that is not industry-driven.

Additionally, it's trivial to archive a copy not just of one source code version, but of the whole history: the source code "git" repository of FFmpeg/LibAV.
This means that you get the schematics of every variant of the code that ever existed.

It's as easy as "git clone https://git.ffmpeg.org/ffmpeg.git".
See the list of git repositories on the FFmpeg website for details.

Q: What about data errors?

Before FFV1.3 was finished, there was no error-resilience built into FFV1.

We calculated that for the price of 1 uncompressed copy, we can have 3 copies of FFV1. Therefore, if any bit goes bad, (a) we have more copies to restore from than we would with uncompressed, and (b) restoring FFV1 files is quicker due to their smaller size.
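
The arithmetic behind this is simple enough to write down. The sizes below are hypothetical round numbers, chosen only so that the FFV1 file is a third of the size of its uncompressed equivalent, which is the ratio implied by the 1:3 figure; real ratios depend on the material. The function name is made up for this sketch.

```python
def copies_for_budget(budget_bytes, file_bytes):
    # How many complete copies of a file fit into a given storage budget.
    return budget_bytes // file_bytes


GB = 10**9
uncompressed_size = 75 * GB  # hypothetical size of one uncompressed copy
ffv1_size = 25 * GB          # hypothetical size of the same content as FFV1

# Storage that holds 1 uncompressed copy holds this many FFV1 copies:
print(copies_for_budget(uncompressed_size, ffv1_size))  # 3

# Restoring one FFV1 copy moves a third of the data:
print(ffv1_size / uncompressed_size)
```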

Additionally, we decided to segment the videos into individual files of 1500 frames each. This equals 1 minute of PAL material at 25 fps.
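
As a small worked example of this segmentation, the sketch below computes the segment duration and the frame ranges that end up in each file. It only illustrates the arithmetic; the function names are hypothetical and this is not the code that does the actual splitting.

```python
def segment_seconds(frames_per_segment=1500, fps=25):
    # Duration of one full segment in seconds.
    return frames_per_segment / fps


def segment_ranges(total_frames, frames_per_segment=1500):
    # (first_frame, last_frame) for each segment file, counting from 0.
    # The last segment is shorter if the total is not a multiple of the size.
    ranges = []
    for start in range(0, total_frames, frames_per_segment):
        end = min(start + frames_per_segment, total_frames)
        ranges.append((start, end - 1))
    return ranges


print(segment_seconds())     # 60.0
print(segment_ranges(4000))  # [(0, 1499), (1500, 2999), (3000, 3999)]
```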

More details about our considerations and experiences with segmenting the files can be found in the article "The archivists's video codec/container FAQ", which my colleagues at the Mediathek and I published in the past.

Conclusion

After evaluating all this, we found that FFV1 was not only cheaper and faster to use, but also safe overall for long-term preservation.
The fact that it is not yet standardized doesn't matter for the technical accessibility of the contents in this format, as shown above.

If you take a look at the issues we archivists have with other formats, they are mostly due to proprietary, closed implementations being used to create the files.
As I have argued, and hopefully explained properly in the points above, these problems/risks can be avoided through FOSS implementations and licensing.

If there is anything I've overlooked, please don't hesitate to let me know.