Timestamps are the biggest source of reproducibility issues. Many build tools record the current date and time. The filesystem does, and most archive formats will happily record modification times on top of their own timestamps. It is also customary to record the date of the build in the software itself…
Timestamps are best avoided
Often the time of the build was used as an approximate way to know which version of the source has been built, and which tools had been used to do it. With reproducible builds, recording the time of the build becomes meaningless: on one side, the source code needs to be tracked more accurately than just a timestamp, and on the other side, the build environment needs to be defined or extensively recorded.
If a date is required to give users an idea of when the software was made, it is better to use a date that is relevant to the source code instead of the build: old software can always be built later. Like version information, it’s best to extract such a date from the revision control system or from a changelog.
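As an illustration of taking the date from a changelog, here is a minimal sketch that assumes a Debian-style changelog, where each entry ends with a trailer line of the form ` -- Name <email>  date`. The function name `changelog_epoch` and the sample changelog are hypothetical and exist only for this example.

```python
from email.utils import parsedate_to_datetime


def changelog_epoch(changelog_text):
    """Return the Unix timestamp of the first (most recent) trailer line.

    Assumes Debian-style trailers: " -- Name <email>  RFC 2822 date".
    """
    for line in changelog_text.splitlines():
        if line.startswith(" -- ") and ">  " in line:
            # Everything after the two spaces following the email is the date.
            date_part = line.split(">  ", 1)[1]
            return int(parsedate_to_datetime(date_part).timestamp())
    raise ValueError("no changelog trailer line found")


# Hypothetical changelog used only to exercise the function.
SAMPLE = """\
hello (1.0-1) unstable; urgency=medium

  * Initial release.

 -- Jane Doe <jane@example.org>  Thu, 01 Jan 2015 00:00:00 +0000
"""

if __name__ == "__main__":
    print(changelog_epoch(SAMPLE))
```

The resulting number depends only on the source, so building the same source next year yields the same date.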
Some tools used in build processes, like code or documentation generators, write timestamps which will create unreproducible build products.
The Debian reproducible builds effort proposed the SOURCE_DATE_EPOCH environment variable to address the problem. When it is set, tools that support it will use its value (a number of seconds since January 1st 1970, 00:00 UTC) instead of the current date and time. The variable has been formally specified in the hope of wider adoption.
Changes required to support SOURCE_DATE_EPOCH are usually fairly small and easy to write. Patches for tools which don’t yet support the environment variable have usually been well received, and they help all users wanting reproducible builds.
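To give an idea of how small such a change can be, here is a minimal sketch of a tool that embeds a build date. The helper names `build_datetime` and `build_banner` are hypothetical. The only behavior taken from the text above is that SOURCE_DATE_EPOCH, when set, replaces the current time.

```python
import os
import time
from datetime import datetime, timezone


def build_datetime(environ=os.environ):
    """Return the date to embed: SOURCE_DATE_EPOCH if set, else the current time."""
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    # Always format in UTC so the local timezone cannot leak into the output.
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


def build_banner(environ=os.environ):
    """Hypothetical example of a string a tool might write into its output."""
    return "Built on " + build_datetime(environ).strftime("%Y-%m-%d")


if __name__ == "__main__":
    print(build_banner())
```

Formatting in UTC matters as much as reading the variable: a fixed epoch rendered in the builder's local timezone could still differ between machines.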
In cases where patching the tool is not possible, an option is to post-process the output. The idea is to either remove the timestamps entirely or to normalize them to a predetermined date and time. strip-nondeterminism was designed as an extensible program to perform such normalization on various file formats.
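The normalization idea can be sketched for the simplest case, file modification times in a build tree. This is not how strip-nondeterminism works internally; it is a hypothetical helper, `clamp_mtimes`, that sets every modification time newer than a chosen reference date back to that date.

```python
import os


def clamp_mtimes(root, epoch):
    """Clamp mtimes under root to at most epoch; return how many files changed."""
    changed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks alone in this sketch
            if os.stat(path).st_mtime > epoch:
                # Set both access and modification time to the reference date.
                os.utime(path, (epoch, epoch))
                changed += 1
    return changed
```

Clamping rather than overwriting keeps files that are genuinely older than the reference date untouched, so only timestamps introduced by the build are rewritten.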
Another option is to run these tools using libfaketime. This library is loaded through the LD_PRELOAD environment variable and intercepts function calls that retrieve the current time of day, replying instead with a predefined date and time. In some cases, it works just fine and can solve problems without requiring many changes to a given build system. But if any part of the build process relies on time differences, things will go wrong. One case of bad interaction between libfaketime and parallel compilation has been identified as a source of reproducibility issues in the Tor Browser. So beware.
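A minimal sketch of wrapping a build step this way follows. Several things here are assumptions rather than facts from the text above: the library path in `LIBFAKETIME` varies between systems, the helper names are hypothetical, and the use of a `FAKETIME` variable holding an absolute date reflects the library's documented interface as best understood, so check the documentation of the installed version.

```python
import os
import subprocess

# Assumption: install location of the library; it differs between systems.
LIBFAKETIME = "/usr/lib/x86_64-linux-gnu/faketime/libfaketime.so.1"


def faketime_env(when, base_env=None, library=LIBFAKETIME):
    """Return a copy of the environment that preloads the library.

    `when` is assumed to be an absolute date such as "2015-01-01 00:00:00".
    """
    env = dict(os.environ if base_env is None else base_env)
    preload = env.get("LD_PRELOAD")
    # Keep any library that was already being preloaded.
    env["LD_PRELOAD"] = library if not preload else library + ":" + preload
    env["FAKETIME"] = when
    return env


def run_with_fake_time(argv, when):
    """Run one build step with the faked clock; raises if the step fails."""
    return subprocess.run(argv, env=faketime_env(when), check=True)
```

Wrapping only the individual steps that embed timestamps, instead of the whole build, limits exposure to the time-difference problems described above.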
Content licensed under CC BY-SA 4.0.