Adding build variance
Background
Rebuilding software on an individual machine and obtaining the same output does not guarantee that the software will always build reproducibly.
For example: a program that embeds the hostname of the build computer would not be bit-for-bit reproducible when built on systems with different hostnames.
Changing the compiler, or its version, could also introduce differences. However, it is generally acceptable to specify the precise toolchain version(s) that other people should use when attempting to produce an identical build.
Note: By convention, some factors are acceptable to keep constant: the compiler, the compiler version, and the versions of other software that the software depends upon. Other factors are expected to vary, and the build should accommodate them when the software is rebuilt in diverse environments. It is a good idea to confirm that your software builds reproducibly in those environments too.
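The hostname example above can be illustrated with a small Python sketch. The `build` function and the host names `builder-a` and `builder-b` are hypothetical stand-ins for a real build step, chosen only to show why a single machine cannot reveal this kind of problem.

```python
import hashlib


def build(source: bytes, hostname: str) -> bytes:
    """Hypothetical build step that stamps the build host into its output."""
    return source + b"\n# built on " + hostname.encode("utf-8") + b"\n"


def digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest used to compare artifacts bit for bit."""
    return hashlib.sha256(artifact).hexdigest()


source = b"print('hello')"

# Built twice on the same machine: the outputs match, which looks reproducible.
same_host_a = digest(build(source, "builder-a"))
same_host_b = digest(build(source, "builder-a"))

# A machine with a different hostname produces a different artifact.
other_host = digest(build(source, "builder-b"))

print(same_host_a == same_host_b)  # True
print(same_host_a == other_host)   # False
```

Comparing digests only on `builder-a` would never expose the embedded hostname, which is why the build needs to be repeated under deliberately varied conditions.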
How to add variance to software builds
Tooling exists to systematically explore the factors that can affect build reproducibility, and we recommend re-using existing utilities rather than writing your own.
Reprotest
reprotest is a tool that rebuilds a project in different environments automatically.
It can apply several variations, including build path, file order, locales, and hostname.
It includes native support for Debian and RPM package rebuilds, and can also be configured to run with other build systems.
Its README includes a variety of usage examples.
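To make the idea of rebuilding under varied conditions concrete, here is a minimal Python sketch. It is not reprotest's implementation and does not use its interface. The two build commands, the `VARIATIONS` list, and the function names are assumptions made for illustration. Each build runs in a fresh temporary directory, so the build path varies as well as the `TZ` and `LANG` variables.

```python
import hashlib
import os
import subprocess
import sys
import tempfile

# Hypothetical build that leaks the TZ and LANG variables into its artifact.
LEAKY_BUILD = (
    "import os, sys; "
    "open(sys.argv[1], 'w').write("
    "'tz=' + os.environ.get('TZ', '') + ' lang=' + os.environ.get('LANG', ''))"
)
# Hypothetical build whose output depends only on its source.
CLEAN_BUILD = "import sys; open(sys.argv[1], 'w').write('hello')"

# Two environments that differ in factors a build should not depend on.
VARIATIONS = [
    {"TZ": "UTC", "LANG": "C"},
    {"TZ": "Pacific/Kiritimati", "LANG": "fr_CH.UTF-8"},
]


def build_digest(build_code: str, variation: dict) -> str:
    """Run the build in a fresh directory under one variation and hash the artifact."""
    env = dict(os.environ, **variation)
    with tempfile.TemporaryDirectory() as build_dir:
        artifact = os.path.join(build_dir, "artifact.txt")
        subprocess.run([sys.executable, "-c", build_code, artifact], env=env, check=True)
        with open(artifact, "rb") as handle:
            return hashlib.sha256(handle.read()).hexdigest()


def is_reproducible(build_code: str) -> bool:
    """Return True if every variation yields a bit-for-bit identical artifact."""
    digests = {build_digest(build_code, variation) for variation in VARIATIONS}
    return len(digests) == 1


print(is_reproducible(CLEAN_BUILD))  # True
print(is_reproducible(LEAKY_BUILD))  # False
```

A dedicated tool covers many more variations than this sketch, which is one reason to prefer existing utilities over a home-grown script.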
Factors that we would like to prevent from affecting the build output
- Software on your computer unrelated to the build process
- Date and time
- Language and regional settings
- CPU speed, number of cores, and load of the build machine
- Hostname, user name, and build path
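As one illustration of keeping such factors out of an artifact, the sketch below packs files into a tar archive without recording the current time, the building user, or the order in which files were supplied. The function name `deterministic_tar` and the sample file names are hypothetical. The fixed `mtime` of `0` is an assumption, and a project would normally choose a meaningful fixed timestamp instead.

```python
import hashlib
import io
import tarfile


def deterministic_tar(files: dict, mtime: int = 0) -> bytes:
    """Pack {name: bytes} into a tar archive with normalized metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        # Sort names so the order in which files were supplied does not matter.
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime             # fixed timestamp, not the build time
            info.mode = 0o644              # fixed permissions
            info.uid = info.gid = 0        # no numeric owner from the build machine
            info.uname = info.gname = ""   # no user or group name
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# The same content supplied in a different order yields identical bytes.
first = deterministic_tar({"b.txt": b"beta", "a.txt": b"alpha"})
second = deterministic_tar({"a.txt": b"alpha", "b.txt": b"beta"})

print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())  # True
```

The same approach applies to other outputs: decide on a fixed value for each piece of metadata rather than reading it from the build machine.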
Factors that are usually acceptable to declare as constants
- Toolchain (compiler, …)
- Dependencies listed in your project
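Declaring these factors as constants is only useful if a rebuilder can check that they match. The following sketch compares a pinned set of versions against what is installed. The tool names and version strings are invented for illustration, and a real project would collect the installed versions from its own package manager or toolchain.

```python
def find_mismatches(pinned: dict, installed: dict) -> list:
    """Return a description of each pinned tool whose installed version differs."""
    problems = []
    for tool, wanted in sorted(pinned.items()):
        found = installed.get(tool)
        if found != wanted:
            problems.append(f"{tool}: pinned {wanted}, found {found or 'nothing'}")
    return problems


# Hypothetical pins declared by the project.
pinned = {"compiler": "12.2.0", "libexample": "1.4.1"}

# Hypothetical versions discovered on the rebuilder's machine.
installed = {"compiler": "12.2.0", "libexample": "1.5.0"}

for problem in find_mismatches(pinned, installed):
    print(problem)  # libexample: pinned 1.4.1, found 1.5.0
```

Reporting mismatches before the build starts makes it clear whether a differing artifact comes from the declared constants or from a factor that should not have mattered.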