Stable order for outputs
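Data structures such as Perl hashes, Python dictionaries and sets, Rust std::collections::HashMap and std::collections::HashSet, or Ruby Hash objects may list their keys in a different order on every run, because their hash functions are randomly seeded to limit algorithmic complexity attacks. (Python dictionaries since 3.7 and Ruby Hash objects since 1.9 preserve insertion order, so for them the output order is only as stable as the order in which the keys were inserted.)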
Perl
The following Perl code will output the list in a different order on every run:
foreach my $package (keys %deps) {
    print MANIFEST "$package: $deps{$package}\n";
}
To get a deterministic output, the easiest way is to explicitly sort the keys:
foreach my $package (sort keys %deps) {
    print MANIFEST "$package: $deps{$package}\n";
}
For Perl, it is also possible to set PERL_HASH_SEED=0 in the environment. This will result in hash keys always being in the same order. See perlrun(1) for more information.
Python
Python users can similarly set the environment variable PYTHONHASHSEED. When it is set to a fixed integer value, the iteration order of dictionaries and sets will be the same on every run.
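As in Perl, sorting explicitly is often simpler than relying on the environment. The following is a minimal sketch, not a definitive implementation: the names manifest_lines, deps and features are hypothetical and only serve the illustration.

```python
def manifest_lines(deps):
    """Return 'name: version' lines, ordered by sorted key."""
    return [f"{name}: {deps[name]}" for name in sorted(deps)]


# Hypothetical sample data for illustration only.
deps = {"zlib": "1.3", "openssl": "3.0", "curl": "8.5"}
# The iteration order of a set of strings varies with the hash seed.
features = {"ssl", "http2", "brotli"}

# Sorting before writing makes the output independent of the hash seed
# and of the order in which the entries were inserted.
print("\n".join(manifest_lines(deps)))
print(",".join(sorted(features)))
```

Sorting at the point where the data is written keeps the output stable even when PYTHONHASHSEED is not set in the build environment.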
Rust
When iterating over the keys or entries of a HashMap, the order is explicitly undefined and depends on a random seed:
By default, HashMap uses a hashing algorithm selected to provide resistance against HashDoS attacks. The algorithm is randomly seeded, and a reasonable best-effort is made to generate this seed from a high quality, secure source of randomness provided by the host without blocking the program.
Iterating over a HashMap can cause reproducibility issues when it is:
- done inside a build.rs file
- done in a function that’s directly or indirectly called from a build.rs file
This can often be fixed by replacing HashMap with BTreeMap, which iterates over its entries in sorted key order.
There’s a real-world example of how this was fixed.
General
Beware that the locale settings might affect the output of some sorting functions or the sort command.
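To illustrate the difference in Python: the built-in sorted() compares strings by Unicode code point and does not consult the locale, whereas a key such as locale.strxfrm follows the current LC_COLLATE setting. This is a small sketch with hypothetical sample data, not a general recommendation for every tool.

```python
# Hypothetical sample data for illustration only.
names = ["b", "B", "a", "é"]

# Locale-independent: strings are compared by Unicode code point.
by_code_point = sorted(names)

# Equivalent ordering on the UTF-8 encoded bytes, which is also how
# byte-wise comparison in the C locale would order these strings.
by_bytes = sorted(names, key=lambda s: s.encode("utf-8"))

print(by_code_point)  # ['B', 'a', 'b', 'é']

# By contrast, sorted(names, key=locale.strxfrm) depends on the active
# LC_COLLATE setting and can differ between build environments.
```

For the sort command, the analogous approach is to fix the locale in the build environment, for example by running it with LC_ALL=C.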