Welcome to the July 2020 report from the Reproducible Builds project.
In these monthly reports, we round up the things that we have been up to over the past month. As a brief refresher, the motivation behind the Reproducible Builds effort is to ensure that no flaws have been introduced on the way from the original free software source code to the pre-compiled binaries we install on our systems. (If you’re interested in contributing to the project, please visit our main website.)
At the upcoming DebConf20 conference (now being held online), Holger Levsen will present a talk on Thursday 27th August about “Reproducing Bullseye in practice”, focusing on independently verifying that the binaries distributed from ftp.debian.org were made from their claimed sources.
Tavis Ormandy published a blog post making the provocative claim that “You don’t need reproducible builds”, asserting elsewhere that the many attacks that have been extensively covered in our previous reports are “fantasy threat models”. A number of rebuttals have been made, including one from long-time Reproducible Builds contributor Bernhard Wiedemann.
On our mailing list this month, Debian Developer Graham Inggs asked for ideas as to why the openorienteering-mapper Debian package was failing to build on the Reproducible Builds testing framework. Chris Lamb remarked from the build logs that the package may be missing a build dependency, although Graham then used our own diffoscope tool to show that the resulting package remains unchanged with or without it. Later, Nico Tyni noticed that the build failure may be due to the relationship between the __FILE__ C preprocessor macro and the -ffile-prefix-map GCC flag.
An issue in Zephyr, a small-footprint kernel designed for use on resource-constrained systems, about .a library files not being reproducible was closed after it was noticed that a key part of its toolchain had been updated and now uses --enable-deterministic-archives by default.
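The idea behind a deterministic archive mode can be illustrated outside of ar(1). The following is only a minimal sketch and not how Zephyr or its toolchain implements it: it uses Python's standard tarfile module as a stand-in for .a archives, and the build_archive helper is a hypothetical name. It fixes the member order and zeroes the timestamp and ownership fields, so the resulting bytes depend only on member names and contents.

```python
import io
import tarfile


def build_archive(members, mtime=0):
    """Return tar bytes that depend only on member names and contents.

    `members` maps a member name to its contents (bytes).
    """
    buf = io.BytesIO()
    # An uncompressed tar is used so that no compression header (which
    # may carry its own timestamp) ends up in the output.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(members):  # fixed member order
            data = members[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime  # fixed timestamp
            info.mode = 0o644  # fixed permissions
            info.uid = info.gid = 0  # fixed ownership
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Calling build_archive twice with the same members, in any insertion order, should yield byte-for-byte identical output, which is the property a reproducible build needs from its archiver.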
Reproducible Builds developer kpcyrd commented on a pull request against the libsodium cryptographic library wrapper for Rust, arguing against the testing of CPU features at compile-time. He noted that:
I’ve accidentally shipped broken updates to users in the past because the build system was feature-tested and the final binary assumed the instructions would be present without further runtime checks
David Kleuker also asked a question on our mailing list about using SOURCE_DATE_EPOCH with the install(1) tool from GNU coreutils. When comparing two installed packages he noticed that the filesystem ‘birth times’ differed between them. Chris Lamb replied, realising that this was actually a consequence of using an outdated version of diffoscope and that a fix was in diffoscope version 146 released in May 2020.
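For readers unfamiliar with SOURCE_DATE_EPOCH, the sketch below shows one common way a build script might honour it. It is an illustration only and is not taken from install(1) or diffoscope: the function names build_timestamp and clamp_mtime are hypothetical, and the clamping policy (only lowering timestamps that are newer than the epoch) is an assumption modelled on typical packaging practice.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the build time, preferring SOURCE_DATE_EPOCH when it is set."""
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


def clamp_mtime(path, environ=os.environ):
    """Lower a file's modification time to SOURCE_DATE_EPOCH, if it is set.

    Files that are already older than the epoch are left untouched.
    """
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is None:
        return
    epoch = int(value)
    if os.stat(path).st_mtime > epoch:
        os.utime(path, (epoch, epoch))  # sets access and modification times
```

Note that os.utime only adjusts access and modification times; it does not touch a file's birth time, which is one reason a comparison tool may need to ignore that field rather than expect the build to normalise it.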
Lastly, Holger Levsen updated the README file […], marked the Alpine Linux continuous integration tests as currently disabled […] and linked the Arch Linux Reproducible Status page from our projects page […].
diffoscope is our in-depth and content-aware diff utility that not only locates and diagnoses reproducibility issues but also provides human-readable diffs of all kinds. In July, Chris Lamb made the following changes to diffoscope, including releasing new versions:
- Add support for flash-optimised F2FS filesystems. (#207)
- Don’t require zipnote(1) to determine differences in a .zip file as we can use
- Treat --profile as a synonym for --profile=-, i.e. write profiling data to standard output. […]
- Increase the minimum length of the output of strings(1) to eight characters to avoid unnecessary diff noise. […]
- Drop some legacy argument styles: --no-exclude-directory-metadata have been replaced with
- Pass the absolute path when extracting members from SquashFS images as we run the command with working directory in a temporary directory. (#189)
- Correct adding a comment when we cannot extract a filesystem due to missing libguestfs module. […]
- Don’t crash when listing entries in archives if they don’t have a listed size such as hardlinks in ISO images. (#188)
- Strip off the file offset prefix from xxd(1) and show bytes in groups of 4. […]
- Don’t emit ‘javap not found in path’ if it is available in the path but it did not result in an actual difference. […]
- Fix ‘... not available in path’ messages when looking for Java decompilers that used the Python class name instead of the command. […]
- Rewrite and rename exit_if_paths_do_not_exist to not check files multiple times. […][…]
- Add an add_comment helper method; don’t mess with our internal list directly. […]
- Replace some simple usages of str.format with Python ‘f-strings’ […] and make it easier to navigate to the main.py entry point […].
- In the RData comparator, always explicitly return None in the failure case as we return a non-None value in the success one. […]
- Tidy some imports […][…][…] and don’t alias a variable when we do not use it. […]
- Clarify the use of a separate NullChanges quasi-file to represent missing data in the Debian package comparator […] and clarify use of a ‘null’ diff in order to remember an exit code. […]
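To see why raising the minimum string length reduces diff noise, consider the rough approximation of strings(1) sketched below. It is not diffoscope's code and does not reproduce every detail of the real tool (for example, only the printable ASCII range is considered here); the printable_strings helper is a hypothetical name used for illustration.

```python
import re


def printable_strings(data: bytes, min_length: int = 8):
    """Return runs of printable ASCII that are at least min_length bytes long."""
    # Bytes 0x20 to 0x7e cover the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

With a short minimum length, arbitrary binary data frequently contains accidental runs of a few printable bytes that change between builds; a longer minimum tends to keep only meaningful strings such as paths and symbol names.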
Jean-Romain Garnier also made the following changes:
- Allow passing a file with a list of arguments via diffoscope @args.txt. (!62)
- Improve the output of side-by-side diffs by detecting added lines better. (!64)
- Remove offsets before instructions in objdump […][…] and remove raw instructions from ELF tests […].
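The motivation for removing offsets from disassembly can be sketched as follows. This is an illustration under stated assumptions rather than diffoscope's implementation: the two listings are simplified, made-up examples in the general style of objdump output, and OFFSET_RE, strip_offsets and changed_lines are hypothetical names.

```python
import difflib
import re

# Matches a leading hexadecimal offset such as "   4:" plus trailing whitespace.
OFFSET_RE = re.compile(r"^\s*[0-9a-f]+:\s*")


def strip_offsets(lines):
    """Remove the leading offset column from each disassembly line."""
    return [OFFSET_RE.sub("", line) for line in lines]


def changed_lines(a, b):
    """Count the lines that a line-based diff reports as added or removed."""
    return sum(
        1 for line in difflib.ndiff(a, b) if line.startswith(("+ ", "- "))
    )


BEFORE = [
    "   0:\tpush   %rbp",
    "   1:\tmov    %rsp,%rbp",
    "   4:\tret",
]
# The same code with one extra instruction, which shifts every later offset.
AFTER = [
    "   0:\tpush   %rbp",
    "   1:\tnop",
    "   2:\tmov    %rsp,%rbp",
    "   5:\tret",
]
```

Comparing changed_lines(BEFORE, AFTER) with changed_lines(strip_offsets(BEFORE), strip_offsets(AFTER)) shows the effect: with offsets present, every line after the insertion is reported as different, whereas without them only the inserted instruction is.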
strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. It is used automatically in most Debian package builds. In July, Chris Lamb ensured that we did not install the internal handler documentation generated from Perl POD documents […] and fixed a trivial typo […]. Marc Herbert added a --verbose-level warning when the Archive::Cpio Perl module is missing. (!6)
reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes to support diffoscope version 153, which had removed the (deprecated) --no-exclude-directory-metadata command-line arguments, and updated the testing configuration to also test under Python version 3.8 […].
In June 2020, Timo Röhling filed a wishlist bug against the debhelper build tool impacting the reproducibility status of hundreds of packages that use the CMake build system. This month, however, Niels Thykier uploaded debhelper version 13.2 that passes the -DBUILD_RPATH_USE_ORIGIN=ON arguments to CMake when using the (currently-experimental) Debhelper compatibility level 14.
According to Niels, this change:
… should fix some reproducibility issues, but may cause breakage if packages run binaries directly from the build directory.
34 reviews of Debian packages were added, 14 were updated and 20 were removed this month, adding to our knowledge about identified issues. Chris Lamb added and categorised the nondeterministic_order_of_debhelper_snippets_added_by_dh_fortran_mod […] and gem2deb_install_mkmf_log […] toolchain issues.
Bernhard also published the results of performing 12,235 verification builds of packages from openSUSE Leap version 15.2 and, as a result, created three pull requests against the openSUSE Build Result Compare Script […][…][…].
In Arch Linux, there was a mass rebuild of old packages in an attempt to make them reproducible. This was performed because building with a previous release of the pacman package manager caused file ordering and size calculation issues when using the btrfs filesystem.
A system was also implemented for Arch Linux packagers to receive notifications if/when their package becomes unreproducible, and packagers now have access to a dashboard where they can see all their unreproducible packages (more info).
Paul Spooren sent two versions of a patch for the OpenWrt embedded distribution for adding a ‘build system’ revision to the ‘packages’ manifest so that all external feeds can be rebuilt and verified. […][…]
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of these patches, including:
Bernhard M. Wiedemann:
- afl (fix an incorrectly built manual page that varied with kernel boot options)
- dnscrypt-proxy (sort the output of
- graphviz (timezone issue, forwarded from Debian)
- insighttoolkit (prevent CPU detection, forwarded upstream)
- ipopt (parallelism issue and use https://tracker.debian.org/pkg/strip-nondeterminism)
- jboss-logging-tools (date, forwarded upstream)
- lcov (date issue, already upstream)
- multus (date issue, already upstream)
- paperjam (date issue, forwarded upstream)
- python-PyNaCl (sort Python glob/readdir)
- python-enaml (workaround an open upstream Python issue)
- sac (omit creation time from
- sql-parser (sort, already upstream)
- ugrep (CPU-related issue, already upstream)
- unknown-horizons (filesystem ordering issue, already upstream)
- unknown-horizons (filesystem ordering issue)
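Several of the patches above address filesystem ordering, where a build iterates over directory entries in whatever order the filesystem happens to return them. A minimal sketch of the usual remedy in Python is shown below; it is not taken from any of the listed patches, and list_sources is a hypothetical helper name.

```python
import glob
import os


def list_sources(directory, pattern="*.py"):
    """Return matching paths in a stable order.

    glob.glob() and os.listdir() return entries in an arbitrary,
    filesystem-dependent order, so the result is sorted explicitly.
    """
    # glob.escape() guards against special characters in the directory name.
    return sorted(glob.glob(os.path.join(glob.escape(directory), pattern)))
```

Sorting plain Python strings compares code points and does not depend on the locale, so the order is the same on every build machine.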
Vagrant Cascadian also reported two issues: the first regarding a regression in u-boot boot loader reproducibility for a particular target […] and the second a non-deterministic segmentation fault in the guile-ssh test suite […]. Lastly, Jelle van der Waa filed a bug against the MeiliSearch search API to report that it embeds the current build date.
This month, Holger Levsen made the following changes:
- Tweak the rescheduling of various architecture and suite combinations. […][…]
- Fix links for ‘404’ and ‘not for us’ icons. (#959363)
- Further work on a rebuilder prototype, for example correctly processing the sbuild exit code. […][…]
- Update the sudo configuration file to allow the node health job to work correctly. […]
- Move the php-horde packages back to the pkg-php-pear package set for the bullseye distribution. […]
- Update the version of
System health check development:
- Add checks for broken SSH […], pbuilder […], NetBSD […], ‘unkillable’ processes […], unresponsive nodes […][…][…][…], proxy connection failures […], too many installed kernels […], etc.
- Automatically fix some failed
- Add notes explaining all the issues that hosts are experiencing […] and handle zipped job log files correctly […].
- Separate nodes which have been automatically marked as down […] and show status icons for jobs with issues […].
In addition, Mattia Rizzolo updated the init_node script to suggest using sudo instead of explicit logout and logins […][…] and the usual build node maintenance was performed by Holger Levsen […][…][…][…][…][…], Mattia Rizzolo […][…] and Vagrant Cascadian […][…][…][…].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via: