Welcome to the July 2020 report from the Reproducible Builds project.
In these monthly reports, we round up the things that we have been up to over the past month. As a brief refresher, the motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced from the original free software source code to the pre-compiled binaries we install on our systems. (If you’re interested in contributing to the project, please visit our main website.)
General news
At the upcoming DebConf20 conference (now being held online), Holger Levsen will present a talk on Thursday 27th August about “Reproducing Bullseye in practice”, focusing on independently verifying that the binaries distributed from ftp.debian.org were made from their claimed sources.
Tavis Ormandy published a blog post making the provocative claim that “You don’t need reproducible builds”, asserting elsewhere that the many attacks that have been extensively reported in our previous reports are “fantasy threat models”. A number of rebuttals have been made, including one from long-time Reproducible Builds contributor Bernhard Wiedemann.
On our mailing list this month, Debian Developer Graham Inggs asked for ideas why the openorienteering-mapper Debian package was failing to build on the Reproducible Builds testing framework. Chris Lamb remarked from the build logs that the package may be missing a build dependency, although Graham then used our own diffoscope tool to show that the resulting package remains unchanged with or without it. Later, Nico Tyni noticed that the build failure may be due to the relationship between the __FILE__ C preprocessor macro and the -ffile-prefix-map GCC flag.
An issue in Zephyr, a small-footprint kernel designed for use on resource-constrained systems, around .a library files not being reproducible was closed after it was noticed that a key part of their toolchain was updated that now calls --enable-deterministic-archives by default.
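To illustrate what a “deterministic archive” means in general, here is a minimal Python sketch. It uses the standard tarfile module as a stand-in and is not how ar(1) or the Zephyr toolchain work internally; the function name deterministic_tar is hypothetical. The idea is the same as in the deterministic mode of ar: member order is fixed, and timestamps, owners and modes are set to constant values so that the same inputs always yield byte-identical output.

```python
import io
import tarfile


def deterministic_tar(members):
    """Build an uncompressed tar in memory from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(members):  # fixed member order
            data = members[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = 0  # no build-time timestamp
            info.uid = info.gid = 0  # no build-user identity
            info.uname = info.gname = ""
            info.mode = 0o644  # consistent file mode
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

Calling deterministic_tar twice with the same mapping, regardless of insertion order, should return identical bytes.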
Reproducible Builds developer kpcyrd commented on a pull request against the libsodium cryptographic library wrapper for Rust, arguing against the testing of CPU features at compile-time. He noted that:
I’ve accidentally shipped broken updates to users in the past because the build system was feature-tested and the final binary assumed the instructions would be present without further runtime checks
David Kleuker also asked a question on our mailing list about using SOURCE_DATE_EPOCH with the install(1) tool from GNU coreutils. When comparing two installed packages he noticed that the filesystem ‘birth times’ differed between them. Chris Lamb replied, realising that this was actually a consequence of using an outdated version of diffoscope and that a fix was in diffoscope version 146 released in May 2020.
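For readers unfamiliar with it, SOURCE_DATE_EPOCH is an environment variable holding a Unix timestamp that build tools can use in place of the current time. The following is a minimal Python sketch of how a build script might honour it; the helper names build_timestamp and clamp_mtime are hypothetical and are not part of install(1) or diffoscope.

```python
import os
import time


def build_timestamp(environ=os.environ):
    """Return SOURCE_DATE_EPOCH as an integer, falling back to the current time."""
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is None:
        return int(time.time())
    return int(value)


def clamp_mtime(path, timestamp):
    """Lower a file's modification time to `timestamp` if it is newer."""
    if os.stat(path).st_mtime > timestamp:
        # os.utime() sets only the access and modification times.
        os.utime(path, (timestamp, timestamp))
```

Note that this sketch only adjusts access and modification times, which is all that os.utime changes.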
Later in July, John Scott posted asking for clarification regarding the JavaScript files on our website in order to add metadata for LibreJS, the browser extension that blocks non-free JavaScript from executing. Chris Lamb investigated the issue and realised that we could drop a number of unused JavaScript files […][…][…] and added unminified versions of Bootstrap and jQuery […].
Development work
Website
On our website this month, Chris Lamb updated the main Reproducible Builds website and documentation to drop a number of unused JavaScript files […][…][…] and add unminified versions of Bootstrap and jQuery […]. He also fixed a number of broken URLs […][…].
Gonzalo Bulnes Guilpain made a large number of grammatical improvements […][…][…][…][…] as well as some spelling, case and whitespace fixes […][…][…].
Lastly, Holger Levsen updated the README file […], marked the Alpine Linux continuous integration tests as currently disabled […] and linked the Arch Linux Reproducible Status page from our projects page […].
diffoscope
diffoscope is our in-depth and content-aware diff utility that can not only locate and diagnose reproducibility issues but also provide human-readable diffs of all kinds. In July, Chris Lamb made the following changes to diffoscope, including releasing versions 150, 151, 152, 153 & 154:
- New features:
  - Add support for flash-optimised F2FS filesystems. (#207)
  - Don’t require zipnote(1) to determine differences in a .zip file as we can use libarchive. […]
  - Allow --profile as a synonym for --profile=-, i.e. write profiling data to standard output. […]
  - Increase the minimum length of the output of strings(1) to eight characters to avoid unnecessary diff noise. […]
  - Drop some legacy argument styles: --exclude-directory-metadata and --no-exclude-directory-metadata have been replaced with --exclude-directory-metadata={yes,no}. […]
- Bug fixes:
  - Pass the absolute path when extracting members from SquashFS images as we run the command with working directory in a temporary directory. (#189)
  - Correct adding a comment when we cannot extract a filesystem due to missing libguestfs module. […]
  - Don’t crash when listing entries in archives if they don’t have a listed size such as hardlinks in ISO images. (#188)
- Output improvements:
  - Strip off the file offset prefix from xxd(1) and show bytes in groups of 4. […]
  - Don’t emit javap not found in path if it is available in the path but it did not result in an actual difference. […]
  - Fix ... not available in path messages when looking for Java decompilers that used the Python class name instead of the command. […]
- Logging improvements:
  - Add a bit more debugging info when launching libguestfs. […]
  - Reduce the --debug log noise by truncating the has_some_content messages. […]
  - Fix the compare_files log message when the file does not have a literal name. […]
- Codebase improvements:
  - Rewrite and rename exit_if_paths_do_not_exist to not check files multiple times. […][…]
  - Add an add_comment helper method; don’t mess with our internal list directly. […]
  - Replace some simple usages of str.format with Python ‘f-strings’ […] and make it easier to navigate to the main.py entry point […].
  - In the RData comparator, always explicitly return None in the failure case as we return a non-None value in the success one. […]
  - Tidy some imports […][…][…] and don’t alias a variable when we do not use it. […]
  - Clarify the use of a separate NullChanges quasi-file to represent missing data in the Debian package comparator […] and clarify use of a ‘null’ diff in order to remember an exit code. […]
- Other changes:
Jean-Romain Garnier also made the following changes:
- Allow passing a file with a list of arguments via diffoscope @args.txt. (!62)
- Improve the output of side-by-side diffs by detecting added lines better. (!64)
- Remove offsets before instructions in objdump […][…] and remove raw instructions from ELF tests […].
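Several of the command-line changes listed above (a bare --profile meaning --profile=-, the single --exclude-directory-metadata={yes,no} option and reading arguments from @args.txt) map onto standard features of Python’s argparse module. The following is a minimal sketch using a hypothetical parser named example-diff; it is not diffoscope’s actual source code and makes no claim about how diffoscope implements these options.

```python
import argparse
import os
import tempfile

# Hypothetical parser for illustration only.
parser = argparse.ArgumentParser(
    prog="example-diff",
    fromfile_prefix_chars="@",  # "@args.txt" expands to the arguments in that file
)
# A bare --profile behaves like --profile=- (i.e. standard output).
parser.add_argument("--profile", nargs="?", const="-", default=None, metavar="OUTPUT")
# One option with an explicit value replaces a --flag / --no-flag pair.
parser.add_argument("--exclude-directory-metadata", choices=["yes", "no"], default="no")
parser.add_argument("paths", nargs="*")

with tempfile.TemporaryDirectory() as tmp:
    args_file = os.path.join(tmp, "args.txt")
    # By default argparse expects one argument per line in such a file.
    with open(args_file, "w") as f:
        f.write("--profile\n--exclude-directory-metadata=yes\na.deb\nb.deb\n")
    args = parser.parse_args(["@" + args_file])

print(args.profile, args.exclude_directory_metadata, args.paths)
```

Here the arguments are read from the file, the bare --profile is stored as "-" and the two remaining entries are collected as paths.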
Other tools
strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. It is used automatically in most Debian package builds. In July, Chris Lamb ensured that we did not install the internal handler documentation generated from Perl POD documents […] and fixed a trivial typo […]. Marc Herbert added a --verbose-level warning when the Archive::Cpio Perl module is missing. (!6)
reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes to support diffoscope version 153 which had removed the (deprecated) --exclude-directory-metadata and --no-exclude-directory-metadata command-line arguments, and updated the testing configuration to also test under Python version 3.8 […].
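The “build twice and compare” idea described above can be sketched in a few lines of Python. This is not reprotest’s implementation; the function names sha256_of and differing_artifacts are hypothetical, and the sketch assumes that the two builds have already been run and have left their outputs in two separate directories.

```python
import hashlib
import os


def sha256_of(path):
    """Hash a file in chunks so that large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_artifacts(first_dir, second_dir):
    """Return sorted relative paths that differ or exist in only one tree."""

    def index(root):
        found = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                found[os.path.relpath(full, root)] = sha256_of(full)
        return found

    first, second = index(first_dir), index(second_dir)
    return sorted(p for p in first.keys() | second.keys() if first.get(p) != second.get(p))
```

An empty result suggests the two builds produced identical files; any path that is returned is a candidate for closer inspection with a tool such as diffoscope.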
Distributions
Debian
In June 2020, Timo Röhling filed a wishlist bug against the debhelper build tool impacting the reproducibility status of hundreds of packages that use the CMake build system. This month, however, Niels Thykier uploaded debhelper version 13.2 that passes the -DCMAKE_SKIP_RPATH=ON and -DBUILD_RPATH_USE_ORIGIN=ON arguments to CMake when using the (currently-experimental) Debhelper compatibility level 14.
According to Niels, this change:
… should fix some reproducibility issues, but may cause breakage if packages run binaries directly from the build directory.
34 reviews of Debian packages were added, 14 were updated and 20 were removed this month, adding to our knowledge about identified issues. Chris Lamb added and categorised the nondeterministic_order_of_debhelper_snippets_added_by_dh_fortran_mod […] and gem2deb_install_mkmf_log […] toolchain issues.
Lastly, Holger Levsen filed two more wishlist bugs against the debrebuild Debian package rebuilder tool […][…].
openSUSE
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.
Bernhard also published the results of performing 12,235 verification builds of packages from openSUSE Leap version 15.2 and, as a result, created three pull requests against the openSUSE Build Result Compare Script […][…][…].
Other distributions
In Arch Linux, there was a mass rebuild of old packages in an attempt to make them reproducible. This was performed because building with a previous release of the pacman package manager caused file ordering and size calculation issues when using the btrfs filesystem.
A system was also implemented for Arch Linux packagers to receive notifications if/when their package becomes unreproducible, and packagers now have access to a dashboard where they can see all of their unreproducible packages (more info).
Paul Spooren sent two versions of a patch for the OpenWrt embedded distribution for adding a ‘build system’ revision to the ‘packages’ manifest so that all external feeds can be rebuilt and verified. […][…]
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of these patches, including:
- Bernhard M. Wiedemann:
  - afl (fix an incorrectly built manual page varied from kernel boot options)
  - brp-check-suse (sorting issue)
  - dnscrypt-proxy (sort the output of find(1))
  - graphviz (timezone issue, forwarded from Debian)
  - guile-gcrypt (parallelism)
  - insighttoolkit (prevent CPU detection, forwarded upstream)
  - ipopt (parallelism issue and use https://tracker.debian.org/pkg/strip-nondeterminism)
  - jboss-logging-tools (date, forwarded upstream)
  - kismet (date)
  - lcov (date issue, already upstream)
  - multus (date issue, already upstream)
  - multus (date)
  - paperjam (date issue, forwarded upstream)
  - pspp (scrub testsuite.log)
  - python-PyNaCl (sort Python glob/readdir)
  - python-enaml (workaround an open upstream Python issue)
  - sac (omit creation time from .zip files)
  - sql-parser (sort, already upstream)
  - ugrep (CPU-related issue, already upstream)
  - ugrep (CPU-related issue)
  - unknown-horizons (filesystem ordering issue, already upstream)
  - unknown-horizons (filesystem ordering issue)
  - xfce4-panel-profiles (POSIX.1-2001/pax headers)
  - yast2-sound (uses uname -r)
- Chris Lamb:
- Guillaume Nodet:
  - apache-sshd (date)
Vagrant Cascadian also reported two issues, the first regarding a regression in u-boot boot loader reproducibility for a particular target […] and the second regarding a non-deterministic segmentation fault in the guile-ssh test suite […]. Lastly, Jelle van der Waa filed a bug against the MeiliSearch search API to report that it embeds the current build date.
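A number of the patches listed in this section address sorting or filesystem ordering issues. As a generic illustration of that class of fix, and not a copy of any of the patches above, the following Python sketch sorts a directory listing before it is used; the helper name stable_listing is hypothetical.

```python
import glob
import os


def stable_listing(directory, pattern="*"):
    """Return matching paths in sorted order, independent of the filesystem."""
    # glob.glob() and os.listdir() return entries in arbitrary order, so any
    # output generated by iterating over them can vary between builds.
    return sorted(glob.glob(os.path.join(glob.escape(directory), pattern)))
```

Using the sorted result wherever the order ends up in a build artifact (for example, a generated file list or an archive) removes this source of variation.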
Testing framework
We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org.
This month, Holger Levsen made the following changes:
- Debian-related changes:
  - Tweak the rescheduling of various architecture and suite combinations. […][…]
  - Fix links for ‘404’ and ‘not for us’ icons. (#959363)
  - Further work on a rebuilder prototype, for example correctly processing the sbuild exit code. […][…]
  - Update the sudo configuration file to allow the node health job to work correctly. […]
  - Add php-horde packages back to the pkg-php-pear package set for the bullseye distribution. […]
  - Update the version of debrebuild. […]
- System health check development:
  - Add checks for broken SSH […], logrotate […], pbuilder […], NetBSD […], ‘unkillable’ processes […], unresponsive nodes […][…][…][…], proxy connection failures […], too many installed kernels […], etc.
  - Automatically fix some failed systemd units. […]
  - Add notes explaining all the issues that hosts are experiencing […] and handle zipped job log files correctly […].
  - Separate nodes which have been automatically marked as down […] and show status icons for jobs with issues […].
- Misc:
In addition, Mattia Rizzolo updated the init_node script to suggest using sudo instead of explicit logout and logins […][…] and the usual build node maintenance was performed by Holger Levsen […][…][…][…][…][…], Mattia Rizzolo […][…] and Vagrant Cascadian […][…][…][…].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter: @ReproBuilds
- Mastodon: @reproducible_builds@fosstodon.org
- Reddit: /r/ReproducibleBuilds
- Mailing list: rb-general@lists.reproducible-builds.org