Welcome to the August 2024 report from the Reproducible Builds project!
Our reports attempt to outline what we’ve been up to over the past month, highlighting news items from elsewhere in tech where they are related. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.
Table of contents:
- LWN: The history, status, and plans for reproducible builds
- Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs
- Distribution news
- Mailing list news
- diffoscope
- Website updates
- Upstream patches
- Reproducibility testing framework
LWN: The history, status, and plans for reproducible builds
The free software newspaper of record, Linux Weekly News, published an in-depth article based on Holger Levsen’s talk, Reproducible Builds: The First Eleven Years, which was presented at the recent DebConf24 conference in Busan, South Korea.
Titled The history, status, and plans for reproducible builds and written by Jake Edge, LWN’s article not only summarises Holger’s talk and clarifies its message but also links to external information. Holger’s original talk can be watched on the DebConf24 webpage (a direct .webm link and his HTML slides are also available). There are a significant number of comments on LWN’s page as well.
Holger Levsen also headed a scheduled discussion session at DebConf24 on Preserving *other* build artifacts, addressing a topic where a number of Debian packages produce (or would like to produce) results that are neither the .deb files, the build logs nor the logs of CI tests. This is an issue for reproducible builds as this “4th type” of build artifact is typically shipped within the binary .deb packages and is invariably non-deterministic, thus making the .deb files unreproducible. (A direct .webm link and HTML slides are available.)
Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs
Peter Eisentraut wrote a detailed blog post on the subject of “The new PostgreSQL 17 make dist”. Like many projects, the PostgreSQL database has previously pre-built parts of its GNU Autotools build system: “the reason for this is a mix of convenience and traditional practice”. Peter astutely notes that this arrangement in the build system is “quite tricky” as:
You need to carefully maintain the different states of “clean source code”, “partially built source code”, and “fully built source code”, and the commands to transition between them.
However, Peter goes on to mention that:
… a lot more attention is nowadays paid to the software supply chain. There are security and legal reasons for this. When users install software, they want to know where it came from, and they want to be sure that they got the right thing, not some fake version or some version of dubious legal provenance.
Peter cites the XZ Utils backdoor as a reason to care about transparent and reproducible ways of distributing and communicating a source tarball and its provenance. Because of this, intermediate build artifacts are now essentially disallowed from PostgreSQL distribution tarballs.
Distribution news
In Debian this month, 30 reviews of Debian packages were added, 17 were updated and 10 were removed, adding to our knowledge about identified issues. One issue type was added by Chris Lamb, too. […]
In addition, an issue was filed to update the Salsa CI pipeline (used by 1,000s of Debian packages) to no longer test for reproducibility with reprotest’s build_path variation. Holger Levsen provided a rationale for this change in the issue, which has already been made to the tests being performed by tests.reproducible-builds.org.
In Arch Linux this month, Jelle van der Waa published a short blog post on the topic of Investigating creating reproducible images with mkosi, motivated by the desire to make it possible for anyone to “re-recreate the official Arch cloud image bit-by-bit identical on their own machine as per [the] reproducible builds definition.” In addition, Jelle filed a patch for pacman, the Arch Linux package manager, to respect the SOURCE_DATE_EPOCH environment variable when installing a package.
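As a rough illustration of what respecting SOURCE_DATE_EPOCH means for a tool, the following minimal Python sketch uses the variable in place of the current time when it is set. It is not pacman’s implementation (pacman is written in C); the function names `reference_timestamp` and `format_utc` are hypothetical and exist only for this example.

```python
import os
import time


def reference_timestamp(environ=None):
    """Return the timestamp a tool should record.

    If SOURCE_DATE_EPOCH is set, it is interpreted as an integer number of
    seconds since the Unix epoch and used instead of the current time.
    """
    env = os.environ if environ is None else environ
    value = env.get("SOURCE_DATE_EPOCH")
    if value is None:
        # No override requested: fall back to the (non-deterministic) clock.
        return int(time.time())
    # A malformed value raises ValueError rather than being silently ignored.
    return int(value)


def format_utc(timestamp):
    """Format a timestamp in UTC so the result does not depend on the local timezone."""
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(timestamp))
```

Passing the environment as a parameter keeps the sketch easy to test; a real tool would normally read `os.environ` directly.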
In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.
In Android news, the IzzyOnDroid project added 49 new rebuilder recipes and now features 256 total reproducible applications representing 21% of the total offerings in the repository. IzzyOnDroid is “an F-Droid style repository for Android apps[:] applications in this repository are official binaries built by the original application developers, taken from their resp. repositories (mostly GitHub).”
Mailing list news
From our mailing list this month:
- Bernhard M. Wiedemann posted a brief message to the list with some helpful information regarding nondeterminism within Rust binaries, suggesting the codegen-units = 16 default as a cause, which resulted in a bug being filed in the Rust issue tracker. […]
- Bernhard also wrote to the list, following up to a thread in November 2023, on attempts to make the LibreOffice suite of office applications build reproducibly. In the thread from this month, Bernhard announced that the four patches previously mentioned have landed in LibreOffice upstream.
- Fay Stegerman linked the mailing list to a thread she made on the Signal issue tracker regarding whether “device-specific binaries [can] ever be considered meaningfully reproducible”. In particular: “the whole part about ‘allow[ing] multiple third parties to come to a consensus on a “correct” result’ breaks down completely when ‘correct’ is device-specific and not something everyone can agree on.” […]
- Developer kpcyrd posted an update for the source code indexing project, whatsrc.org, announcing that it is now importing packages from live-bootstrap (“a usable Linux system [that is] created with only human-auditable, and wherever possible, human-written, source code”) into its database of provenance data.
- Lastly, Mechtilde Stehmann posted an update to an earlier thread about how Java builds are not reproducible on the armhf architecture, enquiring how they might gain temporary access to such a machine in order to perform some deeper testing. […]
diffoscope
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb released versions 274, 275, 276 and 277, uploaded these to Debian, and made the following changes as well:
- New features:
  - Strip ANSI escapes (usually colour codes) from the output of the Procyon Java decompiler. […]
  - Factor out a method for stripping ANSI escapes. […]
  - Append output from dumppdf(1) in more cases, avoiding situations where we fall back to a binary diff. […]
  - Add support for Perl’s IO::Compress::Zip version 2.212. […]
- Bug fixes:
  - Also catch RuntimeError exceptions when importing the PyPDF library so that it, or, crucially, its transitive dependencies, cannot cause diffoscope to traceback at runtime and build time. […]
  - Do not call marshal.load(…) of precompiled Python bytecode as it is, alas, inherently unsafe. Replace for now with a brief summary of the code section of .pyc. […][…]
  - Don’t include excessive debug output when calling dumppdf(1). […]
- Testsuite-related changes:
  - Don’t bother to check version number in test_python.py: the fixture for this test is fixed. […][…]
  - Update test_zip text fixtures and definitions to support new changes to the Perl IO::Compress library. […]
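To give a feel for the ANSI-stripping feature listed above, here is a minimal Python sketch of removing colour codes from a tool’s output before comparing it. This is an assumption-laden illustration, not diffoscope’s actual code: the name `strip_ansi` is hypothetical, and the pattern only covers CSI sequences (those starting with ESC followed by `[`), which is the form colour codes take.

```python
import re

# CSI sequence: ESC '[', optional parameter bytes (0x30-0x3F),
# optional intermediate bytes (0x20-0x2F), then one final byte (0x40-0x7E).
ANSI_CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Return text with ANSI CSI escape sequences (e.g. colour codes) removed."""
    return ANSI_CSI_RE.sub("", text)
```

Stripping such sequences first means that two otherwise identical decompiler outputs are not reported as different merely because of terminal colouring.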
In addition, Mattia Rizzolo updated the available architectures for a number of test dependencies […], Sergei Trofimovich fixed an issue to avoid diffoscope crashing when hashing directory symlinks […], and Vagrant Cascadian proposed GNU Guix updates for diffoscope versions 275, 276 and 277.
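The RuntimeError bug fix mentioned in the list of changes above reflects a general pattern for optional dependencies: treat a library as unavailable if importing it fails for any expected reason, rather than letting the exception propagate. The following is a minimal sketch of that pattern, assuming a hypothetical helper called `optional_import`; it does not reproduce diffoscope’s own code.

```python
import importlib


def optional_import(module_name):
    """Import a module by name, returning None if it cannot be loaded.

    ImportError covers a missing module; RuntimeError is also caught because
    a library (or one of its transitive dependencies) may raise it while it
    is being imported.
    """
    try:
        return importlib.import_module(module_name)
    except (ImportError, RuntimeError):
        return None
```

A caller can then check the result for None and fall back to a simpler comparison when the optional library is unusable.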
Website updates
There were a rather substantial number of improvements made to our website this month, including:
- Alba Herrerias:
  - Substantially extend the guidance on the Contribute page. […]
- Chris Lamb:
- Fay Stegerman:
  - Add IzzyOnDroid (IoD) to the Projects page. […]
- hulkoba:
  - Considerably overhaul the History page in the documentation, linking strip-nondeterminism and SOURCE_DATE_EPOCH […], fixing the test statistics link […], adjusting the Google Summer of Code application link […] and a link to a Debian bug […], and removing a dead link to the debhelper utility […].
  - Use the jekyll-sitemap plugin to create a sitemap for the website. […]
  - Use raw HTML to avoid a literal { .lead } directive appearing in the page. […]
  - Fix a number of issues on the Virtual machine drivers page, such as keeping the Gitian info, linking (and then removing) an issue on the Bitcoin issue tracker […] and fixing a link to the Bazel website […].
  - Address a broken footnote link on the Timestamps page. […]
  - Unify the style on the Commandments of Reproducible Builds page in order to match other documentation entries. […]
  - Add a table of contents to the main Documentation page. […]
  - Avoid a number of so-called “here” links on the Variations in the build environment page. […]
  - Fix a link to the man2html patch on the SOURCE_DATE_EPOCH documentation page. […]
  - Fix a link to sources.debian.org on the Randomness page. […]
- kpcyrd:
  - Fix a typo on the Variations in the build environment page. […]
- Mattia Rizzolo:
- Pol Dellaiera:
  - Fix the DoI for their thesis on the Publications page. […]
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
- Bernhard M. Wiedemann:
  - agama-integration-tests (uses a random TCP-port number in .lock file)
  - ca-certificates-mozilla:ca-certificates-mozilla-prebuilt
  - cosmic (hash order issue)
  - openSUSE (meta-issue to test reproducibility in the openSUSE Build Service)
  - pop-launcher (parallelism-related issue)
  - post (toolchain-issue, avoiding Rust parallelism)
  - rpm-config-SUSE (date-related issue)
  - rust (Rust toolchain issue)
  - weblate (build gets stuck)
- Chris Lamb:
- James Addison:
  - #1064782 forwarded and merged in bind9-doc
  - #1066083 forwarded and merged in gnome-maps
Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In August, a number of changes were made by Holger Levsen, including:
- Temporarily install the openssl-provider-legacy package for the Debian unstable environments for running diffoscope due to Debian bug #1078944. […][…][…][…]
- Mark Debian armhf architecture nodes as being down because the proxy is down. […][…]
- Detect proxy failures. […][…][…]
- Run the index-buildinfo for the builtin-pho script with the -q switch. […]
- Disable all Arch Linux reproducible jobs. […]
In addition, Mattia Rizzolo updated the website configuration to install the ruby-jekyll-sitemap package as it is now used in the website […], Roland Clobus updated the script to build Debian ‘live’ images to treat openQA issues as warnings […], and Vagrant Cascadian marked the cbxi4b node as down […].
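At its simplest, checking an artifact for reproducibility means building it twice and confirming that the two results are bit-for-bit identical. The sketch below shows only that final comparison step in Python, using SHA-256 digests. It is a simplified illustration and not the code used by the testing framework; the function names `sha256_of` and `is_reproducible` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproducible(first_artifact, second_artifact):
    """Return True if two independently built artifacts are bit-for-bit identical."""
    return sha256_of(first_artifact) == sha256_of(second_artifact)
```

When the digests differ, a tool such as diffoscope can then be used to locate where the two artifacts diverge.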
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Mastodon: @reproducible_builds@fosstodon.org
- Mailing list: rb-general@lists.reproducible-builds.org
- Twitter: @ReproBuilds