Reproducible Builds in August 2024

View all our monthly reports


Welcome to the August 2024 report from the Reproducible Builds project!

Our reports attempt to outline what we’ve been up to over the past month, highlighting news items from elsewhere in tech where they are related. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Table of contents:

  1. LWN: The history, status, and plans for reproducible builds
  2. Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs
  3. Distribution news
  4. Mailing list news
  5. diffoscope
  6. Website updates
  7. Upstream patches
  8. Reproducibility testing framework

LWN: The history, status, and plans for reproducible builds

The free software newspaper of record, Linux Weekly News, published an in-depth article based on Holger Levsen’s talk, Reproducible Builds: The First Eleven Years which was presented at the recent DebConf24 conference in Busan, South Korea.

Titled The history, status, and plans for reproducible builds and written by Jake Edge, LWN’s article not only summarises Holger’s talk and clarifies its message but it links to external information as well. Holger’s original talk can also be watched on the DebConf24 webpage (direct .webm link and his HTML slides are available also). There are also a significant number of comments on LWN’s page as well.

Holger Levsen also headed a scheduled discussion session at DebConf24 on Preserving *other* build artifacts addressing a topic where a number of Debian packages are (or would like to) produce results that are neither the .deb files, the build logs nor the logs of CI tests. This is an issue for reproducible builds as this “4th type” of build artifact are typically shipped within the binary .deb packages, and are invariably non-deterministic; thus making the .deb files unreproducible. (A direct .webm link and HTML slides are available).


Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs

Peter Eisentraut wrote a detailed blog post on the subject of “The new PostgreSQL 17 make dist”. Like many projects, the PostgreSQL database has previously pre-built parts of its GNU Autotools build system: “the reason for this is a mix of convenience and traditional practice”. Peter astutely notes that this arrangement in the build system is “quite tricky” as:

You need to carefully maintain the different states of “clean source code”, “partially built source code”, and “fully built source code”, and the commands to transition between them.

However, Peter goes on to mention that:

… a lot more attention is nowadays paid to the software supply chain. There are security and legal reasons for this. When users install software, they want to know where it came from, and they want to be sure that they got the right thing, not some fake version or some version of dubious legal provenance.

And cites the XZ Utils backdoor as a reason to care about transparent and reproducible ways of distributing and communicating a source tarball and provenance. Because of this, intermediate build artifacts are now henceforth essentially disallowed from PostgreSQL distribution tarballs.

Distribution news

In Debian this month, 30 reviews of Debian packages were added, 17 were updated and 10 were removed this month adding to our knowledge about identified issues. One issue type was added by Chris Lamb, too. []

In addition, an issue was filed to update the Salsa CI pipeline (used by 1,000s of Debian packages) to no longer test for reproducibility with reprotest’s build_path variation. Holger Levsen provided a rationale for this change in the issue, which has already been made to the tests being performed by tests.reproducible-builds.org.


In Arch Linux this month, Jelle van der Waa published a short blog post on the topic of Investigating creating reproducible images with mkosi, motivated by the desire to make it possible for anyone to “re-recreate the official Arch cloud image bit-by-bit identical on their own machine as per [the] reproducible builds definition.” In addition, Jelle filed a patch for pacman, the Arch Linux package manager, to respect the SOURCE_DATE_EPOCH environment variable when installing a package.


In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.


In Android news, the IzzyOnDroid project added 49 new rebuilder recipes and now features 256 total reproducible applications representing 21% of the total offerings in the repository. IzzyOnDroid is “an F-Droid style repository for Android apps[:] applications in this repository are official binaries built by the original application developers, taken from their resp. repositories (mostly GitHub).”


Mailing list news

From our mailing list this month:

  • Bernhard M. Wiedemann posted a brief message to the list with some helpful information regarding nondeterminism within Rust binaries, positing the use of the codegen-units = 16 default and resulting in a bug being filed in the Rust issue tracker. []

  • Bernhard also wrote to the list, following up to a thread in November 2023, on attempts to make the LibreOffice suite of office applications build reproducibly. In the thread from this month, Bernhard could announce that the four patches previously mentioned have landed in LibreOffice upstream.

  • Fay Stegerman linked the mailing list to a thread she made on the Signal issue tracker regarding whether “device-specific binaries [can] ever be considered meaningfully reproducible”. In particular: “the whole part about ‘allow[ing] multiple third parties to come to a consensus on a “correct” result’ breaks down completely when ‘correct’ is device-specific and not something everyone can agree on.” []

  • Developer kpcyrd posted an update for source code indexing project, whatsrc.org. Announcing that it now importing packages from live-bootstrap (“a usable Linux system [that is] created with only human-auditable, and wherever possible, human-written, source code”) into its database of provenance data.

  • Lastly, Mechtilde Stehmann posted an update to an earlier thread about how Java builds are not reproducible on the armhf architecture, enquiring how they might gain temporary access to such a machine in order to perform some deeper testing. []


diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb released versions 274, 275, 276 and 277, uploaded these to Debian, and made the following changes as well:

  • New features:

    • Strip ANSI escapes—usually colour codes—from the output of the Procyon Java decompiler. []
    • Factor out a method for stripping ANSI escapes. []
    • Append output from dumppdf(1) in more cases, avoiding situations where we fallback to a binary diff. []
    • Add support for versions of Perl’s IO::Compress::Zip version 2.212. []
  • Bug fixes:

    • Also catch RuntimeError exceptions when importing the PyPDF library so that it, or, crucially, its transitive dependencies, cannot not cause diffoscope to traceback at runtime and build time. []
    • Do not call marshal.load(…) of precompiled Python bytecode as it, alas, inherently unsafe. Replace for now with a brief summary of the code section of .pyc. [][]
    • Don’t include excessive debug output when calling dumppdf(1). []
  • Testsuite-related changes:

    • Don’t bother to check version number in test_python.py: the fixture for this test is fixed. [][]
    • Update test_zip text fixtures and definitions to support new changes to the Perl IO::Compress library. []

In addition, Mattia Rizzolo updated the available architectures for a number of test dependencies [] and Sergei Trofimovich fixed an issue to avoid diffoscope crashing when hashing directory symlinks [] and Vagrant Cascadian proposed GNU Guix updates for diffoscope versions 275 and 276 and 277.


Website updates

There were a rather substantial number of improvements made to our website this month, including:


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:


Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In August, a number of changes were made by Holger Levsen, including:

  • Temporarily install the openssl-provider-legacy package for the Debian unstable environments for running diffoscope due to Debian bug #1078944. [][][][]
  • Mark Debian armhf architecture nodes as being down due to proxy down. [][]
  • Detect proxy failures. [][][]
  • Run the index-buildinfo for the builtin-pho script with the -q switch. []
  • Disable all Arch Linux reproducible jobs. []

In addition, Mattia Rizzolo updated the website configuration to install the ruby-jekyll-sitemap package as it is now used in the website [], Roland Clobus updated the script to build Debian ‘live’ images to treat openQA issues as warnings [], and Vagrant Cascadian marked the cbxi4b node as down [].



If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:




View all our monthly reports