Reproducible Builds in November 2021

View all our monthly reports


Welcome to the November 2021 report from the Reproducible Builds project.

As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is therefore to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. If you are interested in contributing to our project, please visit our Contribute page on our website.
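As a rough illustration of the consensus idea described above, the following sketch compares SHA-256 digests of the same artifact as produced by several independent rebuilders. The function names and rebuilder labels are hypothetical and are not part of any Reproducible Builds tool; a real setup would also need to agree on which source and build instructions were used.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def check_consensus(artifacts: dict[str, bytes]) -> tuple[bool, dict[str, str]]:
    """Compare artifacts built independently from the same source.

    `artifacts` maps a (hypothetical) rebuilder name to the bytes it produced.
    Returns whether all digests agree, plus the digest per rebuilder.
    """
    digests = {name: sha256_digest(data) for name, data in artifacts.items()}
    return len(set(digests.values())) == 1, digests


def outliers(digests: dict[str, str]) -> list[str]:
    """List rebuilders whose digest differs from the most common one."""
    if not digests:
        return []
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, digest in digests.items() if digest != majority)
```

If every rebuilder reports the same digest, the build is at least consistent across those parties; a lone outlier points at either a non-deterministic build step or a problem on that rebuilder.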


On November 6th, Vagrant Cascadian presented at this year’s edition of the SeaGL conference, giving a talk titled Debugging Reproducible Builds One Day at a Time:

I’ll explore how I go about identifying issues to work on, learn more about the specific issues, recreate the problem locally, isolate the potential causes, dissect the problem into identifiable parts, and adapt the packaging and/or source code to fix the issues.

A video recording of the talk is available on archive.org.


Fedora Magazine published a post written by Zbigniew Jędrzejewski-Szmek about how to Use Diffoscope in packager workflows, specifically around ensuring that new versions of a package do not introduce breaking changes:

In the role of a packager, updating packages is a recurring task. For some projects, a packager is involved in upstream maintenance, or well written release notes make it easy to figure out what changed between the releases. This isn’t always the case, for instance with some small project maintained by one or two people somewhere on GitHub, and it can be useful to verify what exactly changed. Diffoscope can help determine the changes between package releases. []


kpcyrd announced the release of rebuilderd version 0.16.3 on our mailing list this month, adding support for builds to generate multiple artifacts at once.


Lastly, we held another IRC meeting on November 30th. As mentioned in previous reports, due to the global events throughout 2020 etc. there will be no in-person summit event this year.


diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes, including preparing and uploading versions 190, 191, 192, 193 and 194 to Debian:

  • New features:

    • Continue loading a .changes file even if the referenced files do not exist, but include a comment in the returned diff. []
    • Log the reason if we cannot load a Debian .changes file. []
  • Bug fixes:

    • Detect XML files as XML files if file(1) claims that they are XML files or if they are named .xml. (#999438)
    • Don’t duplicate file lists at each directory level. (#989192)
    • Don’t raise a traceback when comparing nested directories with non-directories. []
    • Re-enable test_android_manifest. []
    • Don’t reject Debian .changes files if they contain non-printable characters. []
  • Codebase improvements:

    • Avoid aliasing variables if we aren’t going to use them. []
    • Use isinstance over type. []
    • Drop a number of unused imports. []
    • Update a bunch of %-style string interpolations into f-strings or str.format. []
    • When pretty-printing JSON, mark the difference as being reformatted, additionally avoiding including the full path. []
    • Import itertools top-level module directly. []
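Two of the codebase improvements listed above (preferring isinstance over type, and replacing %-style interpolation with f-strings) follow a common Python pattern. The sketch below is illustrative only: the classes and functions are hypothetical stand-ins and are not taken from the diffoscope source.

```python
class Difference:
    """Stand-in for a comparison result (hypothetical, not diffoscope's class)."""

    def __init__(self, source: str, lines: int) -> None:
        self.source = source
        self.lines = lines


class ReformattedDifference(Difference):
    """Subclass used to show why isinstance() is usually preferable."""


def describe_old(diff) -> str:
    # An exact type() comparison rejects subclasses, and %-interpolation
    # separates the values from the text they belong to.
    if type(diff) == Difference:
        return "%s: %d differing lines" % (diff.source, diff.lines)
    return "unknown"


def describe_new(diff) -> str:
    # isinstance() also accepts subclasses; the f-string keeps each value
    # next to the text it belongs to.
    if isinstance(diff, Difference):
        return f"{diff.source}: {diff.lines} differing lines"
    return "unknown"
```

With a ReformattedDifference instance, describe_old falls through to "unknown" whereas describe_new formats it as expected, which is the kind of subtle behaviour change such a cleanup can bring.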

Chris Lamb also made an update to the command-line client to trydiffoscope, a web-based version of the diffoscope in-depth and content-aware diff utility, specifically only waiting for 2 minutes for try.diffoscope.org to respond in tests. (#998360)

In addition, Brandon Maier corrected an issue where parts of large diffs were missing from the output [], Zbigniew Jędrzejewski-Szmek fixed some logic in the assert_diff_startswith method [] and Mattia Rizzolo updated the packaging metadata to denote that we support both Python 3.9 and 3.10 [], as well as making a number of warning-related changes [][]. Vagrant Cascadian also updated the diffoscope package in GNU Guix [][].


Distribution work

In Debian, Roland Clobus updated the wiki page documenting Debian reproducible ‘Live’ images to mention some new bug reports and also posted an in-depth status update to our mailing list.

In addition, 90 reviews of Debian packages were added, 18 were updated and 23 were removed this month, adding to our knowledge about identified issues. Chris Lamb identified a new toolchain issue, absolute_path_in_cmake_file_generated_by_meson.


Work has begun on classifying reproducibility issues in packages within the Arch Linux distribution. Similar to the analogous effort within Debian (outlined above), package information is listed in a human-readable packages.yml YAML file and a sibling README.md file shows how to classify packages too.

Finally, Bernhard M. Wiedemann posted his monthly reproducible builds status report for openSUSE and Vagrant Cascadian updated a link on our website to point to the GNU Guix reproducibility testing overview [].


Software development

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches.

Elsewhere in software development, Jonas Witschel updated strip-nondeterminism, our tool to remove specific non-deterministic results from a completed build, so that it does not fail on JAR archives containing invalid members with a .jar extension []. This change was later uploaded to Debian by Chris Lamb.
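To give a sense of what removing non-deterministic results from a completed build can mean in practice, here is a minimal sketch that rewrites a ZIP or JAR archive so that every member carries the same fixed timestamp. It is an assumption-laden illustration of the general idea only: strip-nondeterminism is a separate tool with its own implementation, and the normalize_zip function and FIXED_DATE constant below are hypothetical names.

```python
import io
import zipfile

# Earliest timestamp the ZIP format can store; any fixed value would do.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def normalize_zip(data: bytes) -> bytes:
    """Return a copy of a ZIP/JAR archive with all member timestamps fixed."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            # Build fresh metadata so the original modification time is dropped.
            clean = zipfile.ZipInfo(info.filename, date_time=FIXED_DATE)
            clean.compress_type = info.compress_type
            clean.external_attr = info.external_attr  # keep permission bits
            dst.writestr(clean, src.read(info))
    return out.getvalue()
```

Two archives that differ only in their members' modification times should become byte-for-byte identical after this step, so any remaining difference is a real content difference.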

reprotest is the Reproducible Builds project’s end-user tool to build the same source code twice in widely different environments and then check whether the binaries produced by the builds have any differences. This month, Mattia Rizzolo overhauled the Debian packaging [][][] and fixed a bug surrounding suffixes in the Debian package version [], whilst Stefano Rivera fixed an issue where the package tests were broken after the removal of diffoscope from the package’s strict dependencies [].


Testing framework

The Reproducible Builds project runs a testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:

  • Holger Levsen:

    • Document the progress in setting up snapshot.reproducible-builds.org. []
    • Add the packages required for debian-snapshot. []
    • Make the dstat package available on all Debian based systems. []
    • Mark virt32b-armhf and virt64b-armhf as down. []
  • Jochen Sprickerhof:

    • Add SSH authentication key and enable access to the osuosl168-amd64 node. [][]
  • Mattia Rizzolo:

    • Revert “reproducible Debian: mark virt(32|64)b-armhf as down” - restored. []
  • Roland Clobus (Debian “live” image generation):

    • Rename sid internally to unstable until an issue in the snapshot system is resolved. []
    • Extend testing to include Debian bookworm too. []
    • Automatically create the Jenkins ‘view’ to display jobs related to building the Live images. []
  • Vagrant Cascadian:

    • Add a Debian ‘package set’ group for the packages and tools maintained by the Reproducible Builds maintainers themselves. []



If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website.



