Reproducible Builds in February 2024

View all our monthly reports


Welcome to the February 2024 report from the Reproducible Builds project! In our reports, we try to outline what we have been up to over the past month as well as mentioning some of the important things happening in software supply-chain security.


Reproducible Builds at FOSDEM 2024

Core Reproducible Builds developer Holger Levsen presented at the main track at FOSDEM on Saturday 3rd February this year in Brussels, Belgium. However, that wasn’t the only talk related to Reproducible Builds.

However, please see our comprehensive FOSDEM 2024 news post for the full details and links.


Maintainer Perspectives on Open Source Software Security

Bernhard M. Wiedemann spotted that a recent report entitled Maintainer Perspectives on Open Source Software Security written by Stephen Hendrick and Ashwin Ramaswami of the Linux Foundation sports an infographic which mentions that “56% of [polled] projects support reproducible builds”.


A total of three separate scholarly papers related to Reproducible Builds have appeared this month:

Signing in Four Public Software Package Registries: Quantity, Quality, and Influencing Factors by Taylor R. Schorlemmer, Kelechi G. Kalu, Luke Chigges, Kyung Myung Ko, Eman Abdul-Muhd, Abu Ishgair, Saurabh Bagchi, Santiago Torres-Arias and James C. Davis (Purdue University, Indiana, USA) is concerned with the problem that:

Package maintainers can guarantee package authorship through software signing [but] it is unclear how common this practice is, and whether the resulting signatures are created properly. Prior work has provided raw data on signing practices, but measured single platforms, did not consider time, and did not provide insight on factors that may influence signing. We lack a comprehensive, multi-platform understanding of signing adoption and relevant factors. This study addresses this gap. (arXiv, full PDF)


Reproducibility of Build Environments through Space and Time by Julien Malka, Stefano Zacchiroli and Théo Zimmermann (Institut Polytechnique de Paris, France) addresses:

[The] principle of reusability […] makes it harder to reproduce projects’ build environments, even though reproducibility of build environments is essential for collaboration, maintenance and component lifetime. In this work, we argue that functional package managers provide the tooling to make build environments reproducible in space and time, and we produce a preliminary evaluation to justify this claim.

The abstract continues with the claim that “Using historical data, we show that we are able to reproduce build environments of about 7 million Nix packages, and to rebuild 99.94% of the 14 thousand packages from a 6-year-old Nixpkgs revision. (arXiv, full PDF)


Options Matter: Documenting and Fixing Non-Reproducible Builds in Highly-Configurable Systems by Georges Aaron Randrianaina, Djamel Eddine Khelladi, Olivier Zendra and Mathieu Acher (Inria centre at Rennes University, France):

This paper thus proposes an approach to automatically identify configuration options causing non-reproducibility of builds. It begins by building a set of builds in order to detect non-reproducible ones through binary comparison. We then develop automated techniques that combine statistical learning with symbolic reasoning to analyze over 20,000 configuration options. Our methods are designed to both detect options causing non-reproducibility, and remedy non-reproducible configurations, two tasks that are challenging and costly to perform manually. (HAL Portal, full PDF)


Mailing list highlights

From our mailing list this month:


Distribution work

In Debian this month, 5 reviews of Debian packages were added, 22 were updated and 8 were removed this month adding to Debian’s knowledge about identified issues. A number of issue types were updated as well. […][…][…][…] In addition, Roland Clobus posted his 23rd update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that “all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours)” and that there will likely be further work at a MiniDebCamp in Hamburg. Furthermore, Roland also responded in-depth to a query about a previous report


Fedora developer Zbigniew Jędrzejewski-Szmek announced a work-in-progress script called fedora-repro-build that attempts to reproduce an existing package within a koji build environment. Although the projects’ README file lists a number of “fields will always or almost always vary” and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.


Jelle van der Waa introduced a new linter rule for Arch Linux packages in order to detect cache files leftover by the Sphinx documentation generator which are unreproducible by nature and should not be packaged. At the time of writing, 7 packages in the Arch repository are affected by this.


Elsewhere, Bernhard M. Wiedemann posted another monthly update for his work elsewhere in openSUSE.


diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 256, 257 and 258 to Debian and made the following additional changes:

  • Use a deterministic name instead of trusting gpg’s –use-embedded-filenames. Many thanks to Daniel Kahn Gillmor dkg@debian.org for reporting this issue and providing feedback. [][]
  • Don’t error-out with a traceback if we encounter struct.unpack-related errors when parsing Python .pyc files. (#1064973). []
  • Don’t try and compare rdb_expected_diff on non-GNU systems as %p formatting can vary, especially with respect to MacOS. []
  • Fix compatibility with pytest 8.0. []
  • Temporarily fix support for Python 3.11.8. []
  • Use the 7zip package (over p7zip-full) after a Debian package transition. (#1063559). []
  • Bump the minimum Black source code reformatter requirement to 24.1.1+. []
  • Expand an older changelog entry with a CVE reference. []
  • Make test_zip black clean. []

In addition, James Addison contributed a patch to parse the headers from the diff(1) correctly [][] — thanks! And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to version 255, 256, and 258, and updated trydiffoscope to 67.0.6.


reprotest

reprotest is our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes, including:

  • Create a (working) proof of concept for enabling a specific number of CPUs. [][]
  • Consistently use 398 days for time variation rather than choosing randomly and update README.rst to match. [][]
  • Support a new --vary=build_path.path option. [][][][]


Website updates

There were made a number of improvements to our website this month, including:


Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In February, a number of changes were made by Holger Levsen:

  • Debian-related changes:

    • Temporarily disable upgrading/bootstrapping Debian unstable and experimental as they are currently broken. [][]
    • Use the 64-bit amd64 kernel on all i386 nodes; no more 686 PAE kernels. []
    • Add an Erlang package set. []
  • Other changes:

    • Grant Jan-Benedict Glaw shell access to the Jenkins node. []
    • Enable debugging for NetBSD reproducibility testing. []
    • Use /usr/bin/du --apparent-size in the Jenkins shell monitor. []
    • Revert “reproducible nodes: mark osuosl2 as down”. []
    • Thanks again to Codethink, for they have doubled the RAM on our arm64 nodes. []
    • Only set /proc/$pid/oom_score_adj to -1000 if it has not already been done. []
    • Add the opemwrt-target-tegra and jtx task to the list of zombie jobs. [][]

Vagrant Cascadian also made the following changes:

  • Overhaul the handling of OpenSSH configuration files after updating from Debian bookworm. [][][]
  • Add two new armhf architecture build nodes, virt32z and virt64z, and insert them into the Munin monitoring. [][] [][]

In addition, Alexander Couzens updated the OpenWrt configuration in order to replace the tegra target with mpc85xx [], Jan-Benedict Glaw updated the NetBSD build script to use a separate $TMPDIR to mitigate out of space issues on a tmpfs-backed /tmp [] and Zheng Junjie added a link to the GNU Guix tests [].

Lastly, node maintenance was performed by Holger Levsen [][][][][][] and Vagrant Cascadian [][][][].


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:



If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:




View all our monthly reports