Reproducible Builds in December 2023

View all our monthly reports


Welcome to the December 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries (more).


Reproducible Builds: Increasing the Integrity of Software Supply Chains awarded IEEE Software “Best Paper” award

In February 2022, we announced in these reports that a paper written by Chris Lamb and Stefano Zacchiroli, titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), was now available in the March/April 2022 issue of IEEE Software.

This month, however, IEEE Software announced that this paper has won their Best Paper award for 2022.


Reproducibility to affect package migration policy in Debian

In a post summarising the activities of the Debian Release Team at a recent in-person Debian event in Cambridge, UK, Paul Gevers announced a change to the way packages are “migrated” into the staging area for the next stable Debian release based on their reproducibility status:

The folks from the Reproducibility Project have come a long way since they started working on it 10 years ago, and we believe it’s time for the next step in Debian. Several weeks ago, we enabled a migration policy in our migration software that checks for regression in reproducibility. At this moment, that is presented as just for info, but we intend to change that to delays in the not so distant future. We eventually want all packages to be reproducible. To stimulate maintainers to make their packages reproducible now, we’ll soon start to apply a bounty [speedup] for reproducible builds, like we’ve done with passing autopkgtests for years. We’ll reduce the bounty for successful autopkgtests at that moment in time.


Speranza: “Usable, privacy-friendly software signing”

Kelsey Merrill, Karen Sollins, Santiago Torres-Arias and Zachary Newman have developed a new system called Speranza, which is aimed at reassuring software consumers that the product they are getting has not been tampered with and is coming directly from a source they trust. A write-up on TechXplore.com goes into some more details:

“What we have done,” explains Sollins, “is to develop, prove correct, and demonstrate the viability of an approach that allows the [software] maintainers to remain anonymous.” Preserving anonymity is obviously important, given that almost everyone—software developers included—value their confidentiality. This new approach, Sollins adds, “simultaneously allows [software] users to have confidence that the maintainers are, in fact, legitimate maintainers and, furthermore, that the code being downloaded is, in fact, the correct code of that maintainer.” []

The corresponding paper is published on the arXiv preprint server in various formats, and the announcement has also been covered in MIT News.


Nondeterministic Git bundles

Paul Baecher published an interesting blog post on Reproducible git bundles. For those who are not familiar with them, Git bundles are used for the “offline” transfer of Git objects without an active server sitting on the other side of a network connection. Anyway, Paul wrote about writing a backup system for his entire system, but:

I noticed that a small but fixed subset of [Git] repositories are getting backed up despite having no changes made. That is odd because I would think that repeated bundling of the same repository state should create the exact same bundle. However [it] turns out that for some, repositories bundling is nondeterministic.

Paul goes on to describe his solution, noting that “forcing git to be single threaded makes the output deterministic”. The article was also discussed on Hacker News.
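
As a minimal sketch of that idea, the Python below builds a single-threaded bundling command and hashes the resulting file so that two runs can be compared. It assumes that single-threaded packing is requested through Git's pack.threads configuration setting; Paul's post should be consulted for the exact invocation he uses. The helper names and paths are hypothetical.

```python
import hashlib
import subprocess
from pathlib import Path


def bundle_command(bundle_path: str) -> list[str]:
    """Build a `git bundle create` command that packs with a single thread.

    Assumption: pack.threads=1 is the setting used to force single-threaded
    packing; it is passed per-invocation with `-c` rather than written to
    the repository configuration.
    """
    return ["git", "-c", "pack.threads=1", "bundle", "create", bundle_path, "--all"]


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def bundle_digest(repo: Path, bundle_path: Path) -> str:
    """Bundle every ref of `repo` and return the digest of the bundle file."""
    subprocess.run(bundle_command(str(bundle_path)), cwd=repo, check=True)
    return sha256_of(bundle_path)
```

Calling bundle_digest twice on an unchanged repository and comparing the two digests is one way to check whether a backup system would consider the bundle "changed".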


Output from libxslt now deterministic

libxslt is the XSLT C library developed for the GNOME project, where XSLT itself is an XML language to define transformations for XML files. This month, it was revealed that the result of the generate-id() XSLT function is now deterministic across multiple transformations, fixing many issues with reproducible builds. As the Git commit by Nick Wellnhofer describes:

Rework the generate-id() function to return deterministic values. We use
a simple incrementing counter and store ids in the 'psvi' member of
nodes which was freed up by previous commits. The presence of an id is
indicated by a new "source node" flag.

This fixes long-standing problems with reproducible builds, see
https://bugzilla.gnome.org/show_bug.cgi?id=751621

This also hardens security, as the old implementation leaked the
difference between a heap and a global pointer, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1356211

The old implementation could also generate the same id for dynamically
created nodes which happened to reuse the same memory. Ids for namespace
nodes were completely broken. They now use the id of the parent element
together with the hex-encoded namespace prefix.
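
The counter-based scheme described in the commit message can be illustrated with a short sketch. The Python below is not libxslt's C implementation: the Node class, its psvi attribute and the "id" prefix are illustrative assumptions chosen to mirror the commit's wording, and the address-based function is only a stand-in for the kind of pointer-derived id the commit says it replaces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical document node; `psvi` mirrors the slot named in the commit."""
    name: str
    psvi: Optional[int] = None  # holds the assigned id once one is requested


class IdGenerator:
    """Hand out ids from an incrementing counter instead of memory addresses."""

    def __init__(self) -> None:
        self._counter = 0

    def generate_id(self, node: Node) -> str:
        if node.psvi is None:  # first request for this node: assign the next id
            self._counter += 1
            node.psvi = self._counter
        return f"id{node.psvi}"  # the "id" prefix is an assumption


def address_based_id(node: Node) -> str:
    """Contrast: an id derived from the object's address can differ per run."""
    return f"id{id(node):x}"
```

With the counter, the ids depend only on the order in which nodes are first asked for an id, so two transformations that visit the same document in the same order produce the same output.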


Community updates

A number of improvements were made to our website this month: Chris Lamb fixed the generate-draft script so that it no longer blows up if the input files have been corrupted, whether today or in the past; Holger Levsen updated the Hamburg 2023 summit page to add a link to the farewell post and a picture of a Post-It note; and Pol Dellaiera updated the paragraph about tar and the --clamp-mtime flag.
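
For readers unfamiliar with the flag: in GNU tar, --clamp-mtime combined with --mtime only rewrites timestamps that are newer than the given time, leaving older ones untouched. A rough equivalent using Python's tarfile module might look like the sketch below; the function names are hypothetical and this is not the text of the website paragraph itself.

```python
import io
import tarfile


def clamp_mtime(limit: int):
    """Return a tarfile filter that caps each member's mtime at `limit`."""
    def _filter(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Only timestamps newer than the limit are changed; older ones are kept.
        info.mtime = min(info.mtime, limit)
        return info
    return _filter


def build_archive(paths: list[str], limit: int) -> bytes:
    """Archive `paths` in sorted order with clamped modification times."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in sorted(paths):  # a fixed order avoids one source of variation
            tar.add(path, filter=clamp_mtime(limit))
    return buf.getvalue()
```

A typical choice for the limit would be a fixed timestamp such as the value of the SOURCE_DATE_EPOCH environment variable. Other metadata (ownership, permissions) can also vary between builds and is not handled by this sketch.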

On our mailing list this month, Bernhard M. Wiedemann posted an interesting summary on some of the reasons why packages are still not reproducible in 2023.

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including processing objdump symbol comment filter inputs as Python bytes (and not str) instances. In addition, Vagrant Cascadian extended diffoscope support for GNU Guix and updated the version in that distribution to version 253.


“Challenges of Producing Software Bill Of Materials for Java”

Musard Balliu, Benoit Baudry, Sofia Bobadilla, Mathias Ekstedt, Martin Monperrus, Javier Ron, Aman Sharma, Gabriel Skoglund, César Soto-Valero and Martin Wittlinger (!) of the KTH Royal Institute of Technology in Sweden have published an article in which they:

… deep-dive into 6 tools and the accuracy of the SBOMs they produce for complex open-source Java projects. Our novel insights reveal some hard challenges regarding the accurate production and usage of software bills of materials.

The paper is available on arXiv.


Debian Non-Maintainer campaign

As mentioned in previous reports, the Reproducible Builds team within Debian has been organising a series of online and offline sprints in order to clear the huge backlog of submitted reproducible builds patches by performing so-called NMUs (Non-Maintainer Uploads).

During December, Vagrant Cascadian performed a number of such uploads, including:

In addition, Holger Levsen performed three “no-source-change” NMUs in order to address the last packages without .buildinfo files in Debian trixie, specifically lorene (0.0.0~cvs20161116+dfsg-1.1), maria (1.3.5-4.2) and ruby-rinku (1.7.3-2.1).


Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In December, a number of changes were made by Holger Levsen:

  • Debian-related changes:

    • Fix matching packages for the R programming language.
    • Add a Certbot configuration for the Nginx web server.
    • Enable debugging for the create-meta-pkgs tool.
  • Arch Linux-related changes:

    • The asp tool has been deprecated in favour of pkgctl; thanks to dvzrv for the pointer.
    • Disable the Arch Linux builders for now.
    • Stop referring to the /trunk branch / subdirectory.
    • Use --protocol https when cloning repositories using the pkgctl tool.
  • Misc changes:

In addition, node maintenance was performed by Holger Levsen and Vagrant Cascadian.


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:



If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:



