Reproducible Builds in November 2022

View all our monthly reports


Welcome to yet another report from the Reproducible Builds project, this time for November 2022. In all of these reports (which we have been publishing regularly since May 2015) we attempt to outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.


Reproducible Builds Summit 2022

Following up on last month’s report about our recent summit in Venice, Italy: a comprehensive report from the meeting has not been finalised yet, so watch this space!

As a very small preview, however, we can link to several issues that were filed about the website during the summit (#38, #39, #40, #41, #42, #43, etc.) and note that we collectively learned about Software Bills of Materials (SBOMs) and how .buildinfo files can be seen and used as SBOMs. And, no less importantly, the Reproducible Builds t-shirt design has been updated…


Reproducible Builds at European Cyber Week 2022

During the European Cyber Week 2022, a Capture The Flag (CTF) cybersecurity challenge was created by Frédéric Pierret on the subject of Reproducible Builds. The challenge was pedagogical in nature and based on how to make a software release reproducible. To progress through the challenge, issues that affect the reproducibility of a build (such as build path, timestamps, file ordering, etc.) had to be fixed in steps in order to obtain the final ‘flag’ and win the challenge.
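To make the three issue classes named above more concrete, here is a minimal sketch (not taken from the challenge itself) of packing a directory into a tarball in a way that neutralises build path, timestamps and file ordering. The function name `reproducible_tarball` and the decision to also pin ownership and permissions are assumptions made for illustration. Only the files found by `os.walk()` are archived, so empty directories are omitted.

```python
import gzip
import io
import os
import tarfile


def reproducible_tarball(src_dir: str, epoch: int) -> bytes:
    """Return a .tar.gz of src_dir whose bytes depend only on file names and contents."""

    def normalise(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Timestamps: replace each file's mtime with one fixed value.
        info.mtime = epoch
        # Ownership and permissions vary between build machines, so pin them too.
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        info.mode = 0o755 if info.mode & 0o100 else 0o644
        return info

    # File ordering: os.walk() yields entries in filesystem order, so collect
    # the relative paths and sort them explicitly.
    relative_paths = []
    for dirpath, _dirnames, filenames in os.walk(src_dir):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Build path: store paths relative to src_dir, never absolute ones.
            relative_paths.append(os.path.relpath(full, src_dir))
    relative_paths.sort()

    buffer = io.BytesIO()
    # mtime=0 and filename="" keep the current time and output name out of the gzip header.
    with gzip.GzipFile(filename="", fileobj=buffer, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
            for rel in relative_paths:
                tar.add(os.path.join(src_dir, rel), arcname=rel,
                        recursive=False, filter=normalise)
    return buffer.getvalue()
```

Calling this on two checkouts of the same source in different directories, with the same `epoch`, should yield identical bytes, provided the same Python version is used for both runs.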

At the end of the competition, five people succeeded in solving the challenge, all of whom were awarded a shirt. Frédéric Pierret intends to create a similar challenge in the form of a “how to” in the Reproducible Builds documentation, but two of the 2022 winners are shown here:


On business adoption and use of reproducible builds…

Simon Butler announced on the rb-general mailing list that the Software Quality Journal published an article called On business adoption and use of reproducible builds for open and closed source software.

This article is an interview-based study which examines the adoption and uses of Reproducible Builds in industry, with a focus on investigating the reasons why organisations might not have adopted them:

[…] industry application of R-Bs appears limited, and we seek to understand whether awareness is low or if significant technical and business reasons prevent wider adoption.

This is achieved through interviews with software practitioners and business managers, and touches on both the business and technical reasons supporting the adoption (or not) of Reproducible Builds. The article also begins with an excellent explanation and literature review, and even introduces a helpful new analogy for reproducible builds:

[Users are] able to perform a bitwise comparison of the two binaries to verify that they are identical and that the distributed binary is indeed built from the source code in the way the provider claims. Applied in this manner, R-Bs function as a canary, a mechanism that indicates when something might be wrong, and offer an improvement in security over running unverified binaries on computer systems.
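The verification step described in the quote above can be sketched in a few lines. This is only an illustration: the function name `verify_rebuild` is hypothetical, and it assumes that the distributed binary and a locally rebuilt binary are both available as files.

```python
import filecmp


def verify_rebuild(distributed_path: str, rebuilt_path: str) -> bool:
    """Return True only if both files contain exactly the same bytes."""
    # shallow=False forces a comparison of the file contents rather than
    # relying on os.stat() metadata such as size and modification time.
    return filecmp.cmp(distributed_path, rebuilt_path, shallow=False)
```

A `False` result is the ‘canary’ signal: it does not say what went wrong, only that the distributed binary is not what a build of the published source produces.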

The full paper is available to download on an ‘open access’ basis.

Elsewhere in academia, Beatriz Michelson Reichert and Rafael R. Obelheiro have published a paper proposing a systematic threat model for a generic software development pipeline, identifying possible mitigations for each threat (PDF). Under the Tampering rubric of their paper, the authors describe various attacks against Continuous Integration (CI) processes:

An attacker may insert a backdoor into a CI or build tool and thus introduce vulnerabilities into the software (resulting in an improper build). To avoid this threat, it is the developer’s responsibility to take due care when making use of third-party build tools. Tampered compilers can be mitigated using diversity, as in the diverse double compiling (DDC) technique. Reproducible builds, a recent research topic, can also provide mitigation for this problem. (PDF)


Misc news


On our mailing list this month:

  • Adrian Diglio from Microsoft asked “How to Add a New Project within Reproducible Builds” which elicited a number of replies.

  • Vagrant Cascadian posed an interesting question regarding the difference between “test builds” vs “rebuilds” (or “verification rebuilds”). As Vagrant notes in their message, “they’re both useful for slightly different purposes, and it might be good to clarify the distinction […].”


Debian & other Linux distributions

Over 50 reviews of Debian packages were added this month, another 48 were updated and almost 30 were removed, all of which adds to our knowledge about identified issues. Two new issue types were added as well.

Vagrant Cascadian announced on our mailing list another online sprint to help ‘clear the huge backlog of reproducible builds patches submitted’ by performing NMUs (Non-Maintainer Uploads). The first such sprint took place on September 22nd, and others were held on October 6th and October 20th. Two additional sprints took place in November, which resulted in the following progress:

Lastly, Roland Clobus posted his latest update of the status of reproducible Debian ISO images on our mailing list. This reports that ‘all major desktops build reproducibly with bullseye, bookworm and sid’ and that no custom patches needed to be applied to Debian unstable for this result to occur. During November, however, Roland proposed some modifications to live-setup, and the rebuild script has been adjusted to fix the failing Jenkins tests for Debian bullseye.


In other news, Miro Hrončok proposed a change to ‘clamp’ build modification times to the value of SOURCE_DATE_EPOCH. This was initially suggested and discussed on a devel@ mailing list post but was later written up on the Fedora Wiki as well as being officially proposed to the Fedora Engineering Steering Committee (FESCo).
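As a rough illustration of what ‘clamping’ means here, the sketch below lowers any file modification time that is later than a given epoch to that epoch, and leaves older timestamps untouched. It is a generic example and not the mechanism from the Fedora proposal. The function name `clamp_mtimes` is hypothetical, and symbolic links are skipped to keep the example portable.

```python
import os


def clamp_mtimes(root: str, epoch: int) -> int:
    """Lower the mtime of every file under root that is newer than epoch.

    Returns the number of files whose timestamp was changed.
    """
    clamped = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # assumption: leave symlinks alone in this sketch
            st = os.stat(path)
            # Clamp, do not overwrite: only timestamps after the epoch change.
            if st.st_mtime > epoch:
                os.utime(path, (st.st_atime, epoch))
                clamped += 1
    return clamped
```

In a build script this might be called as `clamp_mtimes("build", int(os.environ["SOURCE_DATE_EPOCH"]))`, where the `build` directory name is again only an example.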


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:


diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 226 and 227 to Debian:

  • Support both python3-progressbar and python3-progressbar2, two modules providing the progressbar Python module.
  • Don’t run Python decompiling tests on Python bytecode that file(1) cannot detect yet and Python 3.11 cannot unmarshal. (#1024335)
  • Don’t attempt to attach text-only differences notice if there are no differences to begin with. (#1024171)
  • Make sure we recommend apksigcopier.
  • Tidy generation of os_list.
  • Make the code clearer around generating the Debian ‘substvars’.
  • Use our assert_diff helper in test_lzip.py.
  • Drop other copyright notices from lzip.py and test_lzip.py.

In addition to this, Christopher Baines added lzip support, and FC Stegerman added an optimisation whereby we don’t run apktool if no differences are detected before the signing block.


A significant number of changes were made to the Reproducible Builds website and documentation this month: Chris Lamb ensured that the openEuler logo is correctly visible with a white background, FC Stegerman de-duplicated contributors by email address to avoid listing some of them twice, Hervé Boutemy added Apache Maven to the list of affiliated projects, and boyska updated our Contribute page to remark that the Reproducible Builds presence on salsa.debian.org is not just the Git repository but is also for creating issues. In addition to all this, Holger Levsen made the following changes:

  • Add a number of existing publications and update metadata for some existing publications as well.
  • Hide draft posts on the website homepage.
  • Add the Warpforge build tool as a participating project of the summit.
  • Clarify in the footer that we welcome patches to the website repository.

Testing framework

The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. This month, the following changes were made by Holger Levsen:

  • Improve the generation of ‘meta’ package sets (used in grouping packages for reporting/statistical purposes) to treat Debian bookworm as equivalent to Debian unstable in this specific case and to parse the list of packages used in the Debian cloud images.
  • Temporarily allow Frederic to ssh(1) into our snapshot server as the jenkins user.
  • Keep the Jenkins logs of some reproducible jobs for much longer (later reverted).
  • Improve the node health checks to detect failures to update the Debian cloud image package set and to improve prioritisation of some kernel warnings.
  • Always echo any IRC output to Jenkins’ output as well.
  • Deal gracefully with problems related to processing the cloud image package set.

Finally, Roland Clobus continued his work on testing Live Debian images, including adding support for specifying the origin of the Debian installer and warning when the image has unmet dependencies in the package list (e.g. due to a transition).


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can get in touch with us via:



