Reproducible Builds in November 2022


Welcome to yet another report from the Reproducible Builds project, this time for November 2022. In all of these reports (which we have been publishing regularly since May 2015) we attempt to outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.

Reproducible Builds Summit 2022

As a follow-up to last month’s report about our recent summit in Venice, Italy: a comprehensive report from the meeting has not yet been finalised, so watch this space!

As a very small preview, however, we can link to several issues that were filed about the website during the summit (#38, #39, #40, #41, #42, #43, etc.), and note that we collectively learned about Software Bills of Materials (SBOMs) and how .buildinfo files can be seen and used as SBOMs. And, no less importantly, the Reproducible Builds t-shirt design has been updated…

Reproducible Builds at European Cyber Week 2022

During European Cyber Week 2022, Frédéric Pierret created a Capture The Flag (CTF) cybersecurity challenge on the subject of Reproducible Builds. The challenge was pedagogical in nature, teaching participants how to make a software release reproducible. To progress through the challenge, issues that affect the reproducibility of a build (such as build paths, timestamps, file ordering, etc.) had to be fixed step by step in order to obtain the final ‘flag’ and win the challenge.
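To give a flavour of the kind of fixes involved, here is a minimal Python sketch that removes three of the classic sources of variation (file ordering, timestamps and ownership metadata) when packing a directory into a tar archive. It is not taken from the CTF challenge itself: the function name deterministic_tar and the choice to normalise every file to mode 0644 are assumptions made purely for illustration.

```python
import io
import os
import tarfile


def deterministic_tar(src_dir, epoch):
    """Return the bytes of a tar archive of src_dir that depend only on
    the relative file names and file contents (hypothetical helper)."""
    # Collect relative paths and sort them, so the archive does not
    # depend on the order in which the filesystem lists entries.
    names = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            full = os.path.join(root, name)
            names.append(os.path.relpath(full, src_dir))
    names.sort()

    buf = io.BytesIO()
    # GNU_FORMAT avoids optional pax extended headers.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for rel in names:
            full = os.path.join(src_dir, rel)
            info = tar.gettarinfo(full, arcname=rel)
            # Normalise metadata that would otherwise leak the build
            # environment: timestamps, ownership and permissions.
            info.mtime = epoch
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            info.mode = 0o644  # assumption: all files are plain data files
            with open(full, "rb") as fh:
                tar.addfile(info, fh)
    return buf.getvalue()
```

Calling deterministic_tar on two checkouts of the same source tree, located in different directories and created at different times, should then yield byte-identical archives as long as the same epoch value is passed in (for example, one derived from SOURCE_DATE_EPOCH).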

At the end of the competition, five people had succeeded in solving the challenge, all of whom were awarded a shirt. Frédéric Pierret intends to create a similar challenge in the form of a “how to” in the Reproducible Builds documentation. Two of the 2022 winners are shown here:

‘On business adoption and use of reproducible builds…’

Simon Butler announced on the rb-general mailing list that the Software Quality Journal published an article called On business adoption and use of reproducible builds for open and closed source software.

This article is an interview-based study of the adoption and uses of Reproducible Builds in industry, and it investigates in particular the reasons why organisations might not have adopted them:

[…] industry application of R-Bs appears limited, and we seek to understand whether awareness is low or if significant technical and business reasons prevent wider adoption.

This is achieved through interviews with software practitioners and business managers, and touches on both the business and technical reasons supporting the adoption (or not) of Reproducible Builds. The article also begins with an excellent explanation and literature review, and even introduces a new helpful analogy for reproducible builds:

[Users are] able to perform a bitwise comparison of the two binaries to verify that they are identical and that the distributed binary is indeed built from the source code in the way the provider claims. Applied in this manner, R-Bs function as a canary, a mechanism that indicates when something might be wrong, and offer an improvement in security over running unverified binaries on computer systems.
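The bitwise comparison described in this quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name files_identical and the chunk size are made up here, and in practice a tool such as diffoscope would be used to explain any differences that are found.

```python
import os


def files_identical(path_a, path_b, chunk_size=1 << 16):
    """Return True if both files contain exactly the same bytes
    (hypothetical helper, not part of any Reproducible Builds tool)."""
    # Files of different sizes can never be identical.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:
                # Both files reached end-of-file without a mismatch.
                return True
```

A user would call this with the path of the binary distributed by the provider and the path of a binary they rebuilt themselves from the published source; a False result is the ‘canary’ signal that something might be wrong.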

The full paper is available to download on an ‘open access’ basis.

Elsewhere in academia, Beatriz Michelson Reichert and Rafael R. Obelheiro have published a paper proposing a systematic threat model for a generic software development pipeline, identifying possible mitigations for each threat (PDF). Under the Tampering rubric of their paper, the authors discuss various attacks against Continuous Integration (CI) processes:

An attacker may insert a backdoor into a CI or build tool and thus introduce vulnerabilities into the software (resulting in an improper build). To avoid this threat, it is the developer’s responsibility to take due care when making use of third-party build tools. Tampered compilers can be mitigated using diversity, as in the diverse double compiling (DDC) technique. Reproducible builds, a recent research topic, can also provide mitigation for this problem. (PDF)

Misc news

A change was proposed for the Go programming language to enable reproducible builds when Link Time Optimisation (LTO) is enabled. As mentioned in the changelog, Morten Linderud’s patch fixes two issues that occur when the linker is used in conjunction with the -flto option: the first involves seeded random numbers, and the second involves the binary embedding the current working directory in compressed sections of the LTO object. Both of these issues made the build unreproducible.
In the .NET framework ecosystem, a wiki page for the Roslyn .NET C# and Visual Basic compiler was uncovered this month that details its attempts to ensure end-to-end reproducible builds by focusing on the definition of what are ‘considered inputs to the compiler for the purpose of determinism’. This is a spiritual follow-up to a 2016 blog post by Microsoft developer Jared Parsons on ‘Deterministic builds in Roslyn’ which starts: ‘It seems silly to celebrate features which should have been there from the start.’
Ian Lance Taylor followed up an old post to report that Jakub Jelinek’s patch from September 2000 is incomplete.

In F-Droid this month, Reproducible Builds contributor FC Stegerman created a set of ‘reproducible APK tools’ as a workaround for issues like the order of files in APKs built on macOS being non-deterministic. In addition, the new issue documenting the overview of apps using reproducible builds shows that F-Droid added 11 new apps that use reproducible builds, and FC Stegerman released apksigcopier version 1.1.0 which adds support for APKs signed by ‘Signflinger’.
martinSusz has written up a fascinating wiki page describing how to generate ‘quasi-reproducible’ firmware ROMs for System-on-a-Chip (SoC) components fabricated by Rock Chip. These chips are used in popular low-cost laptops such as the Pine64 PinebookPro and Asus C201. The link is worth viewing simply for the interesting diagram.
Our monthly IRC meeting was held on November 29th 2022. Our next meeting will be on January 31st 2023; we’ll skip the meeting in December due to the proximity to Christmas, etc.

On our mailing list this month:

Adrian Diglio from Microsoft asked “How to Add a New Project within Reproducible Builds”, which elicited a number of replies.
Vagrant Cascadian posed an interesting question regarding the difference between “test builds” vs “rebuilds” (or “verification rebuilds”). As Vagrant poses in their message, “they’re both useful for slightly different purposes, and it might be good to clarify the distinction […].”

Debian & other Linux distributions

Over 50 reviews of Debian packages were added this month, another 48 were updated and almost 30 were removed, all of which adds to our knowledge about identified issues. Two new issue types were added as well. […][…].

Vagrant Cascadian announced on our mailing list another online sprint to help ‘clear the huge backlog of reproducible builds patches submitted’ by performing NMUs (Non-Maintainer Uploads). The first such sprint took place on September 22nd, and others were held on October 6th and October 20th. Two additional sprints took place in November, which resulted in the following progress:

Chris Lamb:
- paxctl (Fixed #1020804)
- png23d (Fixed #1020805)
- tuxcmd-modules (Fixed #1011500 & #941296)
- waili (Fixed #1020751)
- zephyr (Fixed #828867 #1021374)
Vagrant Cascadian:
- ddd (Fixed #834016)
- libpam-ldap (Fixed #834050)
- nsnake (Fixed #833612)
- quvi (Fixed #835259)
- stressapptest (Fixed #831587 & #986653)
- tcpreen (Fixed #831585)
- boolector (Fixed #1023886)
- tsdecrypt (Fixed #829713 & #1022130)
- wbxml2 (QA upload fixed build path issues)
- tercpp (QA upload fixed build path issues)

Lastly, Roland Clobus posted his latest update of the status of reproducible Debian ISO images on our mailing list. This reports that ‘all major desktops build reproducibly with bullseye, bookworm and sid’ and that no custom patches needed to be applied to Debian unstable for this result to occur. During November, however, Roland proposed some modifications to live-setup, and the rebuild script has been adjusted to fix the failing Jenkins tests for Debian bullseye […][…].

In other news, Miro Hrončok proposed a change to ‘clamp’ build modification times to the value of SOURCE_DATE_EPOCH. This was initially suggested and discussed in a devel@ mailing list post but was later written up on the Fedora Wiki as well as being officially proposed to the Fedora Engineering Steering Committee (FESCo).
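As a rough illustration of what ‘clamping’ means, the following Python sketch lowers any file modification time that is newer than a given SOURCE_DATE_EPOCH value and leaves older timestamps untouched. This is not the Fedora implementation: the function name clamp_mtimes is hypothetical, and the sketch assumes that ‘clamp’ means capping timestamps at SOURCE_DATE_EPOCH rather than setting every timestamp to it.

```python
import os


def clamp_mtimes(root, source_date_epoch):
    """Cap the mtime of every regular file under root at
    source_date_epoch and return the sorted list of changed paths
    (hypothetical helper for illustration only)."""
    clamped = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Skip symlinks so we never touch the link target.
                continue
            st = os.stat(path)
            if st.st_mtime > source_date_epoch:
                # Keep the access time, lower only the modification time.
                os.utime(path, (st.st_atime, source_date_epoch))
                clamped.append(path)
    return sorted(clamped)
```

The value passed in would typically be read from the environment, for instance with int(os.environ["SOURCE_DATE_EPOCH"]). Files that already carry an older timestamp (such as unmodified upstream sources) keep it, which is the practical difference between clamping and simply overwriting all timestamps.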

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Bernhard M. Wiedemann:
- dwz (Profile-guided optimisation issue)
- icmake (filesystem ordering issue)
- llmnrd
- elixir (report a bug re. stuck build on single-core VMs)
- warzone2100 (report a bug re. parallelism-dependent output)
Chris Lamb:
- #1023589 filed against libnvme.
- #1024352 filed against pykafka.
Vagrant Cascadian:
- #1023886 filed against boolector.
- #1023956 filed against fl-cow.
- #1023957 filed against gerstensaft.
- #1023960 filed against libcgicc.
- #1024007 filed against haskell98-report.
- #1024125 filed against ucspi-proxy.
- #1024126 filed against hunt.
- #1024279 filed against tolua++.
- #1024282 filed against twoftpd.
- #1024283 filed against ipsvd.
- #1024284 filed against gentoo.
- #1024286 filed against lcm.
- #1024288 filed against apcupsd.
- #1024289 filed against openfortivpn.
- #1024290 filed against xtb.
- #1024291 filed against gnunet.
- #1024292 filed against swift-im.
- #1024396 filed against brewtarget.
- #1024399 filed against xrprof.
- #1024404 filed against gitlint.
- #1024412 filed against claws-mail.
- #1024413 filed against presage.
- #1024530 filed against jh7100-bootloader-recovery.
Victor Westerhuis:
- #1024482 & #1024638 filed against opencv.
John Neffenger:
- tomcat (Fixed Apache bug #66346)

diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 226 and 227 to Debian:

Support both python3-progressbar and python3-progressbar2, two modules providing the progressbar Python module. […]
Don’t run Python decompiling tests on Python bytecode that file(1) cannot detect yet and Python 3.11 cannot unmarshal. (#1024335)
Don’t attempt to attach text-only differences notice if there are no differences to begin with. (#1024171)
Make sure we recommend apksigcopier. […]
Tidy generation of os_list. […]
Make the code clearer around generating the Debian ‘substvars’. […]
Use our assert_diff helper in test_lzip.py. […]
Drop other copyright notices from lzip.py and test_lzip.py. […]

In addition to this, Christopher Baines added lzip support […], and FC Stegerman added an optimisation whereby we don’t run apktool if no differences are detected before the signing block […].

A significant number of changes were made to the Reproducible Builds website and documentation this month: Chris Lamb ensured that the openEuler logo is correctly visible with a white background […], FC Stegerman de-duplicated contributors by email address to avoid listing some of them twice […], Hervé Boutemy added Apache Maven to the list of affiliated projects […] and boyska updated our Contribute page to remark that the Reproducible Builds presence on salsa.debian.org is not just the Git repository but is also for creating issues […][…]. In addition to all this, however, Holger Levsen made the following changes:

Add a number of existing publications […][…] and update metadata for some existing publications as well […].
Hide draft posts on the website homepage. […]
Add the Warpforge build tool as a participating project of the summit. […]
Clarify in the footer that we welcome patches to the website repository. […]

Testing framework

The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In November, the following changes were made by Holger Levsen:

Improve the generation of ‘meta’ package sets (used in grouping packages for reporting/statistical purposes) to treat Debian bookworm as equivalent to Debian unstable in this specific case […] and to parse the list of packages used in the Debian cloud images […][…][…].
Temporarily allow Frederic to ssh(1) into our snapshot server as the jenkins user. […]
Keep some reproducible jobs Jenkins logs much longer […] (later reverted).
Improve the node health checks to detect failures to update the Debian cloud image package set […][…] and to improve prioritisation of some kernel warnings […].
Always echo any IRC output to Jenkins’ output as well. […]
Deal gracefully with problems related to processing the cloud image package set. […]

Finally, Roland Clobus continued his work on testing Live Debian images, including adding support for specifying the origin of the Debian installer […] and warning when the image has unmet dependencies in the package list (e.g. due to a transition) […].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can get in touch with us via:

IRC: #reproducible-builds on irc.oftc.net.
Twitter: @ReproBuilds
Mailing list: rb-general@lists.reproducible-builds.org
