Reproducible Builds in August 2022

View all our monthly reports


Welcome to the August 2022 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised.

As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Community news

As announced last month, registration is currently open for our in-person summit this year, which is due to be held from November 1st to November 3rd. The event will take place in Venice (Italy). Very soon we intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees. Please see the announcement email for information about how to register.


The US National Security Agency (NSA), Cybersecurity and Infrastructure Security Agency (CISA) and the Office of the Director of National Intelligence (ODNI) have released a document called “Securing the Software Supply Chain: Recommended Practices Guide for Developers” (PDF) as part of their Enduring Security Framework (ESF) work.

The document expressly recommends having reproducible builds as part of “advanced” recommended mitigations, along with hermetic builds. Page 31 (page 35 in the PDF) says:

Reproducible builds provide additional protection and validation against attempts to compromise build systems. They ensure the binary products of each build system match: i.e., they are built from the same source, regardless of variable metadata such as the order of input files, timestamps, locales, and paths. Reproducible builds are those where re-running the build steps with identical input artifacts results in bit-for-bit identical output. Builds that cannot meet this must provide a justification why the build cannot be made reproducible.

The full press release is available online.


On our mailing list this month, Marc Prud’hommeaux posted a feature request for diffoscope which additionally outlines a project called The App Fair, an autonomous distribution network of free and open-source macOS and iOS applications, where “validated apps are then signed and submitted for publication”.


Author/blogger Cory Doctorow published a provocative blog post this month titled “Your computer is tormented by a wicked god”, touching on Ken Thompson’s famous talk, “Reflections on Trusting Trust”, the early goals of “Secure Computing” and UEFI firmware interfaces:

This is the core of a two-decade-old debate among security people, and it’s one that the “benevolent God” faction has consistently had the upper hand in. They’re the “curated computing” advocates who insist that preventing you from choosing an alternative app store or side-loading a program is for your own good – because if it’s possible for you to override the manufacturer’s wishes, then malicious software may impersonate you to do so, or you might be tricked into doing so. [..] This benevolent dictatorship model only works so long as the dictator is both perfectly benevolent and perfectly competent. We know the dictators aren’t always benevolent. […] But even if you trust a dictator’s benevolence, you can’t trust in their perfection. Everyone makes mistakes. Benevolent dictator computing works well, but fails badly. Designing a computer that intentionally can’t be fully controlled by its owner is a nightmare, because that is a computer that, once compromised, can attack its owner with impunity.


Lastly, Chengyu HAN updated the Reproducible Builds website to correct an incorrect Git command. []

Debian

In Debian this month, the essential and required package sets became 100% reproducible in Debian bookworm on the amd64 and arm64 architectures. These two subsets of the full Debian archive refer to Debian package “priority” levels as described in the §2.5 Priorities section of the Debian Policy — there is no canonical “minimal installation” package set in Debian due to its diverse methods of installation.

As it happens, these package sets are not reproducible on the i386 architecture because the ncurses package on that architecture is not yet reproducible. In addition, the sed package currently fails to build from source on armhf. The full list of reproducible packages within these package sets can be viewed within our QA system, such as on the page of required packages in amd64 and the list of essential packages on arm64, both for Debian bookworm.


It has recently become very easy to install reproducible Debian Docker containers using podman on Debian bullseye:

$ sudo apt install podman
$ podman run --rm -it debian:bullseye bash

The (pre-built) image used is itself built using debuerreotype, as explained on docker.debian.net. This page also details how to build the image yourself and what checksums are expected if you do so.


Related to this, it has also become straightforward to reproducibly bootstrap Debian using mmdebstrap, a replacement for the usual debootstrap tool to create Debian root filesystems:

$ SOURCE_DATE_EPOCH=$(date --utc --date=2022-08-29 +%s) mmdebstrap unstable > unstable.tar

This works for (at least) Debian unstable, bullseye and bookworm, and is tested automatically by a number of QA jobs set up by Holger Levsen (unstable, bookworm and bullseye).
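One way to check such a result by hand is to run the bootstrap twice with the same SOURCE_DATE_EPOCH and compare the checksums of the two tarballs. The following is only a minimal sketch: `same_checksum` is a hypothetical helper name (not part of mmdebstrap), and the file names `first.tar` and `second.tar` are illustrative.

```shell
#!/bin/sh
# Sketch: check that two bootstrap tarballs are bit-for-bit identical.
# same_checksum is a hypothetical helper, not part of mmdebstrap.
same_checksum() {
    # Hash each file from stdin so the output contains no file name.
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# Example usage (needs mmdebstrap and network access, so it is left
# commented out here):
#   export SOURCE_DATE_EPOCH=$(date --utc --date=2022-08-29 +%s)
#   mmdebstrap unstable > first.tar
#   mmdebstrap unstable > second.tar
#   same_checksum first.tar second.tar && echo "reproducible"
```

If the two checksums differ, the tarballs can be handed to diffoscope to locate the difference.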


Work has also taken place to ensure that the canonical debootstrap and cdebootstrap tools are also capable of bootstrapping Debian reproducibly, although it currently requires a few extra steps:

  1. “Clamping” the modification time of files that are newer than $SOURCE_DATE_EPOCH so that it is no greater than $SOURCE_DATE_EPOCH.

  2. Deleting a few files. For debootstrap, this requires the deletion of /etc/machine-id, /var/cache/ldconfig/aux-cache, /var/log/dpkg.log, /var/log/alternatives.log and /var/log/bootstrap.log, and for cdebootstrap we also need to delete the /var/log/apt/history.log and /var/log/apt/term.log files.

This process works at least for unstable, bullseye and bookworm and is now being tested automatically by a number of QA jobs set up by Holger Levsen [][][][][][]. As part of this work, Holger filed two bugs to request a better initialisation of the /etc/machine-id file in both debootstrap [] and cdebootstrap [].
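The two extra steps could be sketched roughly as follows. This is an illustration under assumptions rather than the script used by the QA jobs: `normalise_chroot` is a hypothetical helper name, it expects GNU find, touch and coreutils, and it performs the deletion first because removing a file updates the modification time of its parent directory.

```shell
#!/bin/sh
# Sketch of the post-processing steps for a (c)debootstrap-created chroot.
# normalise_chroot is a hypothetical helper name.
#   $1: path to the chroot directory
#   $2: the SOURCE_DATE_EPOCH value (seconds since the Unix epoch)
normalise_chroot() {
    root=$1
    epoch=$2

    # Delete the files that differ between runs (list taken from the
    # text above; the two apt logs only matter for cdebootstrap, and
    # rm -f silently ignores files that do not exist).
    for f in /etc/machine-id \
             /var/cache/ldconfig/aux-cache \
             /var/log/dpkg.log \
             /var/log/alternatives.log \
             /var/log/bootstrap.log \
             /var/log/apt/history.log \
             /var/log/apt/term.log; do
        rm -f "$root$f"
    done

    # Clamp: anything modified after the epoch gets its mtime set to
    # the epoch; older files are left untouched.
    find "$root" -newermt "@$epoch" \
        -exec touch --no-dereference --date="@$epoch" {} +
}

# Example usage (the chroot path is illustrative):
#   normalise_chroot /tmp/chroot "$SOURCE_DATE_EPOCH"
```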


Elsewhere in Debian, 131 reviews of Debian packages were added, 20 were updated and 27 were removed this month, adding to our extensive knowledge about identified issues. Chris Lamb added a number of issue types, including: randomness_in_browserify_output [], haskell_abi_hash_differences [], nondeterministic_ids_in_html_output_generated_by_python_sphinx_panels []. Lastly, Mattia Rizzolo removed the deterministic flag from the captures_kernel_variant issue [].

Other distributions

Vagrant Cascadian posted an update of the status of Reproducible Builds in GNU Guix, writing that:

Ignoring the pesky unknown packages, it is more like ~93% reproducible and ~7% unreproducible... that feels a bit better to me!

These numbers wander around over time, mostly due to packages moving back into an "unknown" state while the build farms catch up with each other... although the above numbers seem to have been pretty consistent over the last few days.

The post itself contains a lot more details, including a brief discussion of tooling.

Elsewhere in GNU Guix, Vagrant updated a number of packages such as itpp [], perl-class-methodmaker [], libnet [], directfb [] and mm-common [], as well as updating the version of reprotest to 0.7.21 [].

In openSUSE, Bernhard M. Wiedemann published his usual openSUSE monthly report.

diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 220 and 221 to Debian, as well as made the following changes:

  • Update external_tools.py to reflect changes to xxd and the vim-common package. []
  • Depend on the dedicated xxd package now, not the vim-common package. []
  • Don’t crash if we can open a PDF file using the PyPDF library, but cannot subsequently parse the annotations within. []

In addition, Vagrant Cascadian updated diffoscope in GNU Guix, first to version 220 [] and later to 221 [].
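As a rough illustration of how diffoscope is typically used to diagnose a reproducibility issue, the sketch below compares two build artifacts and writes a plain-text report. The wrapper `compare_artifacts` and the default report name are hypothetical; the fallback to `cmp` when diffoscope is not installed is an assumption made so that the sketch runs anywhere, and it only reports whether the files differ, not why.

```shell
#!/bin/sh
# Sketch: compare two artifacts, preferring diffoscope when available.
# compare_artifacts is a hypothetical helper name.
#   $1, $2: the two files to compare
#   $3:     optional path for the text report
# Returns 0 if the artifacts are identical, non-zero otherwise.
compare_artifacts() {
    if command -v diffoscope >/dev/null 2>&1; then
        # diffoscope exits 0 when the inputs are identical and
        # non-zero when they differ (or when an error occurs).
        diffoscope --text "${3:-diffoscope-report.txt}" "$1" "$2"
    else
        # Fallback (assumption): a byte-wise comparison with no report.
        cmp -s "$1" "$2"
    fi
}

# Example usage (file names are illustrative):
#   compare_artifacts build1/package.deb build2/package.deb report.txt \
#       || echo "artifacts differ, see report.txt"
```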

Upstream patches

The Reproducible Builds project aims to fix as many currently-unreproducible packages as possible as well as to send all of our patches upstream wherever appropriate. This month we created a number of patches, including:

Testing framework

The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, Holger Levsen made the following changes:

  • Debian-related changes:

    • Temporarily add Debian unstable deb-src lines to enable test builds for a Non-maintainer Upload (NMU) campaign targeting 708 sources without .buildinfo files found in Debian unstable, including 475 in bookworm. [][]
    • Correctly deal with the Debian Edu packages not being installable. []
    • Finally, stop scheduling stretch. []
    • Make sure all Ubuntu nodes have the linux-image-generic kernel package installed. []
  • Health checks & view:

    • Detect SSH login problems. []
    • Only report the first uninstallable package set. []
    • Show new bootstrap jobs [] and debian-live jobs [] in the job health view.
    • Fix regular expression to detect various zombie jobs. []
  • New jobs:

    • Add a new job to test reproducibility of the mmdebstrap bootstrapping tool. [][][][]
    • Run our new mmdebstrap job remotely. [][]
    • Improve the output of the mmdebstrap job. [][][]
    • Adjust the mmdebstrap script to additionally support debootstrap. [][][]
    • Work around mmdebstrap and debootstrap keeping logfiles within their artifacts. [][][]
    • Add support for testing cdebootstrap too and add such a job for unstable. [][][]
    • Use a reproducible value for SOURCE_DATE_EPOCH for all our new bootstrap jobs. []
  • Misc changes:

    • Send the create_meta_pkg_sets notification to #debian-reproducible-changes instead of #debian-reproducible. []

In addition, Roland Clobus re-enabled the tests for live-build images [] and added a feature where the build would retry instead of giving up when the archive was synced whilst building an ISO [], and Vagrant Cascadian added logging to report the current target of the /bin/sh symlink [].

Contact

As ever, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:



