Reproducible Builds in April 2021

View all our monthly reports


Welcome to the April 2021 report from the Reproducible Builds project!

In these reports we try to outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. If you are interested in contributing to the project, please visit our Contribute page on our website.


A preprint of a paper by Chris Lamb and Stefano Zacchiroli (which will shortly appear in IEEE Software) has been made available on the arXiv.org service. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), the abstract of the paper contains the following:

We first define the problem, and then provide insight into the challenges of making real-world software build in a “reproducible” manner - this is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA). []
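
As a rough illustration of what “bit-for-bit identical” means in practice, the following minimal sketch compares the SHA-256 digests of two build artifacts. The function names are our own and purely illustrative, and a real verification would also need to rebuild the artifact in a controlled environment first.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproducible(first: Path, second: Path) -> bool:
    """Two artifacts count as reproducible if their digests match."""
    return sha256_of(first) == sha256_of(second)
```

If the digests differ, a tool such as diffoscope can then be used to find out where and why the two artifacts diverge.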


Elsewhere on the internet, Igor Golovin on the Kaspersky security blog reported that APKPure, an alternative app store for Android apps, began to distribute “an advertising SDK from an unverified source [that] turned out to be malicious” []. Elaborating further, Igor wrote that the malicious code had “much in common with the notorious Triada malware and can perform a range of actions: from displaying and clicking ads to signing up for paid subscriptions and downloading other malware”.


Closer to home, Jeremiah Orians wrote to our mailing list reporting that it is now possible to bootstrap the GCC compiler without using the pre-generated Bison grammar files, part of a broader attempt to provide a “reproducible, automatic [and] complete end-to-end bootstrap from a minimal number of binary seeds to a supported fully functioning operating system” []. In addition, Richard Clobus started a thread on potential problems with the -Wl,--build-id=sha1 linker flag, which can later be used when analysing core dumps and tracebacks. According to the Red Hat Customer Portal:

Each executable or shared library built with Red Hat Enterprise Linux Server 6 or later is assigned a unique identification 160-bit SHA-1 string, generated as a checksum of selected parts of the binary. This allows two builds of the same program on the same host to always produce consistent build-ids and binary content. (emphasis added)
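
For readers unfamiliar with build IDs: with --build-id=sha1, the linker stores the 160-bit identifier in an ELF note (section .note.gnu.build-id, owner "GNU", type NT_GNU_BUILD_ID). The sketch below decodes that note from the raw section contents. It assumes the section bytes have already been extracted from the binary by other means, and the function name is our own.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # note type used for GNU build IDs


def parse_build_id(note: bytes, little_endian: bool = True) -> Optional[str]:
    """Return the hex build ID found in raw ELF note data, or None."""
    header = "<III" if little_endian else ">III"
    offset = 0
    while offset + 12 <= len(note):
        namesz, descsz, note_type = struct.unpack_from(header, note, offset)
        offset += 12
        name = note[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # name is padded to 4 bytes
        desc = note[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded to 4 bytes
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None
```

For a SHA-1 build ID the descriptor is 20 bytes long, so the returned string has 40 hexadecimal characters.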


Lastly, Felix C. Stegerman reported on the latest release of apksigcopier. apksigcopier is a tool to copy, extract and patch .apk signatures, and is needed to facilitate reproducible builds on the F-Droid Android application store and elsewhere. Holger Levsen subsequently sponsored an upload to Debian.


Software development

Distribution work

An issue was discovered in Arch Linux regarding packages that were previously considered reproducible. After some investigation, it was determined that the build’s CFLAGS could vary between two previously ‘reproducible’ builds.

The cause was attributed to the fact that in Arch Linux, the devtools package determines the build configuration, but in the development branch it had been inadvertently copying the makepkg.conf file from the pacman package (the devtools version had been fixed in the recent release). This meant that whenever Arch Linux releases a devtools package with updated CFLAGS (or similar), old packages could fail to build reproducibly as they would be rebuilt in a different build environment.

To address this problem, Levente Polyak sent a patch to the pacman mailing list to include the version of devtools in the relevant BUILDINFO file. This means that the repro tool can now install the corresponding makepkg.conf file when attempting to validate a reproducible build.
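
To give a feel for how a rebuilder might consume this information, here is a minimal sketch that parses the "key = value" lines of a BUILDINFO file. The buildtool and buildtoolver field names, and all of the values in the sample, are assumptions made for illustration and are not taken from the patch itself.

```python
from collections import defaultdict
from typing import Dict, List


def parse_buildinfo(text: str) -> Dict[str, List[str]]:
    """Parse 'key = value' lines; keys may repeat (e.g. 'installed')."""
    fields: Dict[str, List[str]] = defaultdict(list)
    for line in text.splitlines():
        key, separator, value = line.strip().partition(" = ")
        if not separator:
            continue  # skip blank or malformed lines
        fields[key].append(value)
    return dict(fields)


# Hypothetical sample; field names and values are illustrative only.
SAMPLE = """\
format = 1
pkgname = example
buildtool = devtools
buildtoolver = 20210202-3
installed = foo-1.0-1-x86_64
installed = bar-2.0-1-any
"""

info = parse_buildinfo(SAMPLE)
tool, version = info["buildtool"][0], info["buildtoolver"][0]
```

With the tool name and version in hand, a rebuilder could fetch that exact devtools release and use its makepkg.conf rather than whatever happens to be installed locally.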


In Debian, Frédéric Pierret continued working on debian.notset.fr, a partial copy of the snapshot.debian.org “wayback machine” service for the Debian archive that is limited to the packages needed to rebuild the bullseye distribution on the amd64 architecture. This is to work around some perceived limitations of snapshot.debian.org. Since last month, the service covers from mid-2020 onwards, and a request was made to the Debian sysadmin team to obtain better access to snapshot.debian.org in order to further accelerate the initial seeding. In addition, the service now supports more endpoints in the API (full documentation), including a timestamp endpoint to track the sync in a machine-readable way.

Twenty-one reviews of Debian packages were performed, nine were updated and sixteen were removed this month, adding to our large taxonomy of identified issues. A number of issue types have been updated too, including the removal of the random_order_in_javahelper_substvars issue type [] and the addition of a new timestamps_in_pdf_generated_by_libreoffice toolchain issue by Chris Lamb [].


Lastly, Bernhard M. Wiedemann posted his monthly reproducible builds status report for the openSUSE distribution.

Upstream patches

diffoscope

diffoscope is the Reproducible Builds project’s in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it provides human-readable diffs from many kinds of binary formats. This month, Chris Lamb made a number of changes including releasing version 172 and version 173:

  • Add support for showing annotations in PDF files. (#249)
  • Move to the assert_diff helper in test_pdf.py. []

In addition, Mattia Rizzolo attempted to make the testsuite pass with file(1) version 5.40 [] and Zachary T. Welch updated the __init__ magic method of the Difference class to demote the unified_diff argument to a Python ‘kwarg’ [].
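
As a generic illustration of what demoting an argument to a ‘kwarg’ means, the sketch below uses a simplified, hypothetical stand-in class. It is not diffoscope’s actual Difference class, and the remaining parameter names are assumptions.

```python
class ExampleDifference:
    """Hypothetical stand-in used only to illustrate the calling style."""

    def __init__(self, path1, path2, source=None, comment=None, unified_diff=None):
        # unified_diff is optional and has a default, so callers that
        # have no diff text to supply can simply leave it out.
        self.path1 = path1
        self.path2 = path2
        self.source = source
        self.comment = comment
        self.unified_diff = unified_diff


# Callers now name the argument explicitly when they need it.
with_diff = ExampleDifference("a.pdf", "b.pdf", unified_diff="@@ -1 +1 @@\n-x\n+y\n")
without_diff = ExampleDifference("a.pdf", "b.pdf", comment="no textual diff")
```

One benefit of this style is that call sites read more clearly, since the optional value is labelled where it is passed.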

Website and documentation

Quite a few changes were made to the main Reproducible Builds website and documentation this month, including:

  • Chris Lamb:

    • Highlight our mailing list on the Contribute page. []
    • Add a noun (and drop an unnecessary full-stop) on the landing page. [][]
    • Correct a reference to the date metadata attribute on reports, restoring the display of months on the homepage. []
    • Correct a typo of “instalment” within a previous news entry. []
    • Add a conspicuous “draft” banner to unpublished blog posts in order to match the report draft banner. []
  • Mattia Rizzolo:

    • Various improvements to the sponsors pages. [][][]
    • Add the project’s platinum-level sponsors to the homepage. []
    • Use a CSS class instead of specifying an inline style HTML attribute. []

Testing framework

The Reproducible Builds project operates a Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, the following changes were made:

  • Holger Levsen:

    • Debian:

      • Update README to reflect that Debian buster is now the ‘stable’ distribution. []
      • Support fully-qualified domain names in the powercycle script for the armhf architecture. []
      • Improve the handling of node names (etc.) for armhf nodes. [][]
      • Improve the detection and classification of packages maintained by the Debian accessibility team. [][]
      • Count the number of configured armhf nodes correctly by ignoring comments at the end of lines. []
    • Health checks

      • Improve checks for broken OpenSSH ports. [][]
      • Detect failures of NetBSD’s make release. []
      • Catch another log message variant that specifies a host is running an outdated kernel. []
      • Automatically restart failed systemd-journal-flush systemd services. []
    • Other:

      • Update FreeBSD to 13.0. []
      • Be less picky about “too many” installed kernels on hosts which have a large enough /boot partition. [][][]
      • Unify some IRC output. []
  • Mattia Rizzolo:

    • Fix a regular expression in automatic log-parsing routines. []
  • Vagrant Cascadian:

    • Add some new armhf architecture build nodes, virt32b and virt64b. [][][]
    • Rearrange armhf build jobs to only use active nodes. []
    • Add a health check for broken OpenSSH ports on virt32a. []
    • Mark which armhf architecture jobs are not systematically varying 32-bit and 64-bit kernels. []
    • Disable creation of Debian stretch build tarballs and update the README file to mention bullseye instead. []
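
The change above that counts configured armhf nodes by ignoring end-of-line comments can be sketched roughly as follows. This is not the framework’s actual code: the one-node-per-line file format is an assumption, and the comments in the sample list are invented for illustration.

```python
def count_configured_nodes(config_text: str) -> int:
    """Count non-empty entries after stripping '#' comments from each line."""
    count = 0
    for raw_line in config_text.splitlines():
        # Drop anything from the first '#' onwards, then surrounding space.
        entry = raw_line.split("#", 1)[0].strip()
        if entry:
            count += 1
    return count


# Hypothetical node list: one node per line, with optional comments.
NODES = """\
# armhf build nodes
virt32a   # 32-bit kernel
virt32b
    # a comment-only line

virt64b # 64-bit kernel
"""

configured = count_configured_nodes(NODES)
```

Stripping the comment before testing for an empty line is what prevents comment-only lines from being counted as nodes.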

Finally, build node maintenance was performed by Holger Levsen [][][][], Mattia Rizzolo [] and Vagrant Cascadian [][][].


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:



