Reproducible Builds in May 2021

View all our monthly reports


Welcome to the May 2021 report from the Reproducible Builds project

In these reports we try to highlight the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. If you are interested in contributing to the project, please visit our Contribute page on our website.


The president of the United States signed an executive order this month outlining policies aimed at improving cybersecurity in the US. The executive order comes after a number of highly-publicised security problems such as a ransomware attack that affected an oil pipeline between Texas and New York and the SolarWinds hack that affected a large number of US federal agencies.

A summary of the (8,000-word) document is available, but section four is relevant in the context of reproducible builds. Titled “Enhancing Software Supply Chain Security”, it outlines a plan that might involve:

requiring developers to maintain greater visibility into their software and making security data publicly available. It stands up a concurrent public-private process to develop new and innovative approaches to secure software development and uses the power of Federal procurement to incentivize the market. Finally, it creates a pilot program to create an “energy star” type of label so the government – and the public at large – can quickly determine whether software was developed securely.

In response to this Executive Order, the US National Institute of Standards and Technology (NIST) announced that they would host a virtual workshop in early June, both to respond to the order and to attempt to fulfill its terms. In addition, David Wheeler published a post on the Linux Foundation’s blog on the topic. Titled “How LF communities enable security measures required by the US Executive Order on Cybersecurity”, David’s post explicitly mentions reproducible builds, particularly the Yocto Project’s support for fully-reproducible builds.


David A. Wheeler posted to our mailing list to announce that the public defense of his Fully Countering Trusting Trust through Diverse Double-Compiling (DDC) PhD thesis at George Mason University is now available online.


Dan Shearer announced a new tool called “Not-Forking”, which attempts to avoid duplicating the source code of one project within another. This is highly relevant in the context of reproducible builds, as embedded code copies are often the cause of reproducibility issues: in many cases, addressing the problem upstream (and then ensuring a fixed version is available in distributions) is not a sufficient fix, as any embedded code copies remain unaffected. (This has been observed a number of times, particularly with embedded copies of help2man and similar documentation generation tools.)


Due to the recent upheavals on the Freenode IRC network, the #archlinux-reproducible channel has moved to Libera Chat. (The more general #reproducible-builds IRC channel, which is hosted on the OFTC network, has not moved.)


On our mailing list, Marcus Hoffman started a thread after he was unable to hunt down the cause of an unreproducible build of an Android APK package. Bernhard M. Wiedemann managed to track the problem down to a ‘pg-map-id’ field and a related checksum. This resulted in an issue being reported against Google’s Android toolchain which, as Marcus himself wrote, “hope it get’s fixed this year”.


Roland Clobus reported on our mailing list this month on his progress towards making the Debian ‘Live’ image reproducible, coordinating with Holger Levsen to add automatic, daily testing of Live images and to produce diffoscope reports when the images are not reproducible. Elsewhere in Debian, 9 reviews of Debian packages were added, 8 were updated and 29 were removed this month, adding to our knowledge about identified issues. Chris Lamb also identified a new random_uuid_in_notebooks_generated_by_nbsphinx toolchain issue.

Software development

Upstream patches

diffoscope

diffoscope is the Reproducible Builds project’s in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it provides human-readable diffs from many kinds of binary formats. This month, Chris Lamb made a number of changes including releasing version 174, version 175 and version 176:

  • Bug fixes:

    • Check that we are parsing an actual Debian .buildinfo file, not just a file with that particular extension — after all, it could be any file. (#254, #987994)
    • Support signed .buildinfo files again. It appears that some versions of file(1) report them as PGP signed message. []
    • Use the actual filesystem path name (instead of diffoscope’s concept of the source archive name) in order to correct filename filtering when an APK file has been extracted from a container format. In particular, we need to filter the auto-incremented 1.apk instead of original-name.apk. (#255)
  • New features:

    • Update ffmpeg tests to work with version 4.4. (#258)
    • Correct grammar in a fsimage.py debug message. []
  • Misc:

    • Don’t unnecessarily call os.path.basename twice in the Android APK comparator. []
    • Added instructions on how to install diffoscope on openSUSE on the diffoscope website [].
    • Add a comment about stripping filenames. []
    • Corrected a reference to site.salsa_url which was breaking the “File a new issue” link on the website. []

In addition:

  • Keith Smiley:

    • Improve support for Apple provisioning profiles. []
    • Fix ignoring objdump-related tests on macOS. macOS has a version of objdump(1) that doesn’t support --info, so the tests would fail on that operating system. []
  • Mattia Rizzolo:

    • Fix recognition of compressed .xz archives with file(1) version 5.40. [][]
    • Embed small test fixture in the code itself, rather than a separate file. []
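
The .buildinfo fixes listed above rest on one idea: decide what a file is from its content rather than from its extension, and allow for a PGP signature wrapped around it. The following is a minimal Python sketch of that idea only. It is not diffoscope’s actual implementation; the function name `looks_like_buildinfo` and the choice of fields in `EXPECTED_FIELDS` are assumptions made for illustration.

```python
import re

# Hypothetical selection of field names for this sketch; the real
# comparator may well look at different or additional fields.
EXPECTED_FIELDS = ("Format", "Source", "Version")

PGP_MESSAGE_HEADER = "-----BEGIN PGP SIGNED MESSAGE-----"
PGP_SIGNATURE_HEADER = "-----BEGIN PGP SIGNATURE-----"


def looks_like_buildinfo(text):
    """Return True if `text` has the shape of a .buildinfo control file."""
    lines = text.splitlines()

    # A clearsigned file starts with an armour header, followed by
    # "Hash:" lines and a blank line; the payload begins after that.
    if lines and lines[0].strip() == PGP_MESSAGE_HEADER:
        try:
            lines = lines[lines.index("") + 1:]
        except ValueError:
            return False

    fields = set()
    for line in lines:
        # Stop at the trailing signature block of a signed file.
        if line.startswith(PGP_SIGNATURE_HEADER):
            break
        match = re.match(r"^([A-Za-z0-9-]+):", line)
        if match:
            fields.add(match.group(1))

    return all(name in fields for name in EXPECTED_FIELDS)
```

With this sketch, a file that merely carries the .buildinfo extension but contains none of the expected fields is rejected, whilst both signed and unsigned control files are accepted.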

strip-nondeterminism

Chris Lamb made the following changes to strip-nondeterminism, our tool to remove specific non-deterministic results from a completed build:

  • Added support for Python pyzip files: they require special handling to not mangle the UNIX shebang. (#18)

  • Dropped single-debian-patch, etc. from the Debian source package options. []

  • Version 1.12.0-1 was uploaded to Debian unstable by Chris Lamb.
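
To illustrate why pyzip files need special handling: such a file is a zip archive with a `#!` interpreter line placed in front of it, so a tool that rewrites the archive must put that line back. The sketch below shows the idea in Python using only the standard library. strip-nondeterminism itself is not written in Python and this is not its implementation; the names `read_shebang`, `normalise_pyzip` and `FIXED_TIMESTAMP`, the fixed date and the sorting of members are all assumptions made for illustration.

```python
import io
import zipfile

# Arbitrary fixed timestamp for this sketch (the earliest date that the
# zip format can represent).
FIXED_TIMESTAMP = (1980, 1, 1, 0, 0, 0)


def read_shebang(data):
    """Return the leading '#!...' line including its newline, or b''."""
    if not data.startswith(b"#!"):
        return b""
    end = data.find(b"\n")
    return data[: end + 1] if end != -1 else b""


def normalise_pyzip(data):
    """Rewrite zip members with a fixed timestamp, keeping the shebang."""
    out = io.BytesIO()
    # Write the shebang first, then let zipfile append the archive after
    # it, so the recorded offsets account for the prefix.
    out.write(read_shebang(data))

    # zipfile tolerates data placed in front of the archive, so the
    # original bytes can be opened directly.
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        # Sorting the members is an extra step assumed for this sketch.
        for info in sorted(src.infolist(), key=lambda i: i.filename):
            clean = zipfile.ZipInfo(info.filename, date_time=FIXED_TIMESTAMP)
            clean.compress_type = info.compress_type
            clean.external_attr = info.external_attr
            dst.writestr(clean, src.read(info))

    return out.getvalue()
```

Two archives that differ only in their member timestamps should come out byte-for-byte identical after `normalise_pyzip`, and both should still begin with their original interpreter line.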

Website and documentation

Quite a few changes were made to the main Reproducible Builds website and documentation this month, including:

  • Arnout Engelen:

    • Add a section regarding contributing to NixOS. []
  • Chris Lamb:

    • Incorporate Holger Levsen’s suggestion to improve the homepage text. []
  • Holger Levsen:

    • Make the contribute page look a bit less like it is ‘under construction’, including explaining how we care about all distros and projects. [][][][]
    • Create an Arch Linux contribution page. [][]
    • Make sponsor link visible in the sidebar. []
  • Ian Muchina:

    • Add syntax highlight styles. []
  • Jelle van der Waa:

  • Ludovic Courtès:

    • Explain how to contribute to reproducible builds related to GNU Guix. []
  • Roland Clobus:

    • Added a trailing slash, fixing access to the Debian and Arch Linux contribution pages. []
    • Fix markup as reported by msgfmt. []

Testing framework

The Reproducible Builds project operates a Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, the following changes were made:

  • Holger Levsen:

    • Automatic node health check improvements:

    • Improvements to the common-functions.sh library:

      • Set a more sensible default for the locale early on. []
      • Various visual improvements, including changes to script output. [][]
      • Improve debug output. [][]
      • Only notify an IRC channel if a channel is actually configured. []
    • Improvements to cleanup routines:

      • Cleanup sbuild(1) directories using sudo(8) after three days. [][][]
      • Loosen a regular expression to detect failures when removing stuff. []
    • Misc:

      • Increase kernel inotify(7) watch limit further on all hosts. The value is now four times the default. []
      • Don’t try to install the devscripts package from the buster-backports distribution. []
      • Improve grammar in some comments that are seen every day. []
  • Mattia Rizzolo:

    • Stop filtering out build failures due to -ffile-prefix-map: this flag is the default for the official dpkg package, so these are now “real” build failures. []
    • Export package ‘not for us’ (NFU) and ‘blacklist’ states in the reproducible.json file, but keep excluding them from tracker.json. []
    • Update the IP addresses of armhf architecture hosts. []
    • Properly alternate between -amd64 and -686 Debian kernels on the i386 architecture builders. []
    • Disable the man-db package everywhere, to save time in virtually all apt install/upgrade actions. []
  • Vagrant Cascadian:

    • Add new armhf architecture build nodes. [][]
    • Retire all machines with only 2 GiB of RAM. []
    • Drop Debian buster kernel configurations for cbxi4* and wbq0 hosts. []
    • Keep imx6 systems running Debian buster kernels. []
    • Prepare to switch armhf nodes over to Debian bullseye. [][]

Finally, build node maintenance was performed by Holger Levsen [][][] and Vagrant Cascadian [][][][].


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via our mailing list or the #reproducible-builds IRC channel on the OFTC network.


This month’s report was written by Chris Lamb, Holger Levsen and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.



