Reproducible Builds in April 2019

View all our monthly reports


Welcome to the April 2019 report from the Reproducible Builds project! In these now-monthly reports we will outline the most important things which we have been up to in and around the world of reproducible builds & secure toolchains.

As a quick recap, whilst anyone can inspect the source code of free software for malicious flaws, almost all software is distributed to end users pre-compiled. The motivation behind the reproducible builds effort is to ensure no flaws have been introduced during this compilation process by promising that identical results are always generated from a given source, thus allowing multiple third parties to come to a consensus on whether a build was compromised.

In this month’s report, we will cover:

  • Media coverage: Compromised toolchains, what makes a good digital product?, etc.
  • Upstream news: Scala and Go working on reproducibility, etc.
  • Distribution work: Distributing build certificates, an update from openSUSE, etc.
  • Software development: New features in diffoscope, yet more test framework development, etc.
  • Misc news: From our mailing list, etc.
  • Getting in touch: How to contribute, etc.

Media coverage

  • The SecureList website reported on Operation “ShadowHammer”, a high-profile supply chain attack involving the ASUS Live Update Utility. As their post describes in more detail, tampering with binaries would usually break the digital signature, but in this case the digital signature itself appeared to have been compromised. (Read more)

Upstream news

The first non-trivial library written in the Scala programming language on the Java Virtual Machine was released with Arnout Engelen’s sbt-reproducible-builds plugin enabled during the build. This resulted in Akka 2.5.22 becoming reproducible, both for the artifacts built with version 2.12.8 and 2.13.0-RC1 of the Scala compiler. For 2.12.8, the original release was performed on a Mac and the validation was done on a Debian-based machine, so it appears the build is reproducible across diverse systems. (Mailing list thread)

Jeremiah “DTMB” Orians announced the 1.3.0 release of M2-Planet, a self-hosting C compiler written in a subset of the features it supports. It has been bootstrapped entirely from hexadecimal (!) with 100% reproducible output/binaries. This new release sports a self-hosting port for an additional architecture amongst other changes. Being “self-hosted” is an important property as it can provide a method of validating the legitimacy of the build toolchain.

The Go programming language has been making progress towards reproducible builds. In 2016, Ximin Luo created issue #16860 requesting that the compiler generate the same result regardless of the path in which the package is built. Progress was recently made in Change #173344 (and adjacent changes) that will permit a -trimpath mode that generates binaries that do not contain any local path names, similar to GCC's -ffile-prefix-map.

The fontconfig library for configuring and customising font access in a number of distributions announced they had merged patches to allow various cache files to be reproducible. This is after Chris Lamb posted a historical summary and a request for action to Fontconfig’s mailing list in January 2019.

Distribution work

In Debian, Chris Lamb added 90 reviews of Debian packages, adding to our knowledge about identified issues, and 14 issues were automatically removed. Chris also added two issue types: build_date_in_egg_info_directory_name & randomness_in_perl6_precompiled_libraries.

Holger Levsen started a discussion regarding the distribution of .buildinfo files. These files record the environment that was used for a particular build so that, together with the source code, that environment can be recreated at a later date to reproduce the exact binary. Distributing these files is important so that others can validate that a build is actually reproducible. In his post, Holger refers to two services that now exist, buildinfo.debian.net and buildinfos.debian.net.
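
To make the role of these files more concrete, here is a minimal Python sketch of how a rebuilder might check a rebuilt artifact against the checksums recorded in a .buildinfo file. It assumes a heavily simplified version of the format (a single block of "Key: value" fields with indented continuation lines). The package name, versions and dependency list are placeholders invented for illustration, and the helper functions are hypothetical rather than part of any Debian tool.

```python
import hashlib


def parse_fields(text):
    """Parse 'Key: value' lines; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line[:1] in (" ", "\t") and current is not None:
            fields[current] += "\n" + line.strip()
        elif ":" in line:
            current, _, value = line.partition(":")
            fields[current] = value.strip()
    return fields


def recorded_checksums(fields):
    """Return {filename: (sha256, size)} from the Checksums-Sha256 field."""
    result = {}
    for entry in fields.get("Checksums-Sha256", "").splitlines():
        parts = entry.split()
        if len(parts) == 3:
            digest, size, name = parts
            result[name] = (digest, int(size))
    return result


def matches(recorded, name, data):
    """True if the rebuilt bytes have the recorded size and SHA-256 digest."""
    digest, size = recorded[name]
    return len(data) == size and hashlib.sha256(data).hexdigest() == digest


# Placeholder artifact and a made-up, simplified .buildinfo describing it.
artifact_name = "hello_2.10-1_amd64.deb"
artifact = b"placeholder package contents\n"

buildinfo_text = f"""\
Format: 1.0
Source: hello
Version: 2.10-1
Architecture: amd64
Checksums-Sha256:
 {hashlib.sha256(artifact).hexdigest()} {len(artifact)} {artifact_name}
Build-Architecture: amd64
Installed-Build-Depends:
 gcc-8 (= 8.3.0-6),
 make (= 4.2.1-1.2)
Environment:
 LANG="C"
 SOURCE_DATE_EPOCH="1554076800"
"""

fields = parse_fields(buildinfo_text)
recorded = recorded_checksums(fields)

# A bit-for-bit identical rebuild matches; any change to the bytes does not.
print(matches(recorded, artifact_name, artifact))
print(matches(recorded, artifact_name, artifact + b"tampered"))
```

A real verifier would also need to recreate the recorded build dependencies and environment before rebuilding; the sketch only covers the final comparison step.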

In addition, Holger restarted a long-running discussion regarding the reproducibility status of Debian buster, touching on questions of potentially performing mass rebuilds of all packages in order that they use updated toolchains.

There was yet more progress towards making the Debian Installer images reproducible. Following on from last month, Chris Lamb performed some further testing of the generated images. Cyril Brulebois then made an upload of the debian-installer package to Debian that included a number of Chris’ patches, and Vagrant Cascadian filed a patch to fix the reproducibility of “u-boot” images by using the -n argument to gzip(1).
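
The effect of gzip's -n option (do not store the original file name and timestamp) can be illustrated with Python's standard gzip module. This is only a sketch of the underlying idea, not the patch that was filed: the payload, file name and timestamps are placeholders, and gzip_bytes is a helper defined here for the example.

```python
import gzip
import io


def gzip_bytes(data, mtime, filename=""):
    """Compress data, writing the given mtime and file name into the gzip header."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


payload = b"placeholder image contents\n"

# Two builds at different times: the timestamp leaks into the header.
stamped_a = gzip_bytes(payload, mtime=1554076800, filename="u-boot.img")
stamped_b = gzip_bytes(payload, mtime=1554163200, filename="u-boot.img")

# Rough equivalent of `gzip -n`: no stored name and a zeroed timestamp.
clean_a = gzip_bytes(payload, mtime=0)
clean_b = gzip_bytes(payload, mtime=0)

print(stamped_a == stamped_b)  # the headers differ
print(clean_a == clean_b)      # byte-for-byte identical
```

The compressed payload is the same in every case; only the header fields vary, which is why dropping them is enough to make the output deterministic here.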

Bernhard M. Wiedemann posted his monthly Reproducible Builds status update for the openSUSE distribution. Bernhard also posted to our mailing list regarding enabling the normalisation of file modification times in Python .pyc files and opened issue #1133809 in the openSUSE bug tracker.
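
To show why modification times matter for .pyc files, the following Python sketch compiles the same source twice with different source mtimes. With the traditional timestamp-based invalidation the mtime is embedded in the .pyc header, so the outputs differ. Hash-based .pyc files (PEP 552, Python 3.7 and later) are shown as one way to avoid this; it is not necessarily the approach discussed for openSUSE. The file name and timestamps are placeholders and compile_with_mtime is a helper written for this example.

```python
import os
import py_compile
import tempfile
from pathlib import Path


def compile_with_mtime(source, mtime, mode):
    """Set the source file's mtime, compile it, and return the raw .pyc bytes."""
    os.utime(source, (mtime, mtime))
    target = source.with_suffix(".pyc")
    py_compile.compile(
        str(source), cfile=str(target), doraise=True, invalidation_mode=mode
    )
    return target.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.py"
    src.write_text("ANSWER = 42\n")

    # Timestamp-based .pyc: the source mtime is stored in the header.
    by_time = py_compile.PycInvalidationMode.TIMESTAMP
    timestamp_a = compile_with_mtime(src, 1554076800, by_time)
    timestamp_b = compile_with_mtime(src, 1554163200, by_time)

    # Hash-based .pyc: a hash of the source is stored instead of the mtime.
    by_hash = py_compile.PycInvalidationMode.CHECKED_HASH
    hashed_a = compile_with_mtime(src, 1554076800, by_hash)
    hashed_b = compile_with_mtime(src, 1554163200, by_hash)

print(timestamp_a == timestamp_b)  # differ only because of the mtime
print(hashed_a == hashed_b)        # independent of the mtime
```

Normalising the source mtime before compilation is the other obvious route: if every build sees the same mtime, the timestamp-based header is identical too.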


Software development

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches.

diffoscope

diffoscope is our in-depth “diff-on-steroids” utility which helps us diagnose reproducibility issues in packages. It does not define reproducibility, but rather provides helpful and human-readable guidance for packages that are not reproducible, rather than relying on essentially-useless binary diffs.

This month, Chris Lamb did a lot of development of diffoscope, including:

  • Updated the certificate of the try.diffoscope.org web-based version of the tool.

  • Uploaded version 114 to the Debian experimental distribution and made the corresponding upload to the PyPI package repository.

  • Added support for semantic comparison of GnuPG “keybox” (.kbx) files. (#871244)

  • Added the ability to treat missing tools as failures if a “magic” environment variable is detected. This allows missing required tools in the Debian autopkgtests to be reported as actual test failures rather than as skipped tests. The behaviour of the existing testsuite remains unchanged. (#905885)

  • Filed a “request for packaging” for the annocheck tool which can be used to “analyse an application’s compilation”. This is part of an outstanding wishlist issue. (#926470)

  • Consolidated on a single alias as the exception value across the entire codebase.

In addition, Vibhu Agrawal ensured that diffoscope fails more gracefully when running out of disk space, resolving Debian bug #874582, and Vagrant Cascadian updated to diffoscope 114 in GNU Guix. Thanks!
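
The “missing tools as failures” behaviour listed above can be sketched in a few lines of Python. This is an illustration of the idea only, not diffoscope's actual code: the environment variable name and the require_tool helper are hypothetical and chosen just for this example.

```python
import os
import shutil
import unittest

# Hypothetical variable name, used only in this sketch.
FAIL_ON_MISSING_ENV = "EXAMPLE_TESTS_FAIL_ON_MISSING_TOOLS"


def require_tool(name, environ=None):
    """Skip the calling test if a tool is missing, or fail if the variable is set."""
    environ = os.environ if environ is None else environ
    if shutil.which(name) is not None:
        return  # tool is available, the test can proceed
    message = f"required tool {name!r} is not installed"
    if environ.get(FAIL_ON_MISSING_ENV) == "1":
        # In a CI-style environment, a missing tool is a real failure.
        raise AssertionError(message)
    # By default, keep the existing behaviour and skip the test.
    raise unittest.SkipTest(message)
```

A test would call require_tool("some-tool") before exercising a comparator. On a developer machine the test is skipped if the tool is absent, while an environment that exports the variable sees a hard failure, so a missing test dependency cannot go unnoticed.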

strip-nondeterminism

strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. It is used automatically in most Debian package builds. This month, Chris Lamb made the following improvements:

  • Work around Archive::Zip’s incorrect handling of the localExtraField class member field by monkey-patching the accessor methods to always return normalised values. This fixes the normalisation of Unix ownership metadata within .zip and .epub files. (#858431)

  • Actually check the return status from Archive::Zip when writing a file to disk.

  • Catch an edge-case where we can’t parse the length of a particular field within .zip files.

Chris then uploaded version 1.1.3-1 to the Debian experimental distribution.
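
For readers unfamiliar with what “normalising” a .zip file involves, here is a small Python sketch that writes an archive with fixed metadata. It is not how strip-nondeterminism works (that tool is written in Perl and rewrites existing archives); it only illustrates the kind of per-member metadata that has to be pinned down. The normalised_zip helper, the fixed date and the chosen permissions are assumptions made for this example.

```python
import io
import zipfile

# The earliest timestamp the zip format can represent.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def normalised_zip(members):
    """Build a zip from {name: bytes} with deterministic order and metadata."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name in sorted(members):  # fixed member order
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3               # always record "Unix" as the creator
            info.external_attr = 0o100644 << 16  # regular file, rw-r--r--
            info.extra = b""                     # no extra fields (e.g. UID/GID data)
            archive.writestr(info, members[name])
    return buf.getvalue()


first = normalised_zip({"b.txt": b"two", "a.txt": b"one"})
second = normalised_zip({"a.txt": b"one", "b.txt": b"two"})

print(first == second)  # identical regardless of insertion order
```

Ownership information, when present, lives in the per-member “extra” fields, which is why mishandling those fields can leave build-specific data behind even after timestamps have been fixed.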

Project website

Chris Lamb made a number of improvements to our project website this month, including:

  • Using an explicit “draft” boolean flag for posts, as Jekyll in Debian stable silently (!) does not support the where_exp filter.

  • Moving more pages from the old HTML-based design to Markdown formatting and the new design template.

  • Adding a simple Makefile to implicitly document how to build the site and adding a simple .gitlab-ci.yml to test branches/builds.

  • Adding a simple “lint” command so we can see how many pages are using the old style.

  • Adding an explicit link to our “Who is involved?” page in the footer of the newer design and adding a link to the donation page.

  • Moving various bits of infrastructure to support a monthly report structure.

Test framework

We operate a comprehensive Jenkins-based testing framework that powers tests.reproducible-builds.org. The following changes were made in the last month:

  • Holger Levsen (Debian-related changes):

    • Add new experimental buildinfos.debian.net service.
    • Allow pushing of .buildinfo files from coccia.
    • Permit rsync to write into subdirectories.
    • Include the meta “pool” job in the overall job health view.
    • Add support for host-specific SSH authorized_keys files used on a particular build node.
    • Show link to maintenance jobs for offline nodes.
    • Increase the job timeout for some runners from 3 to 5 days.
    • Don’t try to turn Jenkins or nodes offline too quickly.
    • Fix pbuilder lock files if necessary.
  • Mattia Rizzolo:

    • Special-case the debian-installer package when building to allow it access to the internet.
    • Force installing the debootstrap from stretch backports and remove cdebootstrap.
    • Install the python3-yaml package on nodes as it is needed by the deploy script.
    • Add/update the new reproducible-builds.org MX records.
    • Fix typo in comment; thanks to ijc for reporting!

Holger Levsen, Mattia Rizzolo and Vagrant Cascadian all performed a large amount of build node maintenance and system & Jenkins administration, and Chris Lamb provided a patch to avoid double spaces in IRC notifications.


Misc news


Getting in touch

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Alternatively, you can get in touch with us via:



This month’s report was written by Arnout Engelen, Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Mattia Rizzolo and Vagrant Cascadian & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.



