Reproducible Builds in March 2020

View all our monthly reports


Welcome to the March 2020 report from the Reproducible Builds project. In our reports we outline the most important things that we have been up to over the past month and some plans for the future.

What are reproducible builds?

One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security.

However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes.


News

The report from our recent summit in Marrakesh was published and is now available in both PDF and HTML formats. A sincere thank you to all of the Reproducible Builds community for the input to the event, and a sincere thank you to Aspiration for preparing and collating this report.

Harmut Schorrig published a detailed document on how to compile Java applications in such a way that the .jar build artefact is reproducible across builds. A practical and hands-on guide, it details how to avoid unnecessary differences between builds by explicitly declaring an encoding (as the default value differs across Linux and MS Windows systems), by ensuring that the generated .jar (a variant of a .zip archive) does not embed any nondeterministic filesystem metadata, and so on.
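The two ideas in that guide (an explicit encoding and an archive free of nondeterministic metadata) can be illustrated outside of Java too. The following is a minimal Python sketch, not the method from the guide itself: the function name `build_deterministic_archive` and the fixed timestamp are assumptions chosen purely for illustration.

```python
import io
import zipfile

# Arbitrary fixed timestamp for every entry (an assumption for this sketch;
# 1980-01-01 is the earliest date the .zip format can represent).
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def build_deterministic_archive(files: dict[str, str]) -> bytes:
    """Return .zip-style archive bytes that depend only on the contents of `files`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort the names so that entry order never depends on dict or filesystem order.
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions, not the host's
            info.create_system = 3  # always record "Unix", whatever the build host is
            # Encode explicitly rather than relying on the platform default encoding.
            archive.writestr(info, files[name].encode("utf-8"))
    return buffer.getvalue()
```

Calling the function twice with the same mapping, in any insertion order, should yield byte-identical archives when run with the same Python and zlib versions.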

Janneke gave a quick presentation on GNU Mes and reproducible builds during the lightning talk session at LibrePlanet 2020.

Vagrant Cascadian presented There and Back Again, Reproducibly! (video) at SCaLE 18x in Pasadena, California, which generated some attention on Twitter.

Hervé Boutemy mentioned on our mailing list, in a thread titled “Rebuilding and checking Reproducible Builds from Maven Central repository”, that since the update of a central build script (the “parent POM”) every Apache project using the Maven build system should build reproducibly. A follow-up discussion regarding how to perform such rebuilds was also started on the Apache mailing list.

The Telegram instant-messaging platform announced that they had updated their iOS and Android applications and claimed that they are reproducible according to their full instructions, which are intended to let anyone verify that the original source code is exactly the same code that is used to build the versions available on the Apple App Store and Google Play distribution platforms respectively.

Hervé Boutemy also reported about a new project called reproducible-central which aims to allow anyone to rebuild a component from the Maven Central Repository that is expected to be reproducible and check that the result is as expected.

In last month’s report we detailed Omar Navarro Leija’s work in and around an academic paper titled Reproducible Containers, which describes in detail the workings of a user-space container tool called dettrace (PDF). Since then, the PhD student from the University of Pennsylvania presented on this tool at the ASPLOS 2020 conference in Lausanne, Switzerland. Furthermore, there were contributions to dettrace from the Reproducible Builds community itself.


Distribution work

openSUSE

In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update and made the following changes within the distribution itself:

Debian

Chris Lamb further refined his merge request for the debian-installer component to allow all arguments from sources.list files (such as “[check-valid-until=no]”) so that we can test the reproducibility of the installer images on the Reproducible Builds project’s own testing infrastructure. (#13)

Holger Levsen filed a number of bug reports against the debrebuild tool that attempts to rebuild a Debian package given a .buildinfo file as input, including:

48 reviews of Debian packages were added, 17 were updated and 34 were removed this month, adding to our knowledge about identified issues. Many issue types were noticed, categorised and updated by Chris Lamb, including:

Finally, Holger opened a bug report against the software running tracker.debian.org (a service for Debian Developers to follow the evolution of packages via web and email interfaces) to request that they integrate information from buildinfos.debian.net (#955434), and Chris Lamb kept isdebianreproducibleyet.com up to date.


Software development

diffoscope

Chris Lamb made the following changes to diffoscope, the Reproducible Builds project’s in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including preparing and uploading version 138 to Debian:

  • Improvements:

    • Don’t allow errors with “R” script deserialisation to cause the entire operation to fail, for example if an external library cannot be loaded. (#91)
    • Experiment with memoising output from expensive external commands, e.g. readelf. (#93)
    • Use dumppdf from the python3-pdfminer package if we do not see any other differences from pdftext, etc. (#92)
    • Prevent a traceback when comparing two R .rdx files directly, as the get_member method will return a file even if the file is missing.
  • Reporting:

    • Display the supported file formats in the package long description. (#90)
    • Print a potentially-helpful message if the PyPDF2 module is not installed.
    • Remove any duplicate comparator descriptions when formatting in the --help output or in the package long description.
    • Weaken “Install the X package to get a better output” message to “… may produce a better output” as the former is not actually guaranteed.
  • Misc:

    • Ensure we only parse the recommended packages from --list-debian-substvars when we want them for debian/tests/control generation.
    • Add an upstream metadata file and add a Lintian override for upstream-metadata-in-native-source as “we” are upstream.
    • Inline the RequiredToolNotFound.get_package method’s functionality as it is only used once.
    • Drop the deprecated “py36 = [..]” argument in the pyproject.toml file.
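The idea of memoising output from expensive external commands, mentioned in the improvements above, can be sketched with the Python standard library. This is only an illustration under stated assumptions and not diffoscope’s actual implementation: `run_cached` is a hypothetical helper, and the cache key is simply the command line, so it is only valid while the files being inspected do not change.

```python
import functools
import subprocess
import sys


@functools.lru_cache(maxsize=128)
def run_cached(command: tuple[str, ...]) -> str:
    """Run `command` once and reuse its stdout for identical later calls.

    The command must be a tuple (not a list) so that it is hashable.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # A stand-in for an expensive tool such as readelf.
    command = (sys.executable, "-c", "print('expensive output')")
    first = run_cached(command)   # spawns a subprocess
    second = run_cached(command)  # served from the in-memory cache
    print(first == second, run_cached.cache_info())
```

Keying the cache on the argument tuple keeps the sketch small; a real tool would likely also need to account for file contents or modification times.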

In addition, Vagrant Cascadian updated diffoscope in GNU Guix to version 138, as well as updating reprotest (our end-user tool that builds the same source code twice in widely differing environments and then checks the binaries produced by each build for any differences) to version 0.7.14.
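The final “check the binaries for differences” step can be approximated by hashing two build output trees and listing the paths that disagree. The sketch below is a simplified illustration only, not how reprotest works internally; it does not vary the build environment, and the helper names `hash_tree` and `differing_files` are hypothetical.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file path (relative to `root`) to the SHA-256 digest of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(first: Path, second: Path) -> list[str]:
    """Return the sorted paths that are missing from one tree or differ in content."""
    a, b = hash_tree(first), hash_tree(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list means that the two trees are bit-for-bit identical; any path that is reported is a candidate for closer inspection with a tool such as diffoscope.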

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month we wrote a large number of such patches, including:

Project documentation

There was further work performed on our documentation and website this month, including Alex Wilson adding a section regarding using Gradle for reproducible builds in JVM projects and Holger Levsen adding the report from our recent summit in Marrakesh.

In addition, Chris Lamb made a number of changes, including correcting the syntax of some CSS class formatting, improving some “filed against” copy and correcting a reference to the calendar.monthrange Python method in a utility function.

Testing framework

We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org. Amongst many other tasks, it tracks the status of our reproducibility efforts and identifies any regressions that have been introduced.

This month, Chris Lamb reworked the web-based package rescheduling tool to:

  • Require an HTTP POST method in the web-based scheduler: not only should HTTP GET requests be idempotent, but this will also allow many future improvements in the user interface.
  • Improve the authentication error message in said rescheduler to suggest that the developer’s SSL certificate may have expired.
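To illustrate the first point, an endpoint that changes state can refuse anything other than POST and answer with “405 Method Not Allowed”. The following is a minimal WSGI sketch and not the code behind the actual rescheduling tool; the `reschedule_app` name and the response texts are assumptions made for illustration.

```python
def reschedule_app(environ, start_response):
    """Hypothetical WSGI endpoint that only schedules a rebuild on POST."""
    if environ["REQUEST_METHOD"] != "POST":
        # GET and other methods must not trigger a state change, so reject
        # them and advertise the permitted method in the Allow header.
        start_response(
            "405 Method Not Allowed",
            [("Allow", "POST"), ("Content-Type", "text/plain")],
        )
        return [b"Use POST to schedule a rebuild.\n"]

    # A real scheduler would authenticate the request and queue the package here.
    start_response("202 Accepted", [("Content-Type", "text/plain")])
    return [b"Rebuild scheduled.\n"]
```

The application can be served for local experimentation with `wsgiref.simple_server.make_server("", 8000, reschedule_app).serve_forever()` from the standard library.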

In addition, Holger Levsen made the following changes:

  • Add a new ath79 subtarget for the OpenWrt distribution.
  • Revisit ordering of Debian suites; sort the experimental distribution last and reverse the ordering of suites to prioritise the suites in development.
  • Schedule Debian buster and bullseye a little less in order to allow unstable to catch up on the i386 architecture.
  • Various cosmetic changes to the web-based scheduler.
  • Improve wordings in the node health maintenance output.

Lastly, Vagrant Cascadian updated a link to the (formerly) weekly news to point to our reports page and kpcyrd fixed the escaping in an Alpine Linux inline patch. The usual build nodes maintenance was performed by Holger Levsen, Mattia Rizzolo and Vagrant Cascadian.


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Alternatively, you can get in touch with us via:


This month’s report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.



