Reproducible Builds in December 2020

View all our monthly reports


Greetings and welcome to the December 2020 report from the Reproducible Builds project. In these monthly reports, we try to outline the most important things that have happened in and around the Reproducible Builds project.


In mid-December, it was announced that there was a substantial and wide-reaching supply-chain attack that targeted many departments of the United States government including the Treasury, Commerce and Homeland Security (DHS). The attack, informally known as ‘SolarWinds’ after the manufacturer of the network management software that was central to the compromise, was described by the Washington Post as:

The far-reaching Russian hack that sent U.S. government and corporate officials scrambling in recent days appears to have been a quietly sophisticated bit of online spying. Investigators at cybersecurity firm FireEye, which itself was victimized in the operation, marveled that the meticulous tactics involved “some of the best operational security” its investigators had seen, using at least one piece of malicious software never previously detected.

This revelation is extremely relevant to the Reproducible Builds project because, according to the SANS Institute, it appears that the source code and distribution systems were not compromised. Instead, the build system was, which makes this precisely the kind of attack that reproducible builds is designed to prevent. The SolarWinds attack is further evidence that reproducible builds is important and that it should become a pervasive software engineering principle.
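The reasoning can be illustrated with a minimal sketch: if several independent parties build the same source and compare checksums, a binary altered by a compromised build system no longer matches the others. The builder names and byte strings below are hypothetical placeholders, not real artifacts.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def builds_agree(artifacts: dict) -> bool:
    """True only when every builder produced bit-for-bit identical output."""
    digests = {sha256_of(blob) for blob in artifacts.values()}
    return len(digests) == 1


# Hypothetical outputs from three independent builds of the same source.
honest = {
    "vendor": b"binary-v1",
    "rebuilder-a": b"binary-v1",
    "rebuilder-b": b"binary-v1",
}

# The vendor's build system was tampered with; the source was not.
tampered = dict(honest, vendor=b"binary-v1+implant")

print(builds_agree(honest))
print(builds_agree(tampered))
```

This comparison is only meaningful when the build is reproducible in the first place; otherwise a mismatch cannot be distinguished from ordinary build noise.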

More information on the attack may be found on CNN, CSO, ComputerWeekly, BBC News, etc., and David A. Wheeler started a discussion on our mailing list. Kim Zetter, author of Countdown to Zero Day: Stuxnet and the Launch of the World’s First Digital Weapon, also commented on the attack on Twitter.


Last month, we reported on a fork of the official German Corona App called ‘Corona Contact Tracing Germany’. Since then, the application has been made available on the F-Droid free-software application store. The app is not using the proprietary Google exposure notification framework, but a free software reimplementation by the microG project, staying fully compatible with the official app. The version on F-Droid also supports reproducible builds, and instructions on how to rebuild the package are available from the upstream Git repository. (FSFE’s announcement.)

The Reproducible Central project is an attempt to rebuild binaries published to the Maven Central Repository in a reproducibility context. This month, Hervé Boutemy announced that Reproducible Central was able to successfully rebuild the 100th reproducible release of a project published to the Maven Central Repository (and counting…).

We first wrote about the Threema messaging application in September 2020. This month, the Threema developers announced that their applications have been released under the GNU Affero General Public License (AGPL) (announcement) and that they are now reproducible from version 4.5-beta1 onwards. (Spiegel.de announcement.)

Community news

Vagrant Cascadian announced that there will be another Reproducible Builds ‘office hours’ session on Thursday January 7th, where members of the Reproducible Builds project will be available to answer any questions. (More info.)

On our mailing list, Jeremiah Orians sent a brief update on the status of the Bootstrappable project, noting that it is now possible to build a C compiler requiring “nothing outside of a POSIX kernel”. Bernhard M. Wiedemann also published the minutes of a recent debugging-oriented meeting.

Chris Lamb recently took part in an interview with an intern at the Software Freedom Conservancy to talk about the Reproducible Builds project and the importance of reproducibility in software development:

VB: How would you relate the importance of reproducibility to a user who is non-technical?

CL: I sometimes use the analogy of the food ‘supply chain’ to quickly relate our work to non-technical audiences. The multiple stages of how our food reaches our plates today (such as seeding, harvesting, picking, transportation, packaging, etc.) can loosely translate to how software actually ends up on our computers, particularly in the way that if any of the steps in the multi-stage food supply chain has an issue then it quickly becomes a serious problem.

The full interview can be found on the Conservancy webpages.

Distributions

openSUSE

Adrian Schröter added an option to the scripts powering the Open Build Service to enable deterministic filesystem ordering. Whilst this degrades performance slightly, it also enables dozens of packages in openSUSE Tumbleweed to become reproducible. Also, Bernhard M. Wiedemann published his monthly Reproducible Builds status update for openSUSE Tumbleweed.
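To see why filesystem ordering matters, consider a tool that archives files in whatever order the filesystem returns them. The sketch below is not the Open Build Service implementation; it only shows, under the assumption of a simple tar-based packaging step, how sorting entries and normalising metadata yields identical bytes regardless of the order in which files were created. The helper names are hypothetical.

```python
import io
import os
import tarfile
import tempfile


def deterministic_tar(root: str) -> bytes:
    """Archive the regular files under `root` with sorted entries and fixed metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # sorting in place fixes the traversal order
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                info = tar.gettarinfo(path, arcname=os.path.relpath(path, root))
                # Drop metadata that varies between build hosts and runs.
                info.mtime = 0
                info.mode = 0o644
                info.uid = info.gid = 0
                info.uname = info.gname = ""
                with open(path, "rb") as fh:
                    tar.addfile(info, fh)
    return buf.getvalue()


def make_tree(names):
    """Create a temporary directory whose files are written in the given order."""
    root = tempfile.mkdtemp()
    for name in names:
        with open(os.path.join(root, name), "w") as fh:
            fh.write("contents of " + name)
    return root


first = deterministic_tar(make_tree(["b.txt", "a.txt", "c.txt"]))
second = deterministic_tar(make_tree(["c.txt", "b.txt", "a.txt"]))
print(first == second)
```

Sorting inside every tool is the ideal fix; enforcing a sorted view at the filesystem level, as described above, trades some performance for covering many packages at once.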

Debian

In Debian, Holger Levsen uploaded 540 packages to the unstable distribution that were missing .buildinfo files for Architecture: all packages. Holger described his rationale and approach in a blog post titled On doing 540 no-source-change source-only uploads in two weeks, and he also posted the full list of packages he intends to upload during January 2021 to the debian-devel mailing list:

There are many binary (and source) packages in Debian which were uploaded before 2016 (which is when .buildinfo files were introduced) or were uploaded with binaries until that change in release policy July 2019.

Ivo De Decker scheduled binNMUs for all the affected packages but due to the way binNMUs work, he couldn’t do anything about arch:all packages as they currently cannot be rebuilt with binNMUs.

In recent months, Debian Developer Stuart Prescott has been improving python-debian, a Python library that is used to parse Debian-specific files such as changelogs, .dscs, etc. In particular, Stuart has been working on adding support for .buildinfo files used for recording reproducibility-related build metadata. This month, Stuart uploaded python-debian version 0.1.39 with many changes, including adding a type for .buildinfo files (#875306).

Chris Lamb identified two new issues (timestamps_in_3d_files_created_by_survex & build_path_in_direct_url_json_file_generated_by_flit), and Vagrant Cascadian discovered four ecbuild-related issues (records_build_flags_from_ecbuild, captures_kernel_version_via_ecbuild, captures_build_arch_via_ecbuild & timestamps_in_h_generated_by_ecbuild). 94 reviews of Debian packages were added, 84 were updated and 34 were removed this month, adding to our knowledge about identified issues.
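Several of these issues concern embedded timestamps. A common remedy is to honour the SOURCE_DATE_EPOCH environment variable defined by the Reproducible Builds project instead of reading the wall clock. The following is a minimal sketch for a hypothetical header generator; the function names and header format are illustrative and not taken from survex, flit or ecbuild.

```python
import os
import time


def build_timestamp(environ=os.environ) -> int:
    """Prefer SOURCE_DATE_EPOCH over the current time when it is set."""
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)
    return int(time.time())


def render_header(environ=os.environ) -> str:
    """Render a generated-file banner using UTC so the timezone cannot leak in."""
    stamp = time.gmtime(build_timestamp(environ))
    return "/* generated " + time.strftime("%Y-%m-%dT%H:%M:%SZ", stamp) + " */"


# With the variable pinned, repeated builds emit the same banner.
pinned = {"SOURCE_DATE_EPOCH": "1609459200"}
print(render_header(pinned))
```

Formatting in UTC matters as much as pinning the value, since a local-time rendering would still vary with the build host's timezone.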

Vagrant Cascadian made a large number of uploads to Debian to fix a number of reproducibility issues in packages that do not have an owner, including a2ps (4.14-6), autoconf (2.69-13 & 2.69-14), calife (3.0.1-6), coinor-symphony (5.6.16+repack1-3), epm (4.2-9 & 4.2-10), grap (1.45-4), hpanel (0.3.2-7), libcommoncpp2 (1.8.1-9 & 1.8.1-10), libdigidoc (3.10.5-2), libnss-ldap (265-6), lprng (3.8.B-5), magicfilter (1.2-66), massif-visualizer (0.7.0-2), milter-greylist (4.6.2-2), minlog (4.0.99.20100221-7), mp3blaster (3.2.6-2), nis (3.17.1-6 & 3.17.1-8), spamassassin-heatu (3.02+20101108-4), webauth (4.7.0-8) & wily (0.13.41-9 & 0.13.41-10).

Similarly, Chris Lamb made two uploads of the sendfile package.

NixOS

NixOS made good progress towards having all packages required to build the minimal installation ISO image reproducible. Remaining work includes the python, isl and gcc9 packages and removing the use of Python 2.x in asciidoc.

Elsewhere in NixOS, Adam Hoese of tweag.io also announced trustix, an NGI Zero PET-funded initiative to provide infrastructure for sharing and enforcing reproducibility results for Nix-based systems.

Finally, the following NixOS-specific changes were made:

  • Arnout Engelen:

    • compress-man-pages (create symlinks deterministically)
    • git (reproducible manual)
    • libseccomp (filesystem dates and ordering)
    • linux (omit build ID)
    • pytest (removed unreproducible test artifacts from the pytest package)
    • rustc (generate deterministic manifest)
    • setuptools (stable file ordering for sdist)
    • talloc (avoid Python 2.x build dependency)
  • Atemu:

    • linux (disable module signing)

Tools

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it also provides human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes, including releasing version 163 on multiple platforms:

  • New features & bug fixes:

    • Normalise ret to retq in objdump output in order to support multiple versions of GNU binutils. (#976760)
    • Don’t show any progress indicators when running zstd. (#226)
    • Correct the grammatical tense in the --debug log output.
  • Codebase improvements:

    • Update the debian/copyright file to match the copyright notices in the source tree. (#224)
    • Update various years across the codebase in .py copyright headers.
    • Rewrite the filter routine that post-processes the output from readelf(1).
    • Remove unnecessary PEP 263 encoding header lines; unnecessary after PEP 3120.
    • Use minimal instead of basic as a variable name to match the underlying package name.
    • Use pprint.pformat in the JSON comparator to serialise the differences from jsondiff.
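The ret/retq normalisation in the list above can be pictured as a small post-processing step on disassembly text, so that two binutils versions printing the same instruction differently do not produce a spurious diff. The sketch below is an illustration only and is not diffoscope's actual code; the function name and sample lines are assumptions.

```python
import re

# Match a bare `ret` mnemonic at the end of a line, leaving `retq` untouched.
_BARE_RET = re.compile(r"\bret(?=[ \t]*$)", re.MULTILINE)


def normalise_ret(disassembly: str) -> str:
    """Rewrite a trailing `ret` to `retq` so both spellings compare equal."""
    return _BARE_RET.sub("retq", disassembly)


# Hypothetical objdump-style lines for the same instruction byte (0xc3).
older_output = "  401000:\tc3\tretq\n"
newer_output = "  401000:\tc3\tret\n"

print(normalise_ret(older_output) == normalise_ret(newer_output))
```

Normalising towards one spelling keeps the comparison focused on real differences in the binaries rather than on the version of the tool used to inspect them.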

In addition, Jean-Romain Garnier added tests for OpenJDK 14.

In disorderfs (our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues), Chris Lamb added support for testing on Salsa’s CI system and added a quick benchmark. For the GNU Guix distribution, Vagrant Cascadian updated diffoscope to version 162.
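The idea behind disorderfs can be mimicked in a few lines: feed a build step directory listings in varying orders and check whether its output changes. This is a toy model under stated assumptions, not how disorderfs works internally (it operates at the FUSE layer); all names below are hypothetical.

```python
import random


def shuffled_listing(entries, seed):
    """Return the entries in a seed-dependent order, like an unsorted readdir."""
    rng = random.Random(seed)
    out = list(entries)
    rng.shuffle(out)
    return out


def fragile_manifest(entries):
    """A build step that trusts the order it is given."""
    return "\n".join(entries)


def robust_manifest(entries):
    """A build step that sorts before writing."""
    return "\n".join(sorted(entries))


files = ["main.o", "util.o", "io.o", "net.o", "log.o"]
listings = [shuffled_listing(files, seed) for seed in range(20)]

fragile_outputs = {fragile_manifest(listing) for listing in listings}
robust_outputs = {robust_manifest(listing) for listing in listings}

# More than one distinct output means the step depends on directory order.
print(len(fragile_outputs) > 1, len(robust_outputs) == 1)
```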

Homepage/documentation updates

There were a number of updates to the main Reproducible Builds website and documentation this month, including:

  • Calum McConnell fixed a broken link.

  • Chris Lamb applied a typo fix from Roland Clobus, fixed the draft detection logic (#28), added more academic articles to our list and corrected a number of grammar issues.

  • Holger Levsen documented the #reproducible-changes, #debian-reproducible-changes and #archlinux-reproducible IRC channels.

  • kpcyrd added rebuilderd and archlinux-repro to the list of tools.

Testing framework

The Reproducible Builds project operates a large Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, Holger Levsen made the following changes:

  • Debian-related changes:

    • Update code copy of debrebuild.
    • Add Debian sid sources.list entry on a node for test rebuilds.
    • Use focal instead of the (deprecated) eoan release for hosts running Ubuntu.
  • Jenkins administration:

    • Show update frequency on the Jenkins shell monitor.
    • In the Jenkins shutdown monitor, force precedence to find only log files.
    • Update /etc/init.d script from the latest jenkins package.
  • System health checks & notifications:

    • Detect database locks in the pacman Arch Linux package manager.
    • Detect hosts running in the ‘wrong future’.
    • Install the apt-utils package on all Debian-based hosts.
    • Use /citests/ as the landing page.
    • Update our debrebuild fork.

In addition, Mattia Rizzolo made some Debian-related changes, including refreshing the GnuPG key of our repository, dropping some unused code and skipping pbuilder/sdist updates on nodes that are not performing Debian-related rebuilds. Marcus Hoffmann updated his mailing list subscription status too.

Lastly, build node maintenance was also performed by Holger Levsen, Mattia Rizzolo and Vagrant Cascadian.

Upstream patches

The following patches were created this month:


If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. Alternatively, you can get in touch with us via:



