Welcome to the May 2019 report from the Reproducible Builds project! In our reports we outline the most important things which we have been up to in and around the world of reproducible builds & secure toolchains over the past month.
As a quick recap, whilst anyone can inspect the source code of free software for malicious flaws, almost all software is distributed to end users pre-compiled. The motivation behind the reproducible builds effort is to ensure no malicious flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing third parties to come to a consensus on whether a build was compromised.
In this month’s report, we will cover:
- Media coverage — More supply chain attacks, Reproducible Builds at conferences, etc.
- Upstream news — Mozilla updates their add-on policy, etc.
- Distribution work — Debian Installer progress, openSUSE updates.
- Software development — A try.diffoscope.org rewrite, more upstream patches, etc.
- Misc news — From our mailing list, etc.
- Getting in touch — How to contribute, contact details, etc.
If you are interested in contributing to our project, please visit our Contribute page on our website.
Media coverage
- Andy Greenberg reported on Wired about the “mysterious hacker group” Barium, detailing a single group of malicious actors who appear responsible for a variety of supply chain attacks on CCleaner, Asus and more, planting backdoors on & gaining access to millions of end-user machines.
- The work of Chris Lamb in/around Debian’s Reproducible Builds effort was awarded a Google Open Source Peer Bonus award, a program that aims to support the ecosystem and sustainability of free software by recognising developers for their contributions to open source projects.
- Kushal Das presented at PyCon 2019 on building reproducible Python applications for secured environments. Here, Kushal argues that validating the dependencies of a project is as critical as validating the project’s own source code, referring to incidents where actors were able to steal bitcoins using a popular library. His talk uses the SecureDrop client application for journalists as an example project to see how to tackle the more general problem.
- GitHub announced a package registry feature which “suggest but alas not guarantee” a strong link between the Git repository and the published packages, highlighting the need for Reproducible Builds in this area.
- Andrew Martin published the slides for his talk entitled Rootless, Reproducible and Hermetic: Secure Container Build Showdown, which he gave at KubeCon 2019.
Upstream news
The IPFS “Package Managers Special Interest Group” is gathering research around package management, much of which is relevant to the Reproducible Builds effort.
Atharva Lele plans to work on reproducible builds for the Buildroot embedded Linux project as part of Google Summer of Code, ensuring that two instances of Buildroot running with the same configuration for the same device yield the same result.
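To make "yield the same result" concrete, here is a minimal sketch of how two build output trees could be compared file by file. It is not Buildroot tooling: the function names and the idea of pointing it at two output directories are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differing_files(first, second):
    """Return the sorted relative paths that are missing from one tree or differ in content."""
    a, b = tree_digests(first), tree_digests(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `differing_files` means the two trees are bit-for-bit identical; any other result is a starting point for a closer look with a tool such as diffoscope.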
Mozilla’s latest update to the Firefox add-on policy now dictates that add-ons may contain “transpiled, minified or otherwise machine-generated code” but Mozilla needs to review a copy of the human-readable source code. The author must provide this information to Mozilla during submission along with instructions on how to reproduce the build.
Distribution work
Bernhard M. Wiedemann posted his monthly Reproducible Builds status update for the openSUSE distribution.
Holger Levsen filed a wishlist request asking that Debian’s .buildinfo build environment specification documents from the Debian Long Term Support (LTS) project also be distributed by the build/archive infrastructure so that the reproducibility status of these security packages can be validated.
There was yet more progress towards making the Debian Installer images reproducible. Following on from last month, Chris Lamb performed some further testing of the generated images and requested a status update, which resulted in a call for testing the possible removal of a now-obsolete workaround that is hindering progress.
68 reviews of Debian packages were added, 30 were updated and 11 were removed this month, adding to our knowledge about identified issues. Chris Lamb discovered, identified and triaged two new issue types, the first identifying randomness in Fontconfig .uuid files […] and another randomness_in_output_from_perl_deparse.
Finally, GNU Guix announced its 1.0.0 release.
Software development
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream wherever possible. This month, we wrote a large number of such patches, including:
- Arnout Engelen authored a pull request to make the binary of the Notion window manager reproducible.
- Bernhard M. Wiedemann:
  - dvdstyler (.zip ctime)
  - fs-uae (already filed upstream; zip order, date/mtime)
  - gettext-runtime (Use the SOURCE_DATE_EPOCH environment variable)
  - gnome-builder (Drop environment.pickle file)
  - mrrescue (zip -X modification time)
  - mvapich2 (Sort readdir(2) call, already filed upstream)
  - nulloy (.zip timestamps)
  - osc (Dependency bug hindering openSUSE reproducible builds)
  - pithos (make .pyc files not vary from architecture)
  - plata-theme (zip mtime)
  - python-Fabric3 (Workaround FTBFS -j1)
  - python-keystonemiddleware (Make tests pass in 2020)
  - python-nbconvert (Fails to build in single-process, -j1, mode)
  - python-ovirt-engine-sdk (Sort input file list)
  - python-requests-toolbelt (Does not build in the year 2021)
  - python-rjsmin (Disable profiling)
  - python3-saml (Does not build in the year 2020)
  - zip (Add SOURCE_DATE_EPOCH clamping of modification times; also submitted upstream and in distropatches)
- Chris Lamb:
Finally, Vagrant Cascadian submitted a patch for the u-boot boot loader fixing reproducibility when building a new type of compressed image. This was subsequently merged in version 2019.07-rc2.
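Several of the patches listed above deal with the same two sources of variation in archives: the order in which files are added, and their modification times. The following is a minimal sketch of both techniques using Python's standard zipfile module. It is not taken from any of the patches; the function name, the fixed permission bits and the choice to pass the epoch as a parameter are assumptions for illustration only.

```python
import time
import zipfile
from pathlib import Path

# The zip format cannot represent timestamps before 1980.
ZIP_MIN_DATE = (1980, 1, 1, 0, 0, 0)


def write_deterministic_zip(src_dir, zip_path, source_date_epoch):
    """Archive src_dir with sorted entries and mtimes clamped to source_date_epoch."""
    src_dir = Path(src_dir)
    # Sort the file list: raw directory iteration order depends on the filesystem.
    files = sorted(path for path in src_dir.rglob("*") if path.is_file())
    with zipfile.ZipFile(zip_path, "w") as archive:
        for path in files:
            # Clamp: keep older mtimes, but never record one later than the epoch.
            mtime = min(int(path.stat().st_mtime), source_date_epoch)
            date_time = max(time.gmtime(mtime)[:6], ZIP_MIN_DATE)
            info = zipfile.ZipInfo(path.relative_to(src_dir).as_posix(), date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3  # record "Unix" regardless of the build host
            info.external_attr = 0o100644 << 16  # fixed mode instead of on-disk mode
            archive.writestr(info, path.read_bytes())
```

Clamping, rather than overwriting every timestamp, keeps the real modification time of files that are older than the epoch while still hiding the time of the build itself.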
diffoscope
diffoscope is our in-depth “diff-on-steroids” utility which helps us diagnose reproducibility issues in packages. It does not define reproducibility; instead, it provides helpful, human-readable guidance for packages that are not reproducible, rather than relying on essentially-useless diffs.
- Chris Lamb:
  - Support the latest PyPI package repository upload requirements by using real reStructuredText comments instead of the raw directive […] and by stripping out manpage-only parts of the README rather than using the only directive […].
  - Fix execution of symbolic links that point to the bin/diffoscope entry point in a checked-out version of our Git repository by fully resolving the location as part of dynamically calculating Python’s module include path. […]
  - Add a Dockerfile […] with various subsequent fixups […][…][…].
  - Published the resulting Docker image in diffoscope’s container registry and updated the diffoscope homepage to provide “quick start” instructions on how to use diffoscope via this image.
- Mattia Rizzolo:
  - Uploaded version 115 to Debian experimental.
  - Adjust various build and test-dependencies, including specifying the ffmpeg video encoding tool/library and the Black code formatter […] in the build-dependencies […] and reinstating the oggvideotools and procyon-decompiler as test dependencies, now that they are no longer buggy […], etc.
  - Make the Debian autopkgtests not fail when a limited subset of “required tools” are temporarily unavailable. […][…][…]
In addition, Santiago Torres altered the behaviour of the tests to ensure compatibility with various versions of file(1) […] and Vagrant Cascadian added support for various external tools in GNU Guix […] and updated the version of diffoscope in that distribution […].
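The symbolic link fix mentioned in the list above rests on a general technique: resolve the entry point to its real location before deriving the module include path from it. A minimal sketch of that idea follows. It is not diffoscope's actual code; the function name and the assumed layout (an entry point living in a bin/ directory directly under the checkout root) are hypothetical.

```python
import os
import sys


def add_checkout_to_path(entry_point):
    """Prepend the checkout root (the parent of bin/) to sys.path and return it."""
    # realpath() follows symbolic links, so a link elsewhere on the system
    # resolves to <checkout>/bin/<script> rather than to the link's own directory.
    real = os.path.realpath(entry_point)
    root = os.path.dirname(os.path.dirname(real))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Without the realpath() call, the include path would be derived from the directory holding the symbolic link, and the package in the checkout would not be found.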
try.diffoscope.org
Chris Lamb made a large number of changes to the web-based (“no installation required”) version of the diffoscope tool, try.diffoscope.org:
- Ported the entire site to Python 3 and Django 2.x as Python 2.x is due for deprecation. This required updates to a huge number of parts around the site including but not limited to completely reconfiguring and integrating the Celery queue processor, all the string formatting, etc.
- Moved to using the published/public Docker image to execute builds instead of rolling our own container.
- Updated and upgraded the underlying operating system to the Debian stable distribution.
- Moved the canonical Git repository from GitHub to the Reproducible Builds group on salsa.debian.org. This required moving from Travis CI to GitLab’s own continuous integration (CI) support and working around the aggressive firewall (exclusively outgoing ports 80/443) applied to the Salsa-based CI runners.
- Avoid having to update the Let’s Encrypt-provided SSL certificate manually every 90 days by moving to using Certbot in --auto mode.
Test framework
We operate a comprehensive Jenkins-based testing framework that powers tests.reproducible-builds.org. The following changes were made in the last month:
- Holger Levsen made the following Debian-related changes:
  - Reduce the number of cron(8) mails for synchronising .buildinfo files from eight to one per day. […]
  - Run the rsync2buildinfos.debian.net script every other hour now that it just produces one mail per day. […][…]
  - Execute the package scheduler every 2 hours (instead of 3). […]
  - Switch the Codethink and OSUOSL nodes to use our updated email relay system. […][…]
  - Deal with the (rare) cases of .buildinfo files with the same name. […][…]
  - Save and mail the package scheduler results once a day instead of mailing ~8 times a day. […]
- In addition, Holger Levsen made the following distribution-agnostic changes:
- Mattia Rizzolo:
  - Use a special code so that remote builds can abort themselves by passing back the command to the “master”. […][…][…][…]
  - Fix a pattern matching bug to ensure all “zombie” processes are found. […]
  - flake8 the chroot-installation.yaml.py file. […]
  - Set a known HTTP User Agent for Git, so that the server can recognise us. […]
  - Allow network access for the debian-installer-netboot-images Debian package. […]
Finally, Vagrant Cascadian removed the deprecated --buildinfo-id from the pbuilder(8) configuration […], and Holger Levsen […][…][…][…][…][…], Mattia Rizzolo […] and Vagrant Cascadian all performed a large amount of build node maintenance, system & Jenkins administration.
Project website
Chris Lamb added various fixes for larger/smaller screens […], added a logo suitable for printing physical pin badges […] and refreshed the opening copy text on our SOURCE_DATE_EPOCH page.
Bernhard M. Wiedemann then documented a more concise C code example for parsing the SOURCE_DATE_EPOCH environment variable […][…] and Holger Levsen added a link to a specific bug blocking progress in openSUSE to our Who is involved? page […].
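For readers who have not used the variable before, the general pattern for consuming SOURCE_DATE_EPOCH is sketched below in Python. This is not the C example documented on the website; the function names are hypothetical, and falling back to the current time when the variable is unset is one common convention rather than a requirement.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return SOURCE_DATE_EPOCH as an integer, or the current time if it is unset."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        # No reproducible timestamp requested: fall back to "now".
        return int(time.time())
    # Accept only plain ASCII decimal digits (seconds since the Unix epoch).
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"SOURCE_DATE_EPOCH is not a decimal integer: {raw!r}")
    return int(raw)


def build_date(environ=os.environ):
    """Return the build timestamp as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(build_timestamp(environ), tz=timezone.utc)
```

Formatting the date in UTC matters as much as the timestamp itself: a date rendered in the builder's local timezone would reintroduce variation between machines.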
Misc news
- On our mailing list this month Lars Wirzenius asked various questions about reproducible builds and their bearing on building a distributed continuous integration system which received many replies (view thread index).
- The server powering lists.reproducible-builds.org changed home. Thanks to potager.org for hosting us all this time and many thanks to Profitbricks for hosting our new mail server as well as all the other nodes over the years.
- Mo Zhou wrote a detailed policy for deep learning software for the Debian distribution which touches on the reproducibility of data models.
Lastly, Sam Hartman, the current Debian Project Leader, wrote on the debian-devel mailing list:
The reproducible builds world has gotten a lot further with bit-for-bit identical builds than I ever imagined they would. […]
Thanks, Sam!
Getting in touch
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter: @ReproBuilds
- Mailing list: rb-general@lists.reproducible-builds.org
This month’s report was written by Arnout Engelen, Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Mattia Rizzolo and Vagrant Cascadian & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.