Welcome to the February 2020 report from the Reproducible Builds project.
One of the original promises of open source software is that distributed peer review and transparency of process result in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes.
The motivation behind the reproducible builds effort is to provide the ability to demonstrate these binaries originated from a particular, trusted, source release: if identical results are generated from a given source in all circumstances, reproducible builds provides the means for multiple third-parties to reach a consensus on whether a build was compromised via distributed checksum validation or some other scheme.
In this month’s report, we cover:
- Media coverage & upstream news — A new paper on reproducible containers, Ruby updates, etc.
- Distribution work — More work in Debian, openSUSE & friends.
- Software development — Updates and improvements to our tooling.
- Getting in touch — How to contribute, more venues for discussion.
If you are interested in contributing to the project, please visit our Contribute page on our website.
Media coverage & upstream news
Omar Navarro Leija, a PhD student at the University of Pennsylvania, published a paper entitled Reproducible Containers that describes in detail the workings of a new user-space container tool called DetTrace:
All computation that occurs inside a DetTrace container is a pure function of the initial filesystem state of the container. Reproducible containers can be used for a variety of purposes, including replication for fault-tolerance, reproducible software builds and reproducible data analytics. We use DetTrace to achieve, in an automatic fashion, reproducibility for 12,130 Debian package builds, containing over 800 million lines of code, as well as bioinformatics and machine learning workflows.
There was also considerable discussion on our mailing list regarding this research, and a presentation based on the paper will occur at the ASPLOS 2020 conference between March 16th and 20th in Lausanne, Switzerland.
The many virtues of Reproducible Builds were touted as benefits for software compliance in a debate at FOSDEM 2020 on whether the Careful Inventory of Licensing Bill of Materials Have Impact of FOSS License Compliance, which pitted Jeff McAffer and Carol Smith against Bradley Kuhn and Max Sills (~47 minutes in).
Nobuyoshi Nakada updated the canonical implementation of the Ruby programming language with a change such that filesystem globs (i.e. calls to list the contents of filesystem directories) will henceforth be sorted in ascending order. Without this change, the underlying nondeterministic ordering of the filesystem is exposed to the language, which often results in an unreproducible build.
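The same class of problem exists in other languages. As a minimal sketch in Python (not the Ruby change itself), the hypothetical helper `list_sources` below sorts the result of `glob.glob()`, which otherwise returns paths in arbitrary order:

```python
import glob
import os
import tempfile


def list_sources(directory, pattern="*.txt"):
    """Return matching paths in a stable, sorted order.

    glob.glob() makes no ordering guarantee, so the result is sorted
    explicitly before anything downstream consumes it.
    """
    return sorted(glob.glob(os.path.join(directory, pattern)))


# Small demonstration using a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.txt", "c.txt", "a.txt"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(name)
    names = [os.path.basename(path) for path in list_sources(tmp)]
    print(names)  # ['a.txt', 'b.txt', 'c.txt']
```

Sorting at the point where the directory is read keeps every later step (archiving, linking, code generation) independent of the filesystem's internal ordering.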
Vagrant Cascadian reported on our mailing list regarding a quick reproducible test for the GNU Guix distribution, which resulted in 81.9% of packages registering as reproducible in his installation:
    $ guix challenge --verbose --diff=diffoscope ...
    2,463 store items were analyzed:
      - 2,016 (81.9%) were identical
      - 37 (1.5%) differed
      - 410 (16.6%) were inconclusive
Jeremiah Orians announced on our mailing list the release of a number of tools related to cross-compilation such as mescc-tools-seed. This project attempts a full bootstrap of a cross-platform compiler for the C programming language (written in C itself) from hex, the ultimate goal being to demonstrate a fully-bootstrapped compiler from hex to the GNU Compiler Collection (GCC). This has many implications in and around Ken Thompson’s Trusting Trust attack outlined in Thompson’s 1983 Turing Award Lecture.
Finally, Reddit user tofflos posted to the /r/Java subreddit asking about how to achieve reproducible builds with Maven, and Chris Lamb noticed that the Linux kernel’s documentation about building it reproducibly is available on the kernel.org homepages in an attractive HTML format.
Chris Lamb created a merge request for the core debian-installer package to allow all arguments and options from sources.list files (such as “[check-valid-until=no]”, etc.) in order that we can test the reproducibility of the installer images on the Reproducible Builds project’s own testing infrastructure. (#13)
Thorsten Glaser followed up on a bug against the dpkg-source component, originally filed in late 2015, which claims that the build tool does not respect permissions when unpacking tarballs if the umask is set to
Matthew Garrett posted to the debian-devel mailing list on the topic of “Producing verifiable initramfs images” as part of a wider conversation on being able to trust the entire software stack on our computers.
59 reviews of Debian packages were added, 30 were updated and 42 were removed this month adding to our knowledge about identified issues. Many issue types were noticed and categorised by Chris Lamb, including:
- python-rpm-macros (do not save time-based .pyc files for tests)
- solfege (filesystem ordering issue sent upstream via email; package is orphaned upstream)
- DVDStyler (zip timestamps, submitted upstream)
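Timestamp issues in zip archives, such as the one noted for DVDStyler, are usually addressed by pinning each entry's metadata. The following is a minimal Python sketch of that general technique and not the patch that was actually submitted. The `FIXED_DATE_TIME` value and the `build_zip` helper are assumptions made for illustration:

```python
import hashlib
import io
import zipfile

# Hypothetical fixed timestamp; real builds often derive it from SOURCE_DATE_EPOCH.
FIXED_DATE_TIME = (2020, 2, 1, 0, 0, 0)


def build_zip(files):
    """Build a zip archive in memory with sorted entries and fixed metadata."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions
            archive.writestr(info, files[name])
    return buffer.getvalue()


files = {"b.txt": b"second", "a.txt": b"first"}
reordered = dict(reversed(list(files.items())))

digest_one = hashlib.sha256(build_zip(files)).hexdigest()
digest_two = hashlib.sha256(build_zip(reordered)).hexdigest()
print(digest_one == digest_two)  # True
```

Because both the entry order and the per-entry timestamps are fixed, two runs over the same inputs yield byte-identical archives regardless of when they are built.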
diffoscope is our in-depth and content-aware diff-like utility that can locate and diagnose reproducibility issues. It is run countless times a day on our testing infrastructure and is essential for identifying fixes and causes of nondeterministic behaviour.
Chris Lamb made the following changes this month, including uploading version 137 to Debian:
sng image utility appears to return with an exit code of 1 if there are even minor errors in the file. (#950806)
- Also extract .apk files extracted by
- No need to use str.format if we are just returning the string. […]
- Add generalised support for “ignoring” returncodes […] and move special-casing of returncodes in zip to use
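The idea of tolerating specific non-zero exit codes from external tools can be sketched in Python as follows. This is an illustration only and not diffoscope's actual interface: `run_tool` and its `valid_returncodes` parameter are hypothetical names, and the "tool" is a stand-in child process.

```python
import subprocess
import sys


def run_tool(command, valid_returncodes=frozenset({0})):
    """Run an external command, accepting any exit code in valid_returncodes.

    Hypothetical helper for illustration; not diffoscope's actual interface.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode not in valid_returncodes:
        raise RuntimeError(
            f"{command[0]} exited with unexpected code {result.returncode}"
        )
    return result.stdout


# Stand-in for a tool that prints useful output but exits with code 1.
noisy_tool = [sys.executable, "-c", "import sys; print('data'); sys.exit(1)"]
output = run_tool(noisy_tool, valid_returncodes={0, 1})
print(output.strip())  # data
```

Declaring the accepted codes per tool keeps the special cases in one place instead of scattering them through the calling code.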
disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues. This month, Vagrant Cascadian updated the Vcs-Git to specify the debian packaging branch. […]
reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, versions 0.7.14 were uploaded to Debian unstable by Holger Levsen after Vagrant Cascadian added support for GNU Guix […].
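The build-twice-and-compare approach can be illustrated with a much-simplified Python sketch. It is not how reprotest itself is implemented: the `build_digest` helper is hypothetical, the "build" is a stand-in command, and only two environment variables are varied here.

```python
import hashlib
import os
import subprocess
import sys


def build_digest(command, extra_env):
    """Run a build command under a modified environment and hash its stdout."""
    env = dict(os.environ, **extra_env)
    result = subprocess.run(command, env=env, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


# Stand-in "build" whose output depends only on its input, not the environment.
build = [sys.executable, "-c", "print('artifact built from fixed input')"]

first = build_digest(build, {"TZ": "UTC", "LANG": "C"})
second = build_digest(build, {"TZ": "Pacific/Kiritimati", "LANG": "fr_FR.UTF-8"})
print("reproducible" if first == second else "unreproducible")
```

A build that embedded the current time or locale-formatted strings in its output would produce differing digests under this comparison.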
Project documentation & website
There was more work performed on our documentation and website this month. Bernhard M. Wiedemann added a Java Gradle Build Tool snippet to the SOURCE_DATE_EPOCH documentation […] and normalised various terms to “unreproducible” […].
Chris Lamb added a Meson.build example […] and improved the CMake section […] of the SOURCE_DATE_EPOCH documentation, replaced “anyone can” with “anyone may” (as, well, not everyone has the resources, skills, time or funding to actually do what it refers to) […] and improved the pre-processing for our report generation […][…][…][…], etc.
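For readers unfamiliar with SOURCE_DATE_EPOCH, the usual pattern is to prefer the variable over the current time whenever a build needs a date. A minimal Python sketch follows. The `build_timestamp` helper name and the example value are assumptions for illustration, and setting the variable inside the script only simulates what a build system would normally export.

```python
import datetime
import os
import time


def build_timestamp():
    """Return the build date, preferring SOURCE_DATE_EPOCH when it is set."""
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.datetime.fromtimestamp(epoch, tz=datetime.timezone.utc)


# Example value only; a build system would normally export this variable.
os.environ["SOURCE_DATE_EPOCH"] = "1580515200"
stamp = build_timestamp()
print(stamp.strftime("%Y-%m-%d"))  # 2020-02-01
```

Formatting the result in UTC avoids reintroducing a dependency on the build machine's timezone.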
In addition, Holger Levsen updated our news page to improve the list of reports […], added an explicit mention of the weekly news time span […] and reverted sorting of news entries to have the latest on top […]. Mattia Rizzolo added Codethink as a non-fiscal sponsor […] and, lastly, Tianon Gravi added a Docker Images link underneath the “Debian” project on our “Projects” page […].
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
Bernhard M. Wiedemann (for the openSUSE distribution):
- #943956 re-opened against
- #950630 filed against
- #950936 filed against
- #950942 filed against
- #951357 filed against
- #951573 filed against
- #952493 filed against
- #952694 filed against
- #952762 filed against
- ccextractor — issues with MacOSX BSD date not understanding GNU date options
- python-django (always build the documentation in English)
- #943956 re-opened against
- #950410 filed against
- #950417 & 950416 & 950415 filed against
- #950419 filed against
- #950444 filed against
- #950585 filed against
- #950603 filed against
- #950606 filed against
- #950704 filed against
- #951031 filed against
- Configured an instance of David Bremner’s
.buildinfo files. As a result, we now know that Debian bullseye contains 4,557 source packages for the amd64 architecture without corresponding .buildinfo files and 25,668 source packages with
- Forward mails addressed to
- Temporarily revert using backports […][…] and use devscripts from buster-backports […].
- Update URL for PureOS package set. […]
- Deprioritise the scheduling of older packages in bullseye and unstable. […][…]
- Treat the bullseye distribution like unstable when scheduling the i386 architecture, to allow the latter to catch up a bit. […]
- Disable the last active Alpine builder too for now. […]
- Improve generated HTML output. […][…]
- Ignore Fedora jobs. […]
- Include links to failed and “unstable” jobs in HTML output. […]
- Start weighting job importance and give a lot more weight to nodes acting as a proxy for others. […][…][…]
- Add new job to calculate the overall system health for tests.reproducible-builds.org for usage with Jelle van der Waa’s Reproducible Builds status display. […]
In addition, Mattia Rizzolo added an Apache web server redirect for buildinfos.debian.net […] and reverted the reshuffling of arm64 architecture builders […]. The usual build node maintenance was performed by Holger Levsen, Mattia Rizzolo […][…] and Vagrant Cascadian.
Getting in touch
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
This month’s report was written by Bernhard M. Wiedemann, Chris Lamb and Holger Levsen. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.