Welcome to the latest report from the Reproducible Builds project for June 2021. In these reports we outline the most important things that have been happening in the world of reproducible builds in the past month. As ever, if you are interested in contributing to the project, please visit the Contribute page on our website.
The specter of more events like the SolarWinds supply-chain attacks is something that concerns many in our communities—and beyond. Linux distributions provide a supply chain that obviously needs to be protected against attackers injecting malicious code into the update stream. This problem recently came up on the Fedora devel mailing list, which led to a discussion covering a few different topics. For the most part, Fedora users are protected against such attacks, which is not to say there is nothing more to be done, of course.
The Google Security Blog introduced a new framework called “Supply chain Levels for Software Artifacts”, or SLSA (to be pronounced as ‘salsa’). In particular, SLSA level 4 (“currently the highest level”) not only requires a two-person review of all changes but also “a hermetic, reproducible build process” due to its “many auditability and reliability benefits”. Whilst this is a highly welcome inclusion in Google’s requirements, equating reproducible builds with only the highest level of supply-chain security in their list might lead others to conclude that only the most secure systems can benefit from reproducible builds, whereas the Reproducible Builds project believes that many more users, if not all, can do so.
Many media outlets (including The Verge, etc.) reported on how the United States’ FBI operated a messaging app as a ‘honeypot trap’ for a long period of time, leading to hundreds of arrests. According to the UK’s Financial Times, court documents describe how the FBI persuaded a software developer facing prison to allow the FBI to commandeer the app and to introduce it to suspected criminals:
Over the course of the next three years, the operation was able to inspect about 27m messages over 11,800 devices as ANOM gained popularity in criminal circles globally, pushed by the developer but also a network of crime “influencers” — experts in encrypted phones who encourage others to use such devices.
As the Financial Times reports, “it is unclear what exactly prompted the FBI and others to reveal the operation”, although others have suggested it may result from legal limits in timeframes for intercepting communications. The FBI’s operation raises ethical concerns which overlap with beliefs held by proponents of Reproducible Builds, not least of all because even the most unimpeachable actions by actors may result in the incidental surveillance of innocent people.
In similar legal news, Susan Landau posted to the Lawfare blog about the potential dangers posed by evidentiary software. In particular, she discusses concerns that proprietary software may be fundamentally incompatible with the right of defendants to know the nature of the evidence against them; this right is explicitly enshrined, for instance, in the Sixth Amendment of the United States Constitution. However,
At the time of our writing the article on the use of software as evidence, there was no overriding requirement that [United States] law enforcement provide a defendant with the code so that they might examine it themselves.
It is relevant here because if the inability to consult the relevant source code does violate such rights, it may follow that a secure and reproducible build process will also be required. After all, it would be the output of the binary versions of the source code that is used to convict suspects, not the source code itself. As Susan points out:
Mistakes happen with software and sometimes the only way to find errors is to study the code itself—both of which have important implications for courtroom use of software programs.
The Reproducible Builds project restarted their IRC meetings this month. The meetings take place on the #reproducible-builds channel on the OFTC IRC network; the log of the meeting on 29th June is now available online, and the next meeting is due to take place on July 27th at 15:00 UTC (agenda).
Ars Technica are reporting that “counterfeit” packages in PyPI, the official Python package repository, contained secret code that installed cryptomining software on infected machines: “So-called typosquatting attacks succeed when targets accidentally mistype a name such as typing mplatlib or maratlib instead of the legitimate and popular package, matplotlib”. The article is at pains to point out that PyPI is not abused any more than other repositories are:
Last year, packages downloaded thousands of times from RubyGems installed malware that attempted to intercept bitcoin payments. Two years before that, someone backdoored a 2-million-user code library hosted in NPM. Sonatype has tracked more than 12,000 malicious NPM packages since 2019.
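As a rough illustration of why such names are easy to confuse, the sketch below flags a requested package name that is close to, but not identical to, a well-known one. It is a simple heuristic built on Python's standard `difflib` module and is not how PyPI or any repository actually polices uploads; the `POPULAR_PACKAGES` tuple, the `possible_typosquat` function and the 0.6 similarity cutoff are all hypothetical choices made for this example.

```python
import difflib
from typing import Optional, Sequence

# Illustrative allow-list only; a real check would need a far larger list.
POPULAR_PACKAGES = ("matplotlib", "numpy", "pandas", "requests")


def possible_typosquat(
    name: str,
    known: Sequence[str] = POPULAR_PACKAGES,
    cutoff: float = 0.6,  # assumed threshold, tune for your own data
) -> Optional[str]:
    """Return the well-known package that `name` resembles, or None."""
    if name in known:
        # An exact match is the legitimate package itself.
        return None
    # get_close_matches ranks candidates by SequenceMatcher similarity.
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for candidate in ("mplatlib", "maratlib", "matplotlib", "flask"):
        print(candidate, "->", possible_typosquat(candidate))
```

A check like this would produce false positives for legitimately similar names, so it is better suited to prompting a human to look twice than to blocking an install outright.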
Ariadne Conill published a detailed blog post this month describing their work on security issues and concerns in the Alpine Linux distribution. In particular, Ariadne included an interesting section on an effort “to prove the reproducibility of Alpine package builds”:
To this end, I hope to have the Alpine 3.15 build fully reproducible. This will require some changes to abuild so that it produces buildinfo files, as well as a rebuilder backend. We plan to use the same buildinfo format as Arch Linux, and will likely adapt some of the other reproducible builds work Arch has done to Alpine.
Ariadne mentioned plans to have a meeting and a sprint during July, to be organised in and around the #alpine-reproducible channel on the OFTC IRC network, and later posted a round-up of security initiatives in Alpine during June which mentions, amongst many other things, the ability to demonstrate reproducible Alpine install images for the Raspberry Pi.
For openSUSE, Bernhard M. Wiedemann posted his monthly reproducible builds status report.
The NixOS Linux distribution pulled off a technical and publicity coup this month by announcing that the ISO_minimal.x86_64-Linux image is 100% reproducible. The announcement was widely discussed on Hacker News, where the article has received in excess of 200 comments.
In early June, Nilesh Patra asked for help making Debian’s brian package build reproducibly. Felix C. Stegerman proposed two patches which seem to have fixed the remaining issues (#989693). These were submitted upstream, where they were shortly merged.
Felix C. Stegerman announced the release of v1.0.0 of apksigcopier, a tool to copy, extract and patch .apk signatures needed to facilitate reproducible builds on the F-Droid Android application store. Holger Levsen subsequently sponsored an upload to Debian. Felix C. Stegerman also reported that Android builds are sometimes not reproducible due to a bug in Android’s
Elsewhere in F-Droid, the Swiss COVID Certificate mobile app (which uses reproducible builds) has been added to F-Droid — the F-Droid developers have mentioned that the upstream developers have been very helpful in making this happen. Relatedly, the Android version of the Electrum Bitcoin Wallet has been made reproducible.
Lastly, Hannes Mehnert announced the launch of the reproducible MirageOS build infrastructure, together with where to obtain ‘unikernels’: “To provide a high level of assurance and trust, if you distribute binaries in 2021, you should have a recipe how they can be reproduced in a bit-by-bit identical way.”
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
Bernhard M. Wiedemann:
- deepdiff (report a ‘build failure in 2022’ issue)
- dulwich (build fails in the future due to expired GPG key)
- gtksourceview4 (report that build fails in uniprocessor machine)
- ar(1) call needs to be deterministic)
- json-lib (report a date / epoch issue)
- kernel-default (two sorting and random-related issues)
- lepton (drop call to
- lighttpd1 (build fails in 2036)
- openvas-smb (date and Portable Executable timestamp issue)
- python-MapProxy (report a ‘build fails on uniprocessor machine’ issue)
- python-gcsfs (report a ‘build fails on uniprocessor machine’ issue)
- #989963 filed against
- #989965 filed against
- #989966 filed against
- #990084 filed against
- #990246, #990247 and #990248 filed against
- #990253 filed against
- #990254 filed against
- #990300 filed against
- #990323 filed against
- #990327 filed against
- #990329 filed against
- #990332 filed against
- #990338 filed against
- #990339 filed against
- #989963 filed against
Separate to this, Hans-Christoph Steiner noted there is a reproducibility-related bug in Python’s standard zipfile library. This problem makes it hard to create reproducible .zip files. In particular, Hans would like to have more input from Python people, since it is not clear how best to resolve the problem.
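For context, the sketch below shows the kind of manual pinning that is commonly needed today to get byte-identical archives out of `zipfile`: entries are written in sorted order and each one carries a fixed timestamp, permission bits and creating-system field. It is a general workaround under stated assumptions and does not necessarily address the specific bug Hans reported; the `build_reproducible_zip` helper and the `FIXED_DATE_TIME` constant are names invented for this example, and the compressed bytes may still vary between zlib versions.

```python
import io
import zipfile
from typing import Mapping

# Assumed fixed timestamp: the earliest date the ZIP format can represent.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def build_reproducible_zip(files: Mapping[str, bytes]) -> bytes:
    """Serialise `files` (archive name -> content) into a deterministic ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort names so the result does not depend on insertion order.
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            # Pin fields that otherwise depend on the host platform.
            info.create_system = 3  # 3 means "Unix"
            info.external_attr = 0o100644 << 16  # regular file, mode 0644
            archive.writestr(info, files[name])
    return buffer.getvalue()


if __name__ == "__main__":
    first = build_reproducible_zip({"b.txt": b"two", "a.txt": b"one"})
    second = build_reproducible_zip({"a.txt": b"one", "b.txt": b"two"})
    print("identical:", first == second)
```

Having to construct every `ZipInfo` by hand like this is exactly the sort of friction that makes reproducible archives harder to produce than they need to be.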
diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it provides human-readable diffs from many kinds of binary formats.
This month, Chris Lamb made a number of changes, including releasing version 177. In addition, Chris updated the try.diffoscope.org service to reflect that Bytemark were acquired by the Iomart Group. […]
- Overhaul the Mach-O executable file comparator. […][…][…][…][…]
- Implement tests for the Mach-O comparator. […][…][…]
- Switch to new argument format for the LLVM compiler. […]
- test_libmix_differences in testsuite for the ELF format. […][…]
- Improve macOS compatibility for the Mach-O comparator. […]
- Add llvm-objdump to the internal EXTERNAL_TOOLS data structure. […]
Website and documentation
A number of changes were made to the main Reproducible Builds website and documentation this month, including:
- Use an ellipsis […] and drop a full stop […] to clarify ‘more items’ links.
- Update the link and logo to Google Open Source Security Team. […]
- Reduce the amount of bold text on the homepage. […]
- Document the non-reproducibility arising from abbreviated Git hashes depending on the number of total objects in a Git repository. […]
- Initial stab at building and comparing Debian Live images. […]
- Run the lb build Debian Live command with
- Use safer and more common rm -rf syntax in/around Debian Live images. […]
- Sync build results of Live images to our Jenkins instance. […]
- Create a Debian unstable schroot for running diffoscope on the osuosl173 node so it can be used to test Debian Live images. […]
- Cope with the Tails build manifests now only containing binary package names. […]
- Do not incorrectly detect diskspace issues on OpenSSL builds. […]
- Delete the
Automatic node health check improvements:
Misc development news
Here at LumoSQL we do repeated runs testing SQLite of various versions and configurations, storing the results in an SQLite database. Here is an example of the kind of variation that justifies what some have called our ‘too-fussy’ test suite, a microcode update that changes behaviour from one day to another.
Finally, in last month’s report we wrote about Paul Spooren proposing a patch for the BusyBox suite of UNIX utilities so that it uses SOURCE_DATE_EPOCH for build timestamps if available. This was merged during June by Denys Vlasenko.
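The general pattern behind such patches can be sketched as follows: if the `SOURCE_DATE_EPOCH` environment variable is set, use it as the build timestamp, and only fall back to the current time otherwise. This is a Python illustration of the convention and not the BusyBox change itself (BusyBox is written in C); the `build_timestamp` function name is invented for this example.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp() -> datetime:
    """Return the timestamp to embed in build output, in UTC."""
    # Prefer the externally supplied epoch so rebuilds embed the same value;
    # fall back to the wall clock only when the variable is not set.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


if __name__ == "__main__":
    print("Built on", build_timestamp().strftime("%Y-%m-%d %H:%M:%S UTC"))
```

Formatting the result in UTC matters as much as the epoch value, since a timestamp rendered in the builder's local timezone would reintroduce variation between machines.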
If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can also get in touch with us via: