Welcome to the April 2021 report from the Reproducible Builds project!
In these reports we try to outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. If you are interested in contributing to the project, please visit our Contribute page on our website.
A preprint of a paper by Chris Lamb and Stefano Zacchiroli (which will shortly appear in IEEE Software) has been made available on the arXiv.org service. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), the abstract of the paper contains the following:
We first define the problem, and then provide insight into the challenges of making real-world software build in a “reproducible” manner-this is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA). […]
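As a rough illustration of what “bit-for-bit identical” means in practice, the following minimal sketch compares two build artifacts by their SHA-256 digests. The function names and the choice of SHA-256 are assumptions made for this example; they are not taken from the paper or from any Reproducible Builds tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large artifacts need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproducible(first: Path, second: Path) -> bool:
    """Hypothetical helper: True when both artifacts have the same digest."""
    return sha256_of(first) == sha256_of(second)
```

Calling `is_reproducible(Path("build1/pkg.tar"), Path("build2/pkg.tar"))` (both paths hypothetical) would only say whether the two results match; locating the cause of a mismatch is the job of a tool such as diffoscope.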
Elsewhere on the internet, Igor Golovin on the Kaspersky security blog reported that APKPure, an alternative app store for Android apps, began to distribute “an advertising SDK from an unverified source [that] turned out to be malicious” […]. Elaborating further, Igor wrote that the malicious code had “much in common with the notorious Triada malware and can perform a range of actions: from displaying and clicking ads to signing up for paid subscriptions and downloading other malware”.
Closer to home, Jeremiah Orians wrote to our mailing list reporting that it is now possible to bootstrap the GCC compiler without using the pre-generated Bison grammar files, part of a broader attempt to provide a “reproducible, automatic [and] complete end-to-end bootstrap from a minimal number of binary seeds to a supported fully functioning operating system” […]. In addition, Richard Clobus started a thread on potential problems with the -Wl,--build-id=sha1 linker flag, which can later be used when analysing core dumps and tracebacks. According to the Red Hat Customer Portal:
Each executable or shared library built with Red Hat Enterprise Linux Server 6 or later is assigned a unique identification 160-bit SHA-1 string, generated as a checksum of selected parts of the binary. This allows two builds of the same program on the same host to always produce consistent build-ids and binary content. (emphasis added)
Lastly, Felix C. Stegerman reported on the latest release of apksigcopier, a tool to copy, extract and patch .apk signatures that is needed to facilitate reproducible builds on the F-Droid Android application store and elsewhere. Holger Levsen subsequently sponsored an upload to Debian.
An issue was discovered in Arch Linux regarding packages that were previously considered reproducible. After some investigation, it was determined that the build’s CFLAGS could vary between two previously ‘reproducible’ builds.
The cause was attributed to the fact that in Arch Linux, the devtools package determines the build configuration, but in the development branch it had been inadvertently copying the makepkg.conf file from the pacman package; the devtools version had been fixed in the recent release. This meant that whenever Arch Linux released a devtools package with updated CFLAGS (or similar), old packages could fail to build reproducibly, as they would be rebuilt in a different build environment.
To address this problem, Levente Polyak sent a patch to the pacman mailing list to include the version of devtools in the relevant BUILDINFO file. This means that the repro tool can now install the corresponding makepkg.conf file when attempting to validate a reproducible build.
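To make the idea concrete, here is a minimal sketch of how a verifier might read the recorded build tool version back out of a BUILDINFO-style file. It assumes a simple line-based `key = value` layout in which keys may repeat; the field names `buildtool` and `buildtoolver` and the sample values are assumptions for illustration only, and this is not the code used by repro.

```python
from typing import Dict, List, Optional

# Hypothetical sample; real files contain many more fields.
SAMPLE = """\
format = 2
pkgname = example
buildtool = devtools
buildtoolver = 20210202-3-any
installed = foo-1.0-1-x86_64
installed = bar-2.0-1-any
"""


def parse_buildinfo(text: str) -> Dict[str, List[str]]:
    """Parse 'key = value' lines; repeated keys accumulate into a list."""
    fields: Dict[str, List[str]] = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank or malformed lines
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def build_tool_version(fields: Dict[str, List[str]],
                       key: str = "buildtoolver") -> Optional[str]:
    """Return the recorded tool version, or None if the field is absent."""
    values = fields.get(key)
    return values[0] if values else None


fields = parse_buildinfo(SAMPLE)
print(fields["buildtool"][0], build_tool_version(fields))
```

With the version in hand, a rebuilder could install the matching tool package (and thus the matching makepkg.conf) before attempting the rebuild; older files without the field simply yield `None`.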
In Debian, Frédéric Pierret continued working on debian.notset.fr, a partial copy of the snapshot.debian.org “wayback machine” service for the Debian archive that is limited to the packages needed to rebuild the bullseye distribution on the amd64 architecture. This is to work around some perceived limitations of snapshot.debian.org. Since last month, the service covers from mid-2020 onwards, and a request was made to the Debian sysadmin team to obtain better access to snapshot.debian.org in order to further accelerate the initial seeding. In addition, the service now supports more endpoints in the API (full documentation), including a timestamp endpoint to track the sync in a machine-readable way.
Twenty-one reviews of Debian packages were performed, nine were updated and sixteen were removed this month, adding to our large taxonomy of identified issues. A number of issue types have been updated too, including the removal of the random_order_in_javahelper_substvars issue type […] and the addition of a new timestamps_in_pdf_generated_by_libreoffice toolchain issue by Chris Lamb […].
Lastly, Bernhard M. Wiedemann posted his monthly reproducible builds status report for the openSUSE distribution.
Bernhard M. Wiedemann:
- librsb (memory layout issue)
- Verified openSUSE Leap 15.3 and SLES-15-SP3 binaries, and submitted several reproducibility fixes.
- #971527 filed against
- #986877 filed against
- #971527 filed against
diffoscope is the Reproducible Builds project’s in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it also provides human-readable diffs from many kinds of binary formats. This month, Chris Lamb made a number of changes including releasing version 172 and version 173:
- Add support for showing annotations in PDF files. (#249)
- Move to the assert_diff helper in test_pdf.py. […]
In addition, Mattia Rizzolo attempted to make the testsuite pass with file(1) version 5.40 […] and Zachary T. Welch updated the __init__ magic method of the Difference class to demote the unified_diff argument to a Python ‘kwarg’ […].
Website and documentation
Quite a few changes were made to the main Reproducible Builds website and documentation this month, including:
- Highlight our mailing list on the Contribute page. […]
- Add a noun (and drop an unnecessary full-stop) on the landing page. […][…]
- Correct a reference to the date metadata attribute on reports, restoring the display of months on the homepage. […]
- Correct a typo of “instalment” within a previous news entry. […]
- Add a conspicuous “draft” banner to unpublished blog posts in order to match the report draft banner. […]
The Reproducible Builds project operates a Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, the following changes were made:
README to reflect that Debian buster is now the ‘stable’ distribution. […]
- Support fully-qualified domain names in the powercycle script for the
- Improve the handling of node names (etc.) for
- Improve the detection and classification of packages maintained by the Debian accessibility team. […][…]
- Count the number of configured armhf nodes correctly by ignoring comments at the end of line. […]
- Fix a regular expression in automatic log-parsing routines. […]
- Add some new armhf architecture build nodes,
armhf build jobs to only use active nodes. […]
- Add a health check for broken OpenSSH ports on
- Mark which armhf architecture jobs are not systematically varying 32-bit and 64-bit kernels. […]
- Disable creation of Debian stretch build tarballs and update the README file to mention bullseye instead. […]
- Add some new
Finally, build node maintenance was performed by Holger Levsen […][…][…][…], Mattia Rizzolo […] and Vagrant Cascadian […] […][…].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
Twitter (@ReproBuilds) & Mastodon (@firstname.lastname@example.org)