Welcome to the first report for 2023 from the Reproducible Builds project!
In these reports we try to outline the most important things that we have been up to over the past month, as well as the most important things in/around the community. As a quick recap, the motivation behind the reproducible builds effort is to ensure no malicious flaws can be deliberately introduced during compilation and distribution of the software that we run on our devices. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.
In a curious turn of events, GitHub first announced this month that the checksums of various Git archives may be subject to change, specifically because:
… the default compression for Git archives has recently changed. As result, archives downloaded from GitHub may have different checksums even though the contents are completely unchanged.
This change (which was brought up on our mailing list last October) would have had quite wide-ranging implications for anyone wishing to validate and verify downloaded archives using cryptographic signatures. However, GitHub reversed this decision, updating their original announcement with a message that “We are reverting this change for now. More details to follow.” It appears that this was informed in part by an in-depth discussion in the GitHub Community issue tracker.
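To illustrate why a compression change alone alters checksums, here is a minimal Python sketch. It uses two gzip compression levels as a stand-in for "a different compressor configuration"; it is not a reproduction of GitHub's actual change, and the sample payload is invented for the example.

```python
import gzip
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical archive payload; any byte string would do.
contents = b"identical source tree\n" * 1000

# Compress the same bytes twice, changing only the compression level.
# mtime=0 keeps the gzip header timestamp from adding further variation.
fast = gzip.compress(contents, compresslevel=1, mtime=0)
best = gzip.compress(contents, compresslevel=9, mtime=0)

archives_match = sha256_hex(fast) == sha256_hex(best)
contents_match = gzip.decompress(fast) == gzip.decompress(best) == contents

print("archive checksums match:", archives_match)
print("decompressed contents match:", contents_match)
```

Anyone pinning the checksum of the compressed file, rather than of its decompressed contents, would see a mismatch in this situation even though nothing in the source changed.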
The Bundesamt für Sicherheit in der Informationstechnik (BSI) (trans: ‘The Federal Office for Information Security’) is the agency in charge of managing computer and communication security for the German federal government. They recently produced a report that touches on attacks on software supply-chains (Supply-Chain-Angriff). (German PDF)
Noak Jönsson has written an interesting paper entitled The State of Software Diversity in the Software Supply Chain of Ethereum Clients. As the paper outlines:
In this report, the software supply chains of the most popular Ethereum clients are cataloged and analyzed. The dependency graphs of Ethereum clients developed in Go, Rust, and Java, are studied. These client are Geth, Prysm, OpenEthereum, Lighthouse, Besu, and Teku. To do so, their dependency graphs are transformed into a unified format. Quantitative metrics are used to depict the software supply chain of the blockchain. The results show a clear difference in the size of the software supply chain required for the execution layer and consensus layer of Ethereum.
Yongkui Han posted to our mailing list discussing making reproducible builds & GitBOM work together without gitBOM-ID embedding. GitBOM (now renamed to OmniBOR) is a project to “enable automatic, verifiable artifact resolution across today’s diverse software supply-chains” […]. In addition, Fabian Keil wrote to us asking whether anyone in the community would be at Chemnitz Linux Days 2023, which is due to take place on 11th and 12th March (event info).
Separate to this, Akihiro Suda posted to our mailing list just after the end of the month with a status report of bit-for-bit reproducible Docker/OCI images. As Akihiro mentions in their post, they will be giving a talk at FOSDEM in the ‘Containers’ devroom titled Bit-for-bit reproducible builds with Dockerfile, adding that “my talk will also mention how to pin the apt/dnf/apk/pacman packages with my […]”.
The extremely popular Signal messenger app added upstream support for the SOURCE_DATE_EPOCH environment variable this month. This means that release tarballs of the Signal desktop client do not embed nondeterministic release information. […][…]
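As a rough illustration of how a build script can honour SOURCE_DATE_EPOCH, the following Python sketch falls back to the current time only when the variable is unset. The function name build_timestamp is hypothetical, and this is not Signal's implementation.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> datetime:
    """Return the timestamp a build should embed.

    If SOURCE_DATE_EPOCH is set, use it (seconds since the Unix epoch, UTC),
    so that repeated builds embed the same value. Otherwise fall back to the
    current time, which makes the output nondeterministic.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    seconds = int(epoch) if epoch is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Example with an explicit environment mapping instead of the real one.
stamp = build_timestamp({"SOURCE_DATE_EPOCH": "1672531200"})
print(stamp.isoformat())
```

Passing the environment as a parameter is only a convenience for testing; a real build tool would typically read os.environ directly.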
F-Droid & Android
There was a very large number of changes in the F-Droid and wider Android ecosystem this month:
On January 15th, a blog post entitled Towards a reproducible F-Droid was published on the F-Droid website, outlining the reasons why “F-Droid signs published APKs with its own keys” and how reproducible builds allow using upstream developers’ keys instead. In particular:
In response to […] criticisms, we started encouraging new apps to enable reproducible builds. It turns out that reproducible builds are not so difficult to achieve for many apps. In the past few months we’ve gotten many more reproducible apps in F-Droid than before. Currently we can’t highlight which apps are reproducible in the client, so maybe you haven’t noticed that there are many new apps signed with upstream developers’ keys.
F-Droid added 13 apps published with reproducible builds this month. […]
FC Stegerman outlined a bug where baseline.profm files are nondeterministic, developed a workaround, and provided all the details required for a fix. As they note, this issue has now been fixed but the fix is not yet part of an official Android Gradle plugin release.
FC Stegerman also announced the 0.2.1 release of reproducible-apk-tools, a suite of tools to help make .apk files reproducible. Several new subcommands and scripts were added, and a number of bugs were fixed as well […][…]. They also updated the F-Droid website to improve the reproducibility-related documentation. […][…]
On the F-Droid issue tracker, FC Stegerman discussed reproducible builds with one of the developers of the Threema messenger app and reported that Android SDK build-tools 32.0.0 (unlike earlier and later versions) have a zipalign command that produces incorrect padding.
A number of bugs related to reproducibility were discovered in Android itself: firstly, the non-deterministic order of .apk files […], and secondly, newline differences between building on Windows versus Linux that can make builds unreproducible as well. […] (Note that these links may require a Google account to view.)
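Ordering problems of this kind are usually addressed by writing archive entries in a fixed order with fixed metadata. The Python sketch below assumes the issue concerns the order in which entries are written into a ZIP-style archive; deterministic_zip and FIXED_TIME are hypothetical names, and this is not how the Android build tools work internally.

```python
import io
import zipfile

# Hypothetical fixed timestamp (the earliest value the ZIP format can store).
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def deterministic_zip(files) -> bytes:
    """Build a ZIP from a {name: bytes} mapping with stable ordering and metadata."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort names so the result does not depend on insertion or
        # filesystem traversal order.
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3           # pin the "created on" platform field
            info.external_attr = 0o644 << 16  # pin the stored file permissions
            archive.writestr(info, files[name])
    return buffer.getvalue()


first = deterministic_zip({"b.txt": b"two\n", "a.txt": b"one\n"})
second = deterministic_zip({"a.txt": b"one\n", "b.txt": b"two\n"})
print("archives identical:", first == second)
```

Newline differences between platforms are a separate matter and would need to be normalised in the text inputs before they reach a step like this.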
And just before the end of the month, FC Stegerman started a thread on our mailing list on the topic of hiding data/code in APK embedded signatures, which has been made possible by the Android APK Signature Scheme v2/v3. As part of this, they made an Android app called sigblock-code-poc that reads the APK Signing Block of its own APK and extracts a payload in order to alter its behaviour.
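To make the mechanism more concrete, here is a hedged Python sketch that lists the ID-value pairs stored in an APK Signing Block, following the layout described in the Android APK Signature Scheme v2 documentation (a size field, the pairs, a repeated size field, then a 16-byte magic placed directly before the ZIP central directory). The function name is hypothetical, the end-of-central-directory search is simplified, and this is not the code used in sigblock-code-poc.

```python
import struct

APK_SIG_BLOCK_MAGIC = b"APK Sig Block 42"
EOCD_SIGNATURE = b"PK\x05\x06"


def read_signing_block_pairs(apk: bytes) -> dict:
    """Return {id: value} for each ID-value pair in the APK Signing Block."""
    # Simplification: take the last EOCD signature in the file. A robust
    # parser would also validate the record and its comment length.
    eocd = apk.rfind(EOCD_SIGNATURE)
    if eocd < 0:
        raise ValueError("no ZIP End of Central Directory record found")
    # The central directory offset is a uint32 at byte 16 of the EOCD record.
    (cd_offset,) = struct.unpack_from("<I", apk, eocd + 16)
    # The signing block, if present, ends with its magic right before the
    # central directory.
    if cd_offset < 32 or apk[cd_offset - 16:cd_offset] != APK_SIG_BLOCK_MAGIC:
        return {}
    # The trailing size field counts everything after the leading size field.
    (block_size,) = struct.unpack_from("<Q", apk, cd_offset - 24)
    pos = cd_offset - block_size  # first pair, just after the leading size field
    end = cd_offset - 24          # start of the trailing size field
    pairs = {}
    while pos < end:
        (pair_len,) = struct.unpack_from("<Q", apk, pos)     # length of ID + value
        (pair_id,) = struct.unpack_from("<I", apk, pos + 8)
        pairs[pair_id] = apk[pos + 12:pos + 8 + pair_len]
        pos += 8 + pair_len
    return pairs
```

A pair whose ID is not one of the known signature scheme IDs is simply carried along in the block, which is what makes it usable as a container for extra data.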
As mentioned in last month’s report, Vagrant Cascadian has been organising a series of online sprints in order to ‘clear the huge backlog of reproducible builds patches submitted’ by performing NMUs (Non-Maintainer Uploads). During January, a sprint took place on the 10th, resulting in a number of uploads.
Elsewhere in Debian, strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, version 1.13.1-1 was uploaded to Debian unstable by Holger Levsen, including a fix by FC Stegerman (obfusk) to update a regular expression for the latest version of file(1) […]. (#1028892)
Lastly, 65 reviews of Debian packages were added, 21 were updated and 35 were removed this month, adding to our knowledge about identified issues.
In other distributions:
Bernhard M. Wiedemann published another monthly report for reproducibility within openSUSE, as well as a belated report for December 2022.
Finally, an existing tool called rpmreproduce was (re-)discovered this month, which claims that “given a buildinfo file from a RPM package, [it can] generate instructions for attempting to reproduce the binary packages built from the associated source and build information.”
diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes to diffoscope, including preparing and uploading version 234 to Debian:
- No need for the from __future__ import print_function import anymore. […]
- Comment and tidy the
- Split inline Python code to generate test Recommends into a separate Python script. […]
- Update debian/tests/control after merging PyPDF support. […]
- Correctly catch segfaulting
- Drop some old debugging code. […]
- Allow ICC tests to (temporarily) fail. […]
In addition, FC Stegerman (obfusk) made a number of changes, including:
- Updating the test_text_proper_indentation test to support the latest version(s) of […]
- Use an extras_require.json file to store some build/release metadata, instead of accessing the internet. […]
- Updating an APK-related file(1) regular expression. […]
- On the diffoscope.org website, de-duplicate contributors by e-mail. […]
The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. This month, we wrote a large number of such patches, including:
Bernhard M. Wiedemann:
- asyncpg (fails to build in 2032)
- cpython (fails to build in 2038)
- django (fails to build in 2038)
- libarchive (fails to build in 2038)
- libarchive (fails to build in 2038)
- mbedtls (fails to build in 2023)
- mozilla-nss (fails to build in 2023)
- ocaml-rpm-macros (fix fallout from an RPM-related change)
- perl HTTP::Cookies (fails to build in 2038)
- python-aiosmtplib/python-trustme (fails to build in 2038 due to SSL certificate)
- python-bmap (fails to build in 2024)
- python-compileall2 (fails to build in 2038)
- python-tasklib (fails to build in 2038)
- taskwarrior (fix fails to build in 2038)
- wrk (hash ordering issue)
- xemacs (fails to build in 2038 stuck)
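Several of the entries above concern builds that break once the clock passes a certain year. One well-known cause of 2038 failures is a timestamp stored as a signed 32-bit count of seconds since 1970, which the short Python sketch below illustrates. The individual patches may well address other causes (for example, expiring test certificates), and the helper name fits_in_32bit_time is hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def fits_in_32bit_time(moment: datetime) -> bool:
    """Report whether a moment can be stored as signed 32-bit epoch seconds."""
    seconds = int((moment - EPOCH).total_seconds())
    try:
        struct.pack("<i", seconds)  # raises struct.error when out of range
    except struct.error:
        return False
    return True


# The last second that a signed 32-bit counter can represent.
last_representable = EPOCH + timedelta(seconds=INT32_MAX)
print(last_representable.isoformat())
print(fits_in_32bit_time(last_representable))
print(fits_in_32bit_time(last_representable + timedelta(seconds=1)))
```

Running a build with the system clock set into the future is how this class of problem tends to be found before the date actually arrives.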
The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In January, the following changes were made by Holger Levsen:
- Update the chroot-run script to correctly manage […]
- Update the Jenkins ‘shell monitor’ script to collect disk stats less frequently […] and to include various directory stats. […][…]
- Update the ‘real’ year in the configuration in order to be able to detect whether a node is running in the future or not. […]
- Bump copyright years in the default page footer. […]
- Update the
In addition, Christian Marangi submitted a patch to build OpenWrt packages with the V=s flag to enable debugging. […]
If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can get in touch with us via: