Welcome to the July 2024 report from the Reproducible Builds project!
In our reports, we outline what we’ve been up to over the past month and highlight news items in software supply-chain security more broadly. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.
Table of contents:
- Reproducible Builds Summit 2024
- Pulling Linux up by its bootstraps
- Towards Idempotent Rebuilds?
- AROMA: Automatic Reproduction of Maven Artifacts
- Community updates
- Android Reproducible Builds at IzzyOnDroid with rbtlog
- Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems
- Development news
- Website updates
- Upstream patches
- Reproducibility testing framework
Reproducible Builds Summit 2024
Last month, we were very pleased to announce the upcoming Reproducible Builds Summit, set to take place from September 17th — 19th 2024 in Hamburg, Germany. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving.
If you’re interested in joining us this year, please make sure to read the event page, which has more details about the event and location. We are very much looking forward to seeing many readers of these reports there.
“Pulling Linux up by its bootstraps” (LWN)
In a recent edition of Linux Weekly News, Daroc Alden has written an article on “bootstrappable” builds. Starting with a brief introduction that…
… a bootstrappable build is one that builds existing software from scratch — for example, building GCC without relying on an existing copy of GCC. In 2023, the Guix project announced that the project had reduced the size of the binary bootstrap seed needed to build its operating system to just 357-bytes — not counting the Linux kernel required to run the build process.
The article goes on to describe that “now, the live-bootstrap project has gone a step further and removed the need for an existing kernel at all”, and concludes:
The real benefit of bootstrappable builds comes from a few things. Like reproducible builds, they can make users more confident that the binary packages downloaded from a package mirror really do correspond to the open-source project whose source code they can inspect. Bootstrappable builds have also had positive effects on the complexity of building a Linux distribution from scratch […]. But most of all, bootstrappable builds are a boon to the longevity of our software ecosystem. It’s easy for old software to become unbuildable. By having a well-known, self-contained chain of software that can build itself from a small seed, in a variety of environments, bootstrappable builds can help ensure that today’s software is not lost, no matter where the open-source community goes from here
Towards Idempotent Rebuilds?
Trisquel developer Simon Josefsson wrote an interesting blog post comparing the output of the .deb files from our tests.reproducible-builds.org testing framework and the ones in the official Debian archive. Following up from a previous post on the reproducibility of Trisquel, Simon notes that “typically [the] rebuilds do not match the official packages, even when they say the package is reproducible”. Simon correctly identifies that “the purpose of [these] rebuilds are not to say anything about the official binary build, instead the purpose is to offer a QA service to maintainers by performing two builds of a package and declaring success if both builds match.”
However, Simon’s post swiftly moves on to announce a new tool called debdistrebuild that performs rebuilds of the difference between two distributions in a GitLab pipeline and displays diffoscope output for further analysis.
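The “two builds must match” check that Simon describes can be sketched roughly as follows. This is only an illustration of the idea, not the code used by tests.reproducible-builds.org or debdistrebuild; the function names and the two-directory layout are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_builds(first: Path, second: Path) -> dict[str, bool]:
    """Map each .deb name found in either directory to whether both
    builds produced a byte-for-byte identical file (hypothetical helper)."""
    names = {p.name for p in first.glob("*.deb")}
    names |= {p.name for p in second.glob("*.deb")}
    results = {}
    for name in sorted(names):
        a, b = first / name, second / name
        # A package missing from either build counts as a mismatch.
        results[name] = (
            a.is_file() and b.is_file() and sha256_of(a) == sha256_of(b)
        )
    return results
```

Note that a match here only says the two local builds agree with each other; as the post points out, it says nothing about whether either matches the package in the official archive.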
AROMA: Automatic Reproduction of Maven Artifacts
Mehdi Keshani, Tudor-Gabriel Velican, Gideon Bot and Sebastian Proksch of the Delft University of Technology, Netherlands, have published a new paper in the ACM Software Engineering on a new tool to automatically reproduce Apache Maven artifacts:
Reproducible Central is an initiative that curates a list of reproducible Maven libraries, but the list is limited and challenging to maintain due to manual efforts. [We] investigate the feasibility of automatically finding the source code of a library from its Maven release and recovering information about the original release environment. Our tool, AROMA, can obtain this critical information from the artifact and the source repository through several heuristics and we use the results for reproduction attempts of Maven packages. Overall, our approach achieves an accuracy of up to 99.5% when compared field-by-field to the existing manual approach [and] we reveal that automatic reproducibility is feasible for 23.4% of the Maven packages using AROMA, and 8% of these packages are fully reproducible.
Community updates
On our mailing list this month:
- Nichita Morcotilo reached out to the community, first to share their efforts “to build reproducible packages cross-platform with a new build tool called rattler-build”, noting that “as you can imagine, building packages reproducibly on Windows is the hardest challenge (so far!)”. Nichita goes on to mention that the Apple ecosystem appears to be using ZERO_AR_DATE over SOURCE_DATE_EPOCH. […]
- Roland Clobus announced that the Debian bookworm 12.6 live images are “nearly reproducible”, with more detail in the post itself and input in the thread from other contributors.
- As reported in last month’s report, Pol Dellaiera completed his master’s thesis on Reproducibility in Software Engineering at the University of Mons, Belgium. This month, Pol announced this on the list with more background info. Since the thesis sources became available, it has received some feedback and contributions. As a result, an updated version of the thesis has been published containing those community fixes.
- Daniel Gröber asked for help in getting the Yosys documentation to build reproducibly, citing issues in, inter alia, the PDF generation causing differing CreationDate metadata values.
- James Addison continued his long journey towards getting the Sphinx documentation generator to build reproducible documentation. In this thread, James concerns himself with the problem that even “when SOURCE_DATE_EPOCH is configured, Sphinx projects that have configured their copyright notices using dynamic elements can produce nonsensical output under some circumstances.” James’ query ended up generating a number of replies.
- Allen ‘gunner’ Gunner posted a brief update on the progress the core team is making towards introducing a Code of Conduct (CoC) such that it is “in place in time for the RB Summit in Hamburg in September”. In particular, gunner asks “if you are interested in helping with CoC design and development in the weeks ahead, simply email rb-core@lists.reproducible-builds.org and let us know”. […]
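Several of the threads above revolve around SOURCE_DATE_EPOCH. As a rough illustration of how a build script might derive a date from that variable instead of the wall clock, here is a minimal sketch. It is not Sphinx’s implementation; the function name and the notice text are hypothetical.

```python
import os
import time


def build_year(environ=os.environ) -> int:
    """Return the year to embed in generated output.

    Uses SOURCE_DATE_EPOCH (seconds since the Unix epoch, UTC) when it is
    set, and falls back to the current time otherwise.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    timestamp = int(epoch) if epoch else time.time()
    # gmtime avoids any dependence on the build machine's timezone.
    return time.gmtime(timestamp).tm_year


# Hypothetical notice; a real project would use its own wording and start year.
copyright_notice = f"2015-{build_year()}, Example Project"
```

With the variable pinned, two builds run in different calendar years embed the same year, which is the property the reproducibility checks rely on.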
Android Reproducible Builds at IzzyOnDroid with rbtlog
On our mailing list, Fay Stegerman announced a new Reproducible Builds collaboration in the Android ecosystem:
We are pleased to announce “Reproducible Builds, special client support and more in our repo”: a collaboration between various independent interoperable projects: the IzzyOnDroid team, 3rd-party clients Droid-ify & Neo Store, and rbtlog (part of my collection of tools for Android Reproducible Builds) to bring Reproducible Builds to IzzyOnDroid and the wider Android ecosystem.
Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems
Congratulations to Marina Moore of the New York University Tandon School of Engineering who has submitted her PhD thesis on Extending the Scalability, Flexibility and Responsiveness of Secure Software Update Systems. The introduction outlines its contributions to the field:
[S]oftware repositories are a vital component of software development and release, with packages downloaded both for direct use and to use as dependencies for other software. Further, when software is updated due to patched vulnerabilities or new features, it is vital that users are able to see and install this patched version of the software. However, this process of updating software can also be the source of attack. To address these attacks, secure software update systems have been proposed. However, these secure software update systems have seen barriers to widespread adoption. The Update Framework (TUF) was introduced in 2010 to address several attacks on software update systems including repository compromise, rollback attacks, and arbitrary software installation. Despite this, compromises continue to occur, with millions of users impacted by such compromises. My work has addressed substantial challenges to adoption of secure software update systems grounded in an understanding of practical concerns. Work with industry and academic communities provided opportunities to discover challenges, expand adoption, and raise awareness about secure software updates. […]
Development news
In Debian this month, 12 reviews of Debian packages were added, 13 were updated and 6 were removed, adding to our knowledge about identified issues. A new toolchain issue type was identified as well, specifically ordering_differences_in_pkg_info.
Colin Percival filed a bug against the LLVM compiler noting that building i386 binaries produces different output depending on whether the compiler itself is running on i386 or on amd64. The cause was narrowed down to “x87 excess precision, which can result in slightly different register choices when the compiler is hosted on x86_64 or i386”, and a fix was committed. […]
Fay Stegerman performed some in-depth research surrounding her apksigcopier tool, after some Android .apk files signed with the latest apksigner could no longer be verified as reproducible. Fay identified the issue as follows:
Since build-tools >= 35.0.0-rc1, backwards-incompatible changes to apksigner break apksigcopier as it now by default forcibly replaces existing alignment padding and changed the default page alignment from 4k to 16k (same as Android Gradle Plugin >= 8.3, so the latter is only an issue when using older AGP). […]
She documented multiple available workarounds and filed a bug in Google’s issue tracker.
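To see why a change in the default page alignment alters the bytes of an archive, consider the padding arithmetic alone. The sketch below is a simplified illustration, not apksigner’s code, and the offset is a made-up value.

```python
def padding_for(offset: int, alignment: int) -> int:
    """Bytes of padding needed so that offset + padding is a multiple
    of alignment."""
    return -offset % alignment


# Hypothetical offset at which an entry's data would otherwise start.
offset = 5000

for page_size in (4096, 16384):  # 4k versus 16k alignment
    pad = padding_for(offset, page_size)
    print(f"alignment {page_size}: {pad} bytes of padding, "
          f"data starts at {offset + pad}")
```

The same entry receives a different amount of padding under each alignment, so every later offset in the file shifts and a signature copied from one variant no longer lines up with the other.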
Lastly, diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb uploaded version 272 and Mattia Rizzolo uploaded version 273 to Debian, and the following changes were made as well:
- Chris Lamb:
  - Ensure that the convert utility is from ImageMagick version 6.x. The command-line interface has seemingly changed with the 7.x series of ImageMagick. […]
  - Factor out version detection in test_jpeg_image. […]
  - Correct the import of the identify_version method after a refactoring change in a previous commit. […]
  - Move away from using DSA OpenSSH keys in tests as support has been deprecated and removed in OpenSSH version 9.8p1. […]
  - Move to assert_diff in the test_openssh_pub_key package. […]
  - Update copyright years. […]
- Mattia Rizzolo:
  - Add support for ffmpeg version 7.x which adds some extra context to the diff. […]
  - Rework the handling of OpenSSH testing of DSA keys if OpenSSH is strictly 9.7, and add an OpenSSH key test with an ed25519-format key. […][…][…]
  - Temporarily disable a few packages that are not available in Debian testing. […][…]
  - Stop ignoring the results of Debian testing in the continuous integration system. […]
  - Adjust options in debian/source to make sure not to pack the Python sdist directory into the binary Debian package. […]
  - Adjust Lintian overrides. […]
Website updates
There were a number of improvements made to our website this month, including:
- Bernhard M. Wiedemann updated the SOURCE_DATE_EPOCH page to include instructions on how to create reproducible .zip files from within Python using the zipfile module. […]
- Chris Lamb fixed a potential duplicate heading on the Projects page. […]
- Fay Stegerman added rbtlog to the Tools page […] and IzzyOnDroid to the Projects page […], also ensuring that the latter page was always sorted regardless of the ordering within the input data files. […]
- Holger Levsen added Linus Nordberg to our global list of contributors […] as well as made a number of changes to the page for the upcoming Reproducible Builds summit later this year […][…][…][…].
- Mattia Rizzolo updated the Civil Infrastructure Platform logo […] and also updated the 2024 summit page […][…].
- Nichita Morcotilo added rattler-build to the Projects page. […][…][…]
- Pol Dellaiera updated the Academic Publications page, adding two publications. […][…]
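For readers curious what creating a reproducible .zip file with Python’s zipfile module can look like, the following is a minimal sketch. It is not the text that was added to the SOURCE_DATE_EPOCH page; the function name, the fixed 0644 permissions and the 1980 fallback date are assumptions made for the example.

```python
import os
import time
import zipfile
from pathlib import Path


def write_reproducible_zip(source_dir: Path, archive: Path) -> None:
    """Pack source_dir into archive without leaking build-time metadata
    (hypothetical helper)."""
    # Fall back to 1980-01-01 UTC if SOURCE_DATE_EPOCH is not set.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "315532800"))
    # ZIP timestamps cannot represent dates before 1980, so clamp.
    date_time = max(time.gmtime(epoch)[:6], (1980, 1, 1, 0, 0, 0))

    # Sort so that filesystem iteration order does not affect the output.
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())

    with zipfile.ZipFile(archive, "w") as zf:
        for path in files:
            name = path.relative_to(source_dir).as_posix()
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3          # "Unix", regardless of build host
            info.external_attr = 0o644 << 16  # fixed permissions, no umask leak
            zf.writestr(info, path.read_bytes())
```

Building each entry from an explicit ZipInfo, rather than calling ZipFile.write on the path, is what keeps the files’ on-disk modification times and permissions out of the archive. The compressed bytes can still differ between zlib versions, so this only guarantees identical output within a given toolchain.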
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
- Bernhard M. Wiedemann:
  - armagetron (date)
  - blaspp (hostname)
  - cligen (GnuTLSs date)
  - cloudflared (date)
  - dpdk (Sphinx doctrees)
  - fonttosfnt/xorg-x11-fonts (toolchain, date)
  - gegl (build machine details)
  - gettext-runtime (jar mtime)
  - kf6-kirigami+kf6-qqc2-desktop-style (race-condition)
  - kubernetes1.26 (backport upstream fix for random path)
  - lapackpp (hostname)
  - latex2html (nochecks)
  - libdb-4_8 (.jar modification time)
  - librcc (already merged upstream)
  - libreoffice (strip .jar mtimes + clucene-core toolchain)
  - maliit-keyboard (nocheck)
  - nautilus (date)
  - openblas (CPU type, fixed)
  - openssl-3 (random-related issue)
  - python-ruff (ASLR)
  - python3 (date, parallelism/race)
  - reproducible-faketools (0.5.2)
  - sphinx (GZip modification time)
  - sphinxcontrib (gzip mtime)
- Chris Lamb:
- Fridrich Strba:
- Evangelos Ribeiro Tzaras:
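Several of the patches above concern gzip modification times. As a rough illustration of that class of fix (not taken from any of the patches listed), Python’s gzip module lets the caller pin the timestamp that would otherwise be read from the clock; the helper name is hypothetical.

```python
import gzip
import io


def gzip_deterministic(data: bytes, mtime: int = 0) -> bytes:
    """Compress data so that the gzip header carries a fixed timestamp
    and no file name (hypothetical helper)."""
    buffer = io.BytesIO()
    # filename="" keeps a source file name out of the header;
    # mtime pins the 4-byte MTIME field instead of using the current time.
    with gzip.GzipFile(filename="", fileobj=buffer, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buffer.getvalue()


first = gzip_deterministic(b"example payload")
second = gzip_deterministic(b"example payload")
print(first == second)
```

A build system could pass the value of SOURCE_DATE_EPOCH as mtime instead of 0 if a meaningful timestamp is preferred.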
Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In July, a number of changes were made by Holger Levsen, including:
- Grant bremner access to the ionos7 node. […][…]
- Perform a dummy change to force update of all jobs. […][…]
In addition, Vagrant Cascadian performed some necessary node maintenance of the underlying build hosts. […]
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Mastodon: @reproducible_builds@fosstodon.org
- Mailing list: rb-general@lists.reproducible-builds.org
- Twitter: @ReproBuilds