Welcome to the report from the Reproducible Builds project for January 2021. In our reports we outline the most important things that have happened in the world of reproducible builds in the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.
There has been further discussion in security circles around the recent ‘SolarWinds’ supply-chain attack (covered in our report last month). This month, however, David A. Wheeler posted an article on the Linux Foundation’s blog titled Preventing Supply Chain Attacks like SolarWinds.
Noting that “assuming a system can never be broken into is a failing strategy”, David continues:
In the longer term, I know of only one strong countermeasure for this kind of attack: verified reproducible builds. A “reproducible build” is a build that always produces the same outputs given the same inputs so that the build results can be verified. A verified reproducible build is a process where independent organizations produce a build from source code and verify that the built results come from the claimed source code. Almost all software today is not reproducible, but there’s work to change this.
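The verification step David describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `sha256_of` and `builds_agree` are hypothetical, and it assumes each independent builder publishes a single artifact file that can be compared byte for byte.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_agree(artifacts: list[Path]) -> bool:
    """True if at least two artifacts were given and all are bit-for-bit identical."""
    if len(artifacts) < 2:
        raise ValueError("need at least two independent builds to compare")
    return len({sha256_of(path) for path in artifacts}) == 1
```

If the digests differ, a tool such as diffoscope can then be used to find out where the outputs diverge.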
The Bootstrappable Builds project was started as an offshoot of the Reproducible Builds project during the latter’s 2016 summit in Berlin. A bootstrappable build takes the idea of reproducibility one step further, in some sense. The build of a target binary can be reproduced alongside the build of the tools required to do so. It is, conceptually, almost like building a house from a large collection of atoms of different elements.
Building software depends on the tools used to construct the binary, including compilers and build-automation tools, many of which depend on pre-existing binaries. Minimizing the reliance on opaque binaries for building our software ecosystem is the goal of the Bootstrappable Builds project.
The full article is available on the LWN website.
Outreachy is an initiative that funds three-month remote internships in free and open source software, with a focus and background on supporting diversity. The Reproducible Builds project is considering joining this round, and is seeking input and ideas for good proposals.
Examples of the kind of projects we are looking for include workflow changes, large refactoring work, new features of our tools, specific reproducibility fixes and so on. Ideas should fit in that sweet spot of requiring more time and energy than a weekend project, while not being so complicated that they would take forever. For more information, please see Mattia’s announcement on our mailing list.
In recent months there has been preparatory work to enable the reproducible=+fixfilepath build flag by default; enabling this fixfilepath feature flag should fix reproducibility issues in an estimated 500-700 packages. In January, however, Guillem Jover uploaded dpkg version 1.20.6 to Debian unstable with this flag enabled. Although a bug (#979570) was subsequently filed by Lisandro Damián Nicanor Pérez Meyer with the initial intention of pausing this change due to a problem with the Qt toolkit, it was closed after extensive discussion.
In recent weeks, Holger Levsen has been re-uploading a large number of Debian packages in an attempt to ensure they all have a related .buildinfo file. Holger described his rationale and approach in a blog post in December titled On doing 540 no-source-change source-only uploads in two weeks. In January, however, Holger performed 2,940 of these uploads, bringing the number of packages in Debian bullseye that lack these files down to eleven (from over 3,500). Holger wrote about his progress on our mailing list, where he also describes how he intends to eliminate the remaining packages.
Lukas Puehringer, Frédéric Pierret and Holger Levsen collaborated to upload apt-transport-in-toto version 0.1.0 into the unstable distribution, and Lukas Puehringer prepared packages of in-toto version 1.0.0 and python-securesystemslib 0.18.0 for the unstable distribution.
35 reviews of Debian packages were added, 58 were updated and 49 were removed this month, adding to our extensive knowledge about identified issues. Chris Lamb identified two issue categories, including nondeterminstic_todo_identifiers_in_documentation_generated_by_doxygen. Thorsten Glaser also added a new uid_and_gid_in_cmake-generated_pkzip issue type […].
Bernhard M. Wiedemann posted his monthly reproducible builds status report for openSUSE which mentions amongst other things that “4.10% of packages are not perfectly reproducible”.
Jelle van der Waa posted an overview of Arch Linux’s work on reproducible builds during 2020. Titled Arch Linux Reproducible Builds Progress 2020, it mentions (for example) that their rebuilderd tool has seen 13 releases since March 2020.
reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, the following changes were made:
- Upload to Debian unstable. […]
diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it provides human-readable diffs from many kinds of binary formats. This month, Chris Lamb made a large number of changes (including releasing version 164, version 165 and version 166):
sys.argv in our top-level temporary directory, in case it helps debug why temporary directories might not get cleaned up. […]
- Collapse the
--extended-filesystem-attributes to cover all of these extended attributes, defaulting the new option to false (i.e. to not check these expensive external calls). […][…]
- Show the ‘fuzziness’ amount in percentage terms, not out of the rather-arbitrary ‘400’. […]
- Improve help text for the
- Wrap our external call to cmp(1) with a missing profiling point. […]
jsondiff differences at 512 bytes, in case they consume the entire page. […]
- Improve the logging around fuzzy matching. […]
- Clarify in a comment that __del__ is not always called in Python, so temporary directories are not necessarily removed the moment they go out of scope. […]
- Print the free space in our temporary directory when we create it, not from within
- Tidy the
- Add a note regarding the special ordering of test_all_tools_are_listed within that module. […]
Other changes were made by:
- Introduce the --no-xattr arguments (later collapsed to --extended-filesystem-attributes by Chris Lamb) to improve performance. […]
- Avoid calling the external stat command. […]
- Avoid invoking external diff command for short outputs that are identical. […]
- Log when the cmp command is spawned. […]
- Improve performance of the has_same_content routine by spawning cmp less frequently. […]
- Cleanup the FIFO files when our context manager exits. […]
- Add missing
otool external tools, and add a test to make sure they are all listed. […]
- Fix a possible crash in the
- Filter the content of the
- Ignore/hide the DeprecationWarning pertaining to the imp module deprecation as it comes from a 3rd-party library. […]
- Add a pytest.ini to explicitly generate JUnit’s xunit2 format. […]
- Override several Lintian warnings regarding prebuilt test binaries existing in the source tree. […]
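Several of the changes above reduce how often external commands such as cmp are spawned. As a rough illustration of that idea only (this is not diffoscope's actual code, and the function below merely borrows the has_same_content name), two regular files can be compared in-process by checking their sizes first and then reading them in chunks:

```python
import os


def has_same_content(path_a: str, path_b: str, chunk_size: int = 65536) -> bool:
    """Compare two regular files byte for byte without spawning a subprocess."""
    # Files of different sizes cannot be identical, so skip reading them.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached end-of-file together
                return True
```

The size check is the cheap path; the chunked loop stops at the first difference, so large files that diverge early are not read in full.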
strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, Chris Lamb ensured that the tool does not process unwritable files, printing a warning in this case (#980356), and made a number of codebase improvements, including reflowing logic to make larger future changes easier. […]
disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into system calls to reliably flush out reproducibility issues. This month, Chris Lamb updated the benchmarking tools to call a tool that will call stat(2) repeatedly […] and Frédéric Pierret added an RPM spec file […] as well as the ability to prepend flags in CXXFLAGS […]. Holger Levsen uploaded these changes to Debian unstable as version 0.5.11-1. disorderfs was also featured on Hacker News.
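One class of issue that a filesystem like disorderfs helps to expose is build code that relies on the order in which directory entries happen to be returned. The following is a minimal sketch of the usual fix on the build side, assuming a hypothetical `manifest` helper that lists the files to be packaged:

```python
import os
from pathlib import Path


def manifest(root: Path) -> list[str]:
    """List files under root in an order that does not depend on the filesystem."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes os.walk descend into subdirectories in a fixed order.
        dirnames.sort()
        for name in sorted(filenames):
            entries.append(Path(dirpath, name).relative_to(root).as_posix())
    return entries
```

Without the two sorts, the result follows whatever order the underlying filesystem returns, which can differ between machines and between builds.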
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
Bernhard M. Wiedemann:
- blobby (Zip file embeds creation time)
- dateutils (fix future build failure, also filed in openSUSE)
- hamcrest (sort list by Pedro Monreal Gonzalez)
- libcec (embeds hostname and user)
- libpinyin (Address space layout randomisation issue)
- pythia (embeds the build date)
- python-distributed (build fails on single-CPU system)
- python-distributed (build fails in the future due to expired SSL certificate)
- ruby3.0 (embeds date, process ID, etc.)
- tiptop (embeds date and hostname)
- vision (sort filesystem directory ordering)
- #950419 filed against m4 (includes date in generated documentation and
- #951031 filed against libtool (remove dates from
- #968627 filed against libjpeg-turbo (use UTC timestamp as the build date)
- #976307 filed against sudo (different binaries when built on
- #978499 filed against
SOURCE_DATE_EPOCH for timestamps in PDF files).
- #979019, #979021, #979023 and #979024 filed against
- #979112 filed against
- #979125 filed against
gfxboot(uploaded to Debian)
- #979593 filed against
gfxboot(#49): make the example themes reproducible.
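Many of the patches above deal with embedded build dates. A common fix is to honour the SOURCE_DATE_EPOCH environment variable and to format the result in UTC. The following is a minimal sketch of that pattern, not the code from any of the patches listed; the `build_timestamp` name and its `environ` parameter are hypothetical.

```python
import os
import time
from collections.abc import Mapping
from datetime import datetime, timezone


def build_timestamp(environ: Mapping[str, str] = os.environ) -> datetime:
    """Return the date to embed in build output, as an aware UTC datetime."""
    epoch = environ.get("SOURCE_DATE_EPOCH")
    # Fall back to the current time only when no reproducible date was provided.
    seconds = int(epoch) if epoch is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Using UTC avoids a second source of variation, namely the timezone of the build machine.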
Hans-Christoph Steiner (F-Droid):
- Sync copy of debrebuild after a new merge request. […]
- In the rebuilder tool, enable debugging […], call debrebuild with --timestamp=metasnap […] and deal with .buildinfo files being created in current dir […].
- Add an additional maintainer address for the Debian Multimedia Maintainers team […] and update the maintainer address for OpenStack packages […].
- Don’t attempt to install diffoscope from sid on stretch. […]
- Detect diskspace issues on the main Jenkins node. […]
- In Arch Linux testing, drop support for
pkg.tar.zst has taken over. […]
- Create a preliminary README.txt for buildinfos.debian.net. […]
- Update hard-coded instance of
2020 - happy new year! […]
- Update the
- Update the
- Use a lockfile to ensure builders do not start when not required. […][…]
- When powercycling arm64 nodes, use the unprivileged user instead of what is locally configured as ‘root’. […]
- Remove the “Static Analysis Utilities” Jenkins plugin. […]
- Update the deployment tool to set the correct HOME environment variable when running Git to avoid printing warnings. […]
- Perform a large number of Ubuntu-related configuration changes. […][…][…][…][…][…][…]
- Update various host lists. […][…]
- (Re-)add some handy shortcuts. […]
Chris Lamb updated the main Reproducible Builds website and documentation, including adding a missing image […], and updated a script to ignore commits that start with, for example, ‘2020 12’ when generating commit listings […].
On our mailing list this month:
Fredrik Strömberg posted to the list to mention that he has been working for many years on establishing trust between end-users and service operators and infrastructure. In particular, he has been working on a new security architecture called System Transparency. Fredrik has previously written two introductory blog posts, System Transparency is the future and Open-source firmware is the future, which outline his thinking in more detail, and he intends that System Transparency “enters production use sometime this year”.
Felix C. Stegerman started a discussion around deterministic Python compiled bytecode files to result in reproducible Android packages, which attracted a number of replies. Michael Biebl also got in touch to ask for help on making the systemd package build reproducibly on the various ARM architectures in Debian. […]
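One well-known source of variation in compiled Python bytecode is the source file's modification time, which the default .pyc format stores in its header. The sketch below is an illustration only and is not taken from the mailing list thread; it uses the standard library's py_compile module to check whether a given invalidation mode yields the same bytes after the source's mtime changes. The helper names and the example.py file are hypothetical.

```python
import os
import py_compile
import tempfile
from pathlib import Path
from py_compile import PycInvalidationMode


def compile_to_bytes(source: Path, target: Path, mode: PycInvalidationMode) -> bytes:
    """Compile source to target with the given invalidation mode; return the .pyc bytes."""
    py_compile.compile(str(source), cfile=str(target), doraise=True, invalidation_mode=mode)
    return target.read_bytes()


def pyc_ignores_mtime(mode: PycInvalidationMode) -> bool:
    """True if changing the source's mtime leaves the compiled bytes unchanged."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "example.py"
        source.write_text("VALUE = 42\n")
        outputs = []
        for index, mtime in enumerate((1_000_000_000, 1_500_000_000)):
            os.utime(source, (mtime, mtime))  # simulate two builds at different times
            outputs.append(compile_to_bytes(source, Path(tmp) / f"out{index}.pyc", mode))
        return outputs[0] == outputs[1]
```

Calling `pyc_ignores_mtime` with `PycInvalidationMode.UNCHECKED_HASH` and with `PycInvalidationMode.TIMESTAMP` shows the difference between the hash-based and timestamp-based headers. This covers only the header; it does not address other possible sources of variation in the bytecode itself.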
We first wrote about the Threema messaging application back in September 2020. This month, however, the Threema developers continued a discussion on our mailing list, particularly around applications on various application stores.
Lastly, David Wheeler asked ‘how could we accelerate [the] deployment of verified reproducible builds?’, which was related to another thread continued from December, Attack on SolarWinds could have been countered by reproducible builds, regarding how to leverage media coverage to get ‘buy in’ from upstream developers.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via: