Welcome to the report from the Reproducible Builds project for January 2021. In our reports we outline the most important things that have happened in the world of reproducible builds in the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.
There has been further discussion in security circles around the recent ‘SolarWinds’ supply-chain attack (covered in our report last month). This month, however, David A. Wheeler posted an article on the Linux Foundation’s blog titled Preventing Supply Chain Attacks like SolarWinds.
Noting that “assuming a system can never be broken into is a failing strategy”, David continues:
In the longer term, I know of only one strong countermeasure for this kind of attack: verified reproducible builds. A “reproducible build” is a build that always produces the same outputs given the same inputs so that the build results can be verified. A verified reproducible build is a process where independent organizations produce a build from source code and verify that the built results come from the claimed source code. Almost all software today is not reproducible, but there’s work to change this.
In addition, Episode 101 of the Ubuntu Security Podcast also covered the SolarWinds hack in further detail.
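The verification step David describes can be sketched in a few lines of Python. This is only an illustration of the idea, not a tool used by the project: the builder names and artifact contents are hypothetical, and a real verifier would hash the files that each party actually built.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact."""
    return hashlib.sha256(data).hexdigest()


def verify_builds(digests_by_builder: dict) -> bool:
    """Treat a build as verified only if at least two independent
    builders reported a digest and all reported digests agree."""
    return (
        len(digests_by_builder) >= 2
        and len(set(digests_by_builder.values())) == 1
    )


# Hypothetical reports from two parties building the same source.
reports = {
    "vendor": artifact_digest(b"binary built from release 1.0"),
    "independent-rebuilder": artifact_digest(b"binary built from release 1.0"),
}

# A report in which one party's artifact differs, e.g. after tampering.
tampered = dict(reports, vendor=artifact_digest(b"binary with a backdoor"))

print(verify_builds(reports))
print(verify_builds(tampered))
```

The comparison only means something if the build is reproducible in the first place; otherwise differing digests say nothing about tampering.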
Elsewhere, the Bootstrappable Builds project was covered in depth by Jake Edge on Linux Weekly News. Jake introduced this sister project as follows:
The Bootstrappable Builds project was started as an offshoot of the Reproducible Builds project during the latter’s 2016 summit in Berlin. A bootstrappable build takes the idea of reproducibility one step further, in some sense. The build of a target binary can be reproduced alongside the build of the tools required to do so. It is, conceptually, almost like building a house from a large collection of atoms of different elements.
[…]
Building software depends on the tools used to construct the binary, including compilers and build-automation tools, many of which depend on pre-existing binaries. Minimizing the reliance on opaque binaries for building our software ecosystem is the goal of the Bootstrappable Builds project.
The full article is available on the LWN website.
Outreachy is an initiative that funds three-month remote internships in free and open source software, with a focus on supporting diversity. The Reproducible Builds project is considering joining this round, and is seeking input and ideas for good proposals.
Examples of the kind of projects we are looking for include workflow changes, large refactoring work, new features of our tools, specific reproducibility fixes and so on. Ideas should fit in that sweet spot of requiring more time and energy than a weekend project, but not be so complicated that they would take forever. For more information, please see Mattia’s announcement on our mailing list.
Software development
Debian
In recent months there has been preparatory work to enable the reproducible=+fixfilepath build flag by default; enabling this fixfilepath feature flag should fix reproducibility issues in an estimated 500-700 packages. In January, however, Guillem Jover uploaded dpkg version 1.20.6 to Debian unstable with this flag enabled. Although a bug (#979570) was subsequently filed by Lisandro Damián Nicanor Pérez Meyer with the initial intention of pausing this change due to a problem with the Qt toolkit, it was closed after extensive discussion.
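To illustrate why a build path matters, the following Python sketch simulates the effect of a path prefix mapping. To our understanding, fixfilepath works by asking the compiler to rewrite the build directory prefix in embedded paths; the function and paths below are hypothetical and only model that idea, they do not invoke dpkg or a compiler.

```python
from typing import Optional


def embedded_path(source_path: str, build_dir: str,
                  prefix_map: Optional[str] = None) -> str:
    """Return the path a compiler might embed for source_path.

    Without a prefix map the absolute path is embedded as-is. With one,
    the leading build_dir is replaced by prefix_map (hypothetical model).
    """
    if prefix_map is not None and source_path.startswith(build_dir):
        return prefix_map + source_path[len(build_dir):]
    return source_path


# The same source built in two different directories.
unmapped_a = embedded_path("/build/first/src/main.c", "/build/first")
unmapped_b = embedded_path("/build/second/src/main.c", "/build/second")

mapped_a = embedded_path("/build/first/src/main.c", "/build/first", ".")
mapped_b = embedded_path("/build/second/src/main.c", "/build/second", ".")

print(unmapped_a == unmapped_b)  # build path leaks into the output
print(mapped_a == mapped_b)      # both become ./src/main.c
```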
In recent weeks, Holger Levsen has been re-uploading a large number of Debian packages in an attempt to ensure they all have a related .buildinfo file. Holger described his rationale and approach in a blog post in December titled On doing 540 no-source-change source-only uploads in two weeks. In January, however, Holger performed 2,940 of these uploads, bringing the number of packages in Debian bullseye that lack these files down to eleven (from over 3,500). Holger wrote about his progress on our mailing list, where he also describes how he intends to eliminate the remaining packages.
Lukas Puehringer, Frédéric Pierret and Holger Levsen collaborated to upload apt-transport-in-toto version 0.1.0 into the unstable distribution, and Lukas Puehringer prepared packages for in-toto version 1.0.0 and python-securesystemslib 0.18.0 for the unstable distribution.
35 reviews of Debian packages were added, 58 were updated and 49 were removed this month, adding to our extensive knowledge about identified issues. Chris Lamb identified two issue categories, build_path_added_by_src2man_from_txt2man and nondeterminstic_todo_identifiers_in_documentation_generated_by_doxygen. Thorsten Glaser also added a new uid_and_gid_in_cmake-generated_pkzip issue type […].
Other distributions
Bernhard M. Wiedemann posted his monthly reproducible builds status report for openSUSE which mentions amongst other things that “4.10% of packages are not perfectly reproducible”.
Jelle van der Waa posted an overview of Arch Linux’s work on reproducible builds during 2020. Titled Arch Linux Reproducible Builds Progress 2020, it mentions (for example) that their rebuilderd tool has seen 13 releases since March 2020.
reprotest
reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, the following changes were made:
- Frédéric Pierret:
- Holger Levsen:
  - Upload to Debian unstable. […]
- Marek Marczykowski-Górecki:
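The build-twice-and-compare idea can be sketched as follows. This is not how reprotest is implemented, and it does not use reprotest's command-line interface: the names VARIATIONS, build_digest and is_reproducible are hypothetical, and the "builds" are tiny Python one-liners standing in for a real build command.

```python
import hashlib
import os
import subprocess
import sys

# Two deliberately different environments (assumed variations).
VARIATIONS = [
    {"TZ": "UTC", "LANG": "C"},
    {"TZ": "Pacific/Kiritimati", "LANG": "fr_FR.UTF-8"},
]


def build_digest(command, variation):
    """Run the build command in a varied environment and hash its output."""
    env = dict(os.environ, **variation)
    result = subprocess.run(command, env=env, stdout=subprocess.PIPE, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def is_reproducible(command):
    """A command is reproducible here if every variation yields the same digest."""
    digests = {build_digest(command, variation) for variation in VARIATIONS}
    return len(digests) == 1


# A "build" whose output does not depend on the environment...
stable = [sys.executable, "-c", "print('hello')"]
# ...and one that embeds the timezone name into its output.
leaky = [sys.executable, "-c", "import os; print(os.environ['TZ'])"]

print(is_reproducible(stable))
print(is_reproducible(leaky))
```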
diffoscope
diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it provides human-readable diffs from many kinds of binary format. This month, Chris Lamb made a large number of changes (including releasing version 164, version 165 and version 166):
- New features:
  - Save sys.argv in our top-level temporary directory, in case it helps debug why temporary directories might not get cleaned up. […]
  - Collapse the --acl and --xattr arguments into --extended-filesystem-attributes to cover all of these extended attributes, defaulting the new option to false (ie. to not check these expensive external calls). […][…]
- Bug fixes:
- Output improvements:
  - Show the ‘fuzziness’ amount in percentage terms, not out of the rather-arbitrary ‘400’. […]
  - Improve help text for the --exclude-directory-metadata argument. […]
  - Wrap our external call to cmp(1) with a missing profiling point. […]
  - Truncate jsondiff differences at 512 bytes, in case they consume the entire page. […]
  - Improve the logging around fuzzy matching. […]
- Codebase improvements:
  - Clarify in a comment that __del__ is not always called in Python, so temporary directories are not necessarily removed the moment they go out of scope. […]
  - Print the free space in our temporary directory when we create it, not from within diffoscope.main. […]
  - Tidy the diffoscope.comparators.utils.fuzzy module. […]
  - Add a note regarding the special ordering of test_all_tools_are_listed within that module. […]
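The point about __del__ can be illustrated with a short Python sketch. It is not taken from diffoscope: the function name and file name are hypothetical. It shows the general alternative of using a context manager, which removes a temporary directory at a known point rather than whenever (or if) the object is finalised.

```python
import os
import sys
import tempfile


def run_with_scratch_dir():
    """Create a scratch directory, record sys.argv in it, and report
    whether the directory existed inside and after the with-block."""
    with tempfile.TemporaryDirectory(prefix="scratch_") as path:
        # Recording the command line can help debug leftover directories.
        with open(os.path.join(path, "argv.txt"), "w") as handle:
            handle.write(" ".join(sys.argv))
        existed_inside = os.path.isdir(path)
    # Cleanup has happened deterministically at the end of the with-block.
    return existed_inside, os.path.exists(path)


existed_inside, exists_after = run_with_scratch_dir()
print(existed_inside, exists_after)
```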
Other changes were made by:
- Conrad Ratschan:
- Dimitrios Apostolou:
  - Introduce the --no-acl and --no-xattr arguments (later collapsed to --extended-filesystem-attributes by Chris Lamb) to improve performance. […]
  - Avoid calling the external stat command. […]
  - Avoid invoking external diff command for short outputs that are identical. […]
  - Log when the cmp command is spawned. […]
  - Improve performance of the has_same_content routine by spawning cmp less frequently. […]
  - Clean up the FIFO files when our context manager exits. […]
- Mattia Rizzolo:
  - Add missing lipo and otool external tools, and add a test to make sure they are all listed. […]
  - Fix a possible crash in the --list-debian-substvars command. […]
  - Filter the content of the debian/*.substvars files. […]
  - Ignore/hide the DeprecationWarning pertaining to the imp module deprecation as it comes from a 3rd-party library. […]
  - Add a pytest.ini to explicitly generate JUnit’s xunit2 format. […]
  - Override several Lintian warnings regarding prebuilt test binaries existing in the source tree. […]
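Several of the performance changes above share one idea: avoid spawning an external process when a cheap check can answer the question. The sketch below is not diffoscope's has_same_content routine; it is a hypothetical in-process stand-in that compares file sizes first and only then reads the contents in chunks.

```python
import os
import tempfile


def same_content(path_a, path_b, chunk_size=64 * 1024):
    """Return True if both files hold identical bytes.

    Files of different sizes cannot be identical, so that case is
    settled without reading either file.
    """
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files exhausted at the same point
                return True


with tempfile.TemporaryDirectory() as workdir:
    first = os.path.join(workdir, "first.bin")
    second = os.path.join(workdir, "second.bin")
    third = os.path.join(workdir, "third.bin")
    for path, payload in ((first, b"abc"), (second, b"abc"), (third, b"abd")):
        with open(path, "wb") as handle:
            handle.write(payload)
    identical = same_content(first, second)
    different = same_content(first, third)

print(identical, different)
```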
Other tools
strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, Chris Lamb ensured that the tool did not process unwritable files (printing a warning in this case) (#980356) and made a number of codebase improvements, including reflowing logic to make larger future changes easier. […]
disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into system calls to reliably flush out reproducibility issues. This month, Chris Lamb updated the benchmarking tools to call a tool that will call stat(2) repeatedly […] and Frédéric Pierret added an RPM spec file […] as well as the ability to prepend flags in CXXFLAGS […]. Holger Levsen uploaded these changes to Debian unstable as version 0.5.11-1. disorderfs was also featured on Hacker News.
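The kind of bug that disorderfs flushes out can be shown without a filesystem at all. In this hypothetical Python sketch, two callables stand in for directory listings that return the same entries in a different order; a build step that trusts that order produces differing output, while one that sorts first does not.

```python
def manifest_unsorted(list_entries):
    """Build a manifest in whatever order the 'filesystem' returns."""
    return "\n".join(list_entries())


def manifest_sorted(list_entries):
    """Build a manifest in a defined order, independent of the filesystem."""
    return "\n".join(sorted(list_entries()))


NAMES = ["main.o", "util.o", "api.o"]


def normal_listing():
    return list(NAMES)


def disordered_listing():
    # Same entries, reversed: a simple stand-in for directory disorder.
    return list(reversed(NAMES))


print(manifest_unsorted(normal_listing) == manifest_unsorted(disordered_listing))
print(manifest_sorted(normal_listing) == manifest_sorted(disordered_listing))
```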
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
- Arjen de Korte created a pull request for the PHP programming language to ensure that the ‘phar’ extension respects the SOURCE_DATE_EPOCH environment variable.
- Bernhard M. Wiedemann:
  - blobby (Zip file embeds creation time)
  - dateutils (fix future build failure, also filed in openSUSE)
  - hamcrest (sort list by Pedro Monreal Gonzalez)
  - keepalived (date issue)
  - libcec (embeds hostname and user)
  - libpinyin (Address space layout randomisation issue)
  - perf (filesystem ordering)
  - pythia (embeds the build date)
  - python-distributed (build fails on single-CPU system)
  - python-distributed (build fails in the future due to expired SSL certificate)
  - ruby3.0 (embeds date, process ID, etc.)
  - tiptop (embeds date and hostname)
  - vision (sort filesystem directory ordering)
- Chris Lamb:
  - #979134 filed against apertium-anaphora.
  - #980295 filed against davs2.
- Vagrant Cascadian:
  - #950419 filed against m4 (includes date in generated documentation and .info files)
  - #951031 filed against libtool (remove dates from .info and .html documentation).
  - #968627 filed against libjpeg-turbo (use UTC timestamp as the build date)
  - #976307 filed against sudo (different binaries when built on /usr-merged environment).
  - #978499 filed against fop (support using SOURCE_DATE_EPOCH for timestamps in PDF files).
  - #979019, #979021, #979023 and #979024 filed against lirc.
  - #979112 filed against qdbm.
  - #979125 filed against gfxboot (uploaded to Debian)
  - #979593 filed against rox.
  - gfxboot (#49): make the example themes reproducible.
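Many of the patches above address the same few causes: embedded build dates, filesystem ordering and timestamps inside archives. As a rough illustration, and not a copy of any patch listed, the following Python sketch reads SOURCE_DATE_EPOCH when it is set and writes a Zip archive with sorted member names and a fixed timestamp. The function names and file contents are hypothetical.

```python
import io
import os
import time
import zipfile


def build_timestamp():
    """Prefer SOURCE_DATE_EPOCH over the current time when it is set."""
    return int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))


def deterministic_zip(files, timestamp):
    """Return Zip bytes with sorted members and one fixed UTC timestamp."""
    # The Zip format cannot represent dates before 1980, so clamp.
    date_time = max(time.gmtime(timestamp)[:6], (1980, 1, 1, 0, 0, 0))
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        for name in sorted(files):  # do not rely on insertion order
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.external_attr = 0o644 << 16  # fixed permissions
            archive.writestr(info, files[name])
    return buffer.getvalue()


os.environ["SOURCE_DATE_EPOCH"] = "1609459200"  # example value only
stamp = build_timestamp()

first = deterministic_zip({"b.txt": b"2", "a.txt": b"1"}, stamp)
second = deterministic_zip({"a.txt": b"1", "b.txt": b"2"}, stamp)
print(first == second)
```

Using UTC via time.gmtime avoids the local timezone leaking into the archive, which is the same concern behind the libjpeg-turbo entry above.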
Testing framework
The Reproducible Builds project operates a large Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, the following changes were made:
- Hans-Christoph Steiner (F-Droid):
- Holger Levsen:
  - Debian:
    - Sync copy of debrebuild after a new merge request. […]
    - In the rebuilder tool, enable debugging […], call debrebuild with --timestamp=metasnap […] and deal with .buildinfo files being created in current dir […].
    - Add an additional maintainer address for the Debian Multimedia Maintainers team […] and update the maintainer address for OpenStack packages […].
    - Don’t attempt to install diffoscope from sid on stretch. […]
  - Detect diskspace issues on the main Jenkins node. […]
  - In Arch Linux testing, drop support for .tar.xz as pkg.tar.zst has taken over. […]
  - Create a preliminary README.txt for buildinfos.debian.net. […]
  - Update hard-coded instance of 2020 - happy new year! […]
- Johannes Schauer:
  - Update the README.txt for the reproducible_pool_buildinfos.sh script. […]
- Mattia Rizzolo:
  - Use a lockfile to ensure builders do not start when not required. […][…]
  - When powercycling arm64 nodes, use the unprivileged user instead of what is locally configured as ‘root’. […]
  - Remove the “Static Analysis Utilities” Jenkins plugin. […]
  - Update the deployment tool to set the correct HOME environment variable when running Git to avoid printing warnings. […]
  - Perform a large number of Ubuntu-related configuration changes. […][…][…][…][…][…][…]
  - Update various host lists. […][…]
  - (Re-)add some handy shortcuts. […]
Lastly, build node maintenance was performed by Holger Levsen […][…], Mattia Rizzolo […][…][…][…][…] and Vagrant Cascadian […][…][…].
Community news
Chris Lamb updated the main Reproducible Builds website and documentation, including adding a missing image […], and updated a script to ignore commits that start with, for example, ‘2020 12’ when generating commit listings […].
On our mailing list this month:
- Fredrik Strömberg posted to the list to mention that he has been working for many years on establishing trust between end-users and service operators and infrastructure. In particular, he has been working on a new security architecture called System Transparency. Fredrik has previously written two introductory blog posts, System Transparency is the future and Open-source firmware is the future, which outline his thinking in more detail, and he intends that System Transparency “enters production use sometime this year”.
- Felix C. Stegerman started a discussion around deterministic Python compiled bytecode files to result in reproducible Android packages, which attracted a number of replies. Michael Biebl also got in touch to ask for help on making the systemd package build reproducibly on the various ARM architectures in Debian. […]
- We first wrote about the Threema messaging application back in September 2020. This month, the Threema developers continued a discussion on our mailing list, particularly around applications on various application stores.
- Lastly, David Wheeler asked how could we accelerate [the] deployment of verified reproducible builds?, which was related to another thread continued from December, Attack on SolarWinds could have been countered by reproducible builds, regarding how to leverage media coverage to get ‘buy in’ from upstream developers.
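As background to the Python bytecode thread, one well-known source of variation in compiled .pyc files is the source file's modification time, which the default timestamp-based format stores in the file header. The sketch below is not drawn from the mailing list discussion: it only uses the standard py_compile module (Python 3.7 or later) to show that a hash-based header does not change with the modification time. The file name and timestamps are arbitrary examples.

```python
import os
import py_compile
import tempfile


def pyc_header(source_path, mode, mtime):
    """Set the source mtime, compile it, and return the 16-byte pyc header."""
    os.utime(source_path, (mtime, mtime))
    cfile = source_path + "c"
    py_compile.compile(source_path, cfile=cfile, doraise=True,
                       invalidation_mode=mode)
    with open(cfile, "rb") as handle:
        return handle.read(16)


MTIMES = (1_000_000_000, 1_600_000_000)  # two arbitrary modification times

with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "example.py")
    with open(source, "w") as handle:
        handle.write("ANSWER = 42\n")

    timestamp_mode = py_compile.PycInvalidationMode.TIMESTAMP
    hash_mode = py_compile.PycInvalidationMode.UNCHECKED_HASH

    timestamp_headers = {pyc_header(source, timestamp_mode, m) for m in MTIMES}
    hash_headers = {pyc_header(source, hash_mode, m) for m in MTIMES}

print(len(timestamp_headers))  # header varies with the mtime
print(len(hash_headers))       # header depends only on the source content
```

This addresses only the header; whether the rest of a bytecode file is stable across machines and Python versions is a separate question that this sketch does not examine.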
Contact
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Alternatively, you can get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter/Mastodon: @ReproBuilds / @reproducible_builds@fosstodon.org
- Reddit: /r/ReproducibleBuilds
- Mailing list: rb-general@lists.reproducible-builds.org