Welcome to the April 2020 report from the Reproducible Builds project. In our regular reports we outline the most important things that we and the rest of the community have been up to over the past month.
What are reproducible builds? One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. But whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes.
News
It was discovered that more than 725 malicious packages were downloaded thousands of times from RubyGems, the official channel for distributing code for the Ruby programming language. Attackers used a variation of “typosquatting”, swapping hyphens and underscores in package names (for example, uploading a malevolent atlas-client in place of atlas_client), to publish packages that executed a script that intercepted Bitcoin payments. (Ars Technica report)
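To illustrate why hyphen/underscore swaps are easy to miss, here is a minimal Python sketch that groups package names which only differ in their separators. The function names and the normalisation rule are assumptions made for this example; this is not how RubyGems itself detects such packages.

```python
import re
from collections import defaultdict


def normalise(name):
    """Collapse runs of '-', '_' and '.' into a single '-' and lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_confusable(names):
    """Return groups of distinct names that normalise to the same string."""
    groups = defaultdict(set)
    for name in names:
        groups[normalise(name)].add(name)
    # Only groups with more than one spelling are potential typosquats.
    return {key: sorted(vals) for key, vals in groups.items() if len(vals) > 1}


if __name__ == "__main__":
    print(find_confusable(["atlas_client", "atlas-client", "rails"]))
    # prints {'atlas-client': ['atlas-client', 'atlas_client']}
```

A check like this only flags candidates; deciding which of the colliding names is the legitimate one still needs human review.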
Bernhard M. Wiedemann launched ismypackagereproducibleyet.org, a service that takes a package name as input and displays whether the package is reproducible in a number of distributions. For example, it can quickly show the status of Perl as being reproducible on openSUSE but not in Debian. Bernhard also improved the documentation of his “unreproducible package” to add some example patches for hash issues. […].
There was a post on Chaos Computer Club’s website listing Ten requirements for the evaluation of “Contact Tracing” apps in relation to the SARS-CoV-2 epidemic. In particular:
4. Transparency and verifiability: The complete source code for the app and infrastructure must be freely available without access restrictions to allow audits by all interested parties. Reproducible build techniques must be used to ensure that users can verify that the app they download has been built from the audited source code.
Elsewhere, Nicolas Boulenguez wrote a patch for the Ada programming language component of the GCC compiler to skip -f.*-prefix-map options when writing Ada Library Information files. Amongst other properties, these .ali files embed the compiler flags used at the time of the build, which results in the absolute build path being recorded via -ffile-prefix-map, -fdebug-prefix-map, etc.
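The idea of dropping these options before recording the flags can be sketched in a few lines of Python. This is only an illustration that reuses the -f.*-prefix-map pattern mentioned above; the function name is hypothetical and the actual patch lives inside GCC's Ada front end, not in Python.

```python
import re

# Pattern taken from the description above; matched at the start of each flag.
PREFIX_MAP = re.compile(r"-f.*-prefix-map")


def recordable_flags(flags):
    """Return the flags that are safe to record, skipping any *-prefix-map option."""
    return [flag for flag in flags if not PREFIX_MAP.match(flag)]


if __name__ == "__main__":
    flags = ["-O2", "-ffile-prefix-map=/build/pkg-1.0=.", "-g"]
    print(recordable_flags(flags))
    # prints ['-O2', '-g']
```

Because the prefix-map options themselves contain the absolute build path, leaving them out of the recorded flags keeps that path out of the resulting file.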
In the Arch Linux project, kpcyrd reported that they held their first “rebuilder workshop”. The session was held on IRC and participants were provided with a document containing instructions on how to install and use Arch’s repro tool. The meeting resulted in multiple people with no prior experience of Reproducible Builds validating their first package. Later in the month kpcyrd also announced that it was now possible to run independent rebuilders under Arch in a “hands-off, everything just works™” solution to distributed package verification.
Mathias Lang submitted a pull request against dmd, the canonical compiler for the ‘D’ programming language, to add support for our SOURCE_DATE_EPOCH environment variable as well as the other C preprocessor tokens such as __DATE__, __TIME__ and __TIMESTAMP__, which was subsequently merged. SOURCE_DATE_EPOCH defines a distribution-agnostic standard for build toolchains to consume and emit timestamps in situations where they are deemed to be necessary. […]
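As a rough sketch of how a build tool might consume SOURCE_DATE_EPOCH, the following Python function prefers the variable when it is set and only falls back to the current time otherwise. The function name is an assumption for this example and it is not taken from the dmd pull request.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the timestamp to embed in build output, as an aware UTC datetime."""
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        # Reproducible case: the caller pinned the time as seconds since the Unix epoch.
        epoch = int(value)
    else:
        # Fallback: the current time, which makes the output unreproducible.
        epoch = int(time.time())
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


if __name__ == "__main__":
    print(build_timestamp({"SOURCE_DATE_EPOCH": "0"}).isoformat())
    # prints 1970-01-01T00:00:00+00:00
```

Formatting the result in UTC rather than local time avoids a second source of variation between build machines.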
The Telegram instant-messaging platform announced that they had updated to version 5.1.1, continuing their claim that their apps are reproducible when built according to their full instructions, and that users can therefore verify that the original source code is exactly the same code that is used to build the versions available on the Apple App Store and Google Play distribution platforms respectively.
Lastly, Hervé Boutemy reported that 97% of the current development versions of various Maven packages appear to have a reproducible build. […]
Distribution work
In Debian this month, 89 reviews of Debian packages were added, 21 were updated and 33 were removed, adding to our knowledge about identified issues. Many issue types were noticed, categorised and updated by Chris Lamb, including:
captures_build_path_in_hd5_database_files
cargo_installs_crates2_json
nondeterministic_devhelp_documentation_generated_by_gtk_doc
ros_dynamic_reconfigure_captures_build_path
In addition, Holger Levsen filed a feature request against debrebuild, a tool for rebuilding a Debian package given a .buildinfo file, proposing to add --standalone or --one-shot-mode functionality.
In openSUSE, Bernhard M. Wiedemann made the following changes:
- blender (sort C readdir call, rejected upstream)
- guile/guix (parallelism race condition)
- mingw32-filesystem/mingw32-binutils (sort readdir, filesystem, toolchain)
- mingw64-filesystem/mingw64-binutils (sort readdir, filesystem, toolchain)
- musescore (non-deterministic .zip files)
- OBS (FTBFS in rebuild)
- perl-Image-Sane (report hung build on a single core VM)
- ruby2.7 (date, already upstream)
- vtk (drop unreproducible .pyc file)
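Several of the fixes above boil down to sorting the results of readdir, since the order in which a filesystem returns directory entries is not guaranteed. The same principle can be sketched in Python, where os.listdir and os.walk also return entries in arbitrary order. The function name is hypothetical and this is not code from any of the packages listed.

```python
import os


def walk_sorted(root):
    """Yield file paths relative to root in an order independent of the filesystem."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes os.walk descend into subdirectories in sorted order.
        dirnames.sort()
        for name in sorted(filenames):
            yield os.path.relpath(os.path.join(dirpath, name), root)
```

Feeding a list produced this way into an archiver or code generator removes one common source of differences between otherwise identical builds.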
In Arch Linux, a rebuilder instance has been set up at reproducible.archlinux.org that is rebuilding Arch’s [core] repository directly. The first rebuild has led to approximately 90% of packages being reproducible, in contrast with 94% on the Reproducible Builds project’s own Arch Linux status page on tests.reproducible-builds.org, which continuously builds packages and does not verify Arch Linux packages. More information may be found on the corresponding wiki page and the underlying decisions were explained on our mailing list.
Software development
diffoscope
Chris Lamb made the following changes to diffoscope, the Reproducible Builds project’s in-depth and content-aware diff utility that can locate and diagnose reproducibility issues (including preparing and uploading versions 139, 140, 141, 142 and 143 to Debian which were subsequently uploaded to the backports repository):
- Comparison improvements:
  - Dalvik .dex files can also serve as APK containers so restrict the narrower identification of .dex files to files ending with this extension and widen the identification of APK files to when file(1) discovers a Dalvik file. (#28)
  - Add support for Hierarchical Data Format (HD5) files. (#95)
  - Add support for .p7c and .p7b certificates. (#94)
  - Strip paths from the output of zipinfo(1) warnings. (#97)
  - Don’t uselessly include the JSON “similarity” percentage if it is “0.0%”. […]
  - Render multi-line difference comments in a way to show indentation. (#101)
- Testsuite improvements:
  - Add pdftotext as a requirement to run the PDF test_metadata test. (#99)
  - apktool 2.5.0 changed the handling of output of XML schemas so update and restrict the corresponding test to match. (#96)
  - Explicitly list python3-h5py in debian/tests/control.in to ensure that we have this module installed during a test run to generate the fixtures in these tests. […]
  - Correct parsing of ./setup.py test --pytest-args arguments. […]
- Misc:
Michael Osipov created a well-researched merge request to return diffoscope to using zipinfo directly instead of piping input via /dev/stdin in order to ensure portability to the BSD operating system […]. In addition, Ben Hutchings documented how --exclude arguments are matched against filenames […] and Jelle van der Waa updated the LLVM test fixture difference for LLVM version 10 […] as well as adding a reference to the name of the h5dump tool in Arch Linux […].
Lastly, Mattia Rizzolo also fixed an incorrect build dependency […] and Vagrant Cascadian enabled diffoscope to locate the openssl and h5dump packages on GNU Guix […][…], and updated diffoscope in GNU Guix to version 141 […] and 143 […].
strip-nondeterminism
strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. In April, Chris Lamb made the following changes:
- Add deprecation plans to all handlers documenting how — or if — they could be disabled and eventually removed, etc. (#3)
- Normalise *.sym files as Java archives. (#15)
- Add support for custom .zip filename filtering and exclude two patterns of files generated by Maven projects in “fork” mode. (#13)
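To give a feel for what normalising an archive after a build involves, here is a small Python sketch that rewrites a .zip file with its members in sorted order and a fixed timestamp. It only illustrates the general technique: strip-nondeterminism itself is a separate tool with its own handlers, and the function name and the chosen fixed date are assumptions for this example.

```python
import zipfile

# The earliest timestamp the zip format can represent; any fixed value would do.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def normalise_zip(src, dst):
    """Copy src to dst with members sorted by name and timestamps pinned."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for name in sorted(zin.namelist()):
            # A fresh ZipInfo discards the original mtime and other per-member metadata.
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            zout.writestr(info, zin.read(name))
```

Two archives with the same contents but different member order or modification times should come out byte-for-byte identical after this step when processed on the same system.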
disorderfs
disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues.
This month, Chris Lamb fixed a long-standing issue by not dropping UNIX groups in FUSE multi-user mode when we are not root (#1) and uploaded version 0.5.9-1 to Debian unstable. Vagrant Cascadian subsequently refreshed disorderfs in GNU Guix to version 0.5.9 […].
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
- Bernhard M. Wiedemann:
  - elixir (parallelism)
  - gnutls (build failure)
  - moonjit/bcc (compile-time CPU-detection)
  - openstack (backport of patch to drop unreproducible sphinx .pickle files)
  - x3270 (merged, update date patch)
- Chris Lamb:
  - #958301 filed against dh-cargo.
  - #956549 filed against gmap.
  - #956591 filed against gpick.
  - #956477 filed against herbstluftwm.
  - #956304 filed against libcamera.
  - #956589 filed against libctl.
  - #956408 filed against minetest-mod-xdecor.
  - #955783 filed against netgen-lvs.
  - #958110 filed against nickle.
  - #958381 filed against nmrpflash.
  - #958382 filed against node-mqtt.
  - #956473 filed against sprai.
  - #955501 filed against yaz.
  - #956583 filed against xxhash.
In addition, Bernhard informed the following projects that their packages are not reproducible:
- acoular (report unknown non-determinism)
- cri-o (report a date issue)
- gnutls (report certtool being unable to extend certificates beyond 2049)
- gnutls (report copyright year variation)
- libxslt (report a bug about non-deterministic output from data corruption)
- python-astropy (report a future build failure in 2021)
Project documentation
This month, Chris Lamb made a large number of changes to our website and documentation in the following categories:
- Community engagement improvements:
  - Update instructions to register for Salsa on our Contribute page now that the signup process has been overhauled. […]
  - Make it clearer that joining the rb-general mailing list is probably a first step for contributors to take. […]
  - Make our full contact information easier to find in the footer (#19) and improve text layout using bullets to separate sections […].
- Accessibility:
- General improvements:
  - Add a new Academic publications page. (#22)
  - Add Trezor to our list of affiliated projects. (#26)
  - Add the JVM page to the documentation index (#17) and tidy the page itself a little […].
  - Add a GNU Libtool pointer to the Archive metadata documentation page. […]
- Internals:
  - Move to using jekyll-redirect-from over manual redirect pages […][…] and add a redirect from /docs/buildinfo/ to /docs/recording/. (#23)
  - Limit the website self-check to not scan generated files […] and remove the “old layout” checker now that I have migrated all of them […].
  - Move the news archive under the /news/ namespace […] and improve formatting of archived news links […].
  - Various improvements to the draft template generation. […][…][…][…]
In addition, Holger Levsen clarified exactly which month we ceased to do weekly reports […] and Mattia Rizzolo adjusted the title style of an event page […].
Marcus Hoffman also started a discussion on our website’s issue tracker asking for clarification on embedded signatures and Chris Lamb subsequently replied and asked Marcus to go ahead and propose a concrete change.
Testing framework
We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org which, amongst many other tasks, tracks the status of our reproducibility efforts and identifies any regressions that have been introduced.
- Chris Lamb:
  - Print the build environment prior to executing a build. […]
  - Drop a misleading disorderfs-debug prefix in log output when we change non-disorderfs things in the file and, as it happens, do not run disorderfs at all. […]
  - The CSS for the package report pages added a margin to all <a> HTML elements under <li> ones, which was causing a comma/bullet spacing issue. […]
  - Tidy the copy in the project links sidebar. […]
- Holger Levsen:
  - General:
    - Install jekyll-redirect-from as it is now needed by the reproducible-builds.org website. […]
    - Improve/correct log parsing rules. […][…]
  - Debian:
    - Reduce scheduling frequency of the buster distribution on the arm64 architecture, etc. […][…]
    - Show builds per day on a per-architecture basis for the last year on the Debian dashboard. […]
    - Drop the Subgraph OS package set as development halted in 2017 or 2018. […]
    - Update debrebuild to the version from the latest version of devscripts. […][…]
    - Add or improve various parts of the documentation. […][…][…]
  - Work on a Debian rebuilder:
    - Integrate sbuild. […][…][…][…][…]
    - Select a random .buildinfo file and attempt to build and compare the result. […][…][…][…]
    - Improve output and related output formatting. […][…][…][…][…]
    - Outline next steps for the development of the tool. […][…][…]
    - Various refactoring and code improvements. […][…][…]
Lastly, Mattia Rizzolo fixed some log parsing code regarding potentially-harmless warnings from package installation […][…] and the usual build node maintenance was performed by Holger Levsen […][…][…] and Mattia Rizzolo […][…][…].
Misc news
On our mailing list this month, Santiago Torres asked whether we were still publishing releases of our tools to our website and Chris Lamb replied that this was not the case and fixed the issue. Later in the month Santiago also reported that the signature for the disorderfs package did not pass its GPG verification, which was also fixed by Chris Lamb.
Hans-Christoph Steiner of the Guardian Project asked whether there would be interest in making our website translatable which resulted in a WIP merge request being filed against the website and a discussion on how to track translation updates.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Alternatively, you can get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter: @ReproBuilds • @reproducible_builds@fosstodon.org
- Reddit: /r/ReproducibleBuilds
- Mailing list: rb-general@lists.reproducible-builds.org
This month’s report was written by Bernhard M. Wiedemann, Chris Lamb, Daniel Shahaf, Holger Levsen, Jelle van der Waa, kpcyrd, Mattia Rizzolo and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.