Welcome to the March 2020 report from the Reproducible Builds project. In our reports we outline the most important things that we have been up to over the past month and some plans for the future.
What are reproducible builds?
One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security.
However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes.
News
The report from our recent summit in Marrakesh was published and is now available in both PDF and HTML formats. A sincere thank you to all of the Reproducible Builds community for the input to the event, and a sincere thank you to Aspiration for preparing and collating this report.
Harmut Schorrig published a detailed document on how to compile Java applications in such a way that the .jar build artefact is reproducible across builds. A practical and hands-on guide, it details how to avoid unnecessary differences between builds by explicitly declaring an encoding (as the default value differs across Linux and MS Windows systems), ensuring that the generated .jar (a variant of a .zip archive) does not embed any nondeterministic filesystem metadata, and so on.
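The same two ideas apply outside of Java. The following is a minimal Python sketch, not taken from the guide itself, of writing a .zip-style archive whose bytes depend only on the entry names and contents: entries are sorted, timestamps and permissions are pinned, and text is encoded with an explicitly named encoding. The function name `build_deterministic_zip` and the fixed timestamp are assumptions made for illustration.

```python
import io
import zipfile
from typing import Mapping

# Assumed fixed timestamp for every entry (1980-01-01 is the earliest a zip can store).
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def build_deterministic_zip(files: Mapping[str, bytes]) -> bytes:
    """Return archive bytes that depend only on the names and contents in `files`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort names so entry order does not depend on dict or filesystem order.
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.create_system = 3              # pin the "created on Unix" marker
            info.external_attr = 0o644 << 16    # pin permissions instead of copying them
            info.compress_type = zipfile.ZIP_STORED  # no compressor-dependent output
            archive.writestr(info, files[name])
    return buffer.getvalue()


# Encode text explicitly rather than relying on the platform default encoding.
first = build_deterministic_zip({"b.txt": "zwei".encode("utf-8"), "a.txt": b"one"})
second = build_deterministic_zip({"a.txt": b"one", "b.txt": "zwei".encode("utf-8")})
print(first == second)
```

Storing entries uncompressed is a deliberate simplification here; a real build would more likely keep compression and pin the tool versions instead.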
Janneke gave a quick presentation on GNU Mes and reproducible builds during the lightning talk session at LibrePlanet 2020. […]
Vagrant Cascadian presented There and Back Again, Reproducibly! (video) at SCaLE 18x in Pasadena, California, which generated some attention on Twitter.
Hervé Boutemy mentioned on our mailing list in a thread titled Rebuilding and checking Reproducible Builds from Maven Central repository that since the update of a central build script (the “parent POM”) every Apache project using the Maven build system should build reproducibly. A follow-up discussion regarding how to perform such rebuilds was also started on the Apache mailing list.
The Telegram instant-messaging platform announced that they had updated their iOS and Android OS applications and claim that they are reproducible according to their full instructions, verifying that the original source code is exactly the same code that is used to build the versions available on the Apple App Store and Google Play distribution platforms respectively.
Hervé Boutemy also reported on a new project called reproducible-central, which aims to allow anyone to rebuild a component from the Maven Central Repository that is expected to be reproducible and check that the result is as expected.
In last month’s report we detailed Omar Navarro Leija’s work in and around an academic paper titled Reproducible Containers, which describes in detail the workings of a user-space container tool called dettrace (PDF). Since then, the PhD student from the University of Pennsylvania presented on this tool at the ASPLOS 2020 conference in Lausanne, Switzerland. Furthermore, there were contributions to dettrace from the Reproducible Builds community itself. […][…]
Distribution work
openSUSE
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update and made the following changes within the distribution itself:
- avfs (report build problem in %post script)
- arj (fix incorrect use of strcpy, submitted upstream)
- brickv (update to get upstream fix)
- fvwm-themes (delta between architectures in noarch package)
- libpeas (report build failure in single-CPU mode)
- pmix (update to incorporate upstream fix)
- pw3270 (date variation, forwarded upstream)
- python-mailmanclient (report build failure in single-CPU mode)
- ripgrep (CPU, forwarded upstream)
- tensorflow2 (avoid random temporary directory path)
- tesseract-ocr (drop “native” architecture optimisations)
- vlc (fixed “ghost” file size and sort archive, already upstream)
Debian
Chris Lamb further refined his merge request for the debian-installer component to allow all arguments from sources.list files (such as “[check-valid-until=no]”) in order that we can test the reproducibility of the installer images on the Reproducible Builds project’s own testing infrastructure. (#13)
Holger Levsen filed a number of bug reports against the debrebuild tool that attempts to rebuild a Debian package given a .buildinfo file as input, including:
- Accepting signed .buildinfo files. (#955050)
- Two sbuild-related bugs. (#955123 & #955304)
- Specific adjustments to the APT configuration. (#955307, #955298 & #955280)
- Requests to improve the documentation in various ways. (#955049 & #955308)
48 reviews of Debian packages were added, 17 were updated and 34 were removed this month, adding to our knowledge about identified issues. Many issue types were noticed, categorised and updated by Chris Lamb, including:
- nondeterministic_gtk_icon_cache […]
- nondeterministic_ordering_in_documentation_generated_by_doxygen […]
- nondeterministic_vo_files_generated_by_coq […]
- randomness_in_browserify_lite_output […]
Finally, Holger opened a bug report against the software running tracker.debian.org (a service for Debian Developers to follow the evolution of packages via web and email interfaces) to request that they integrate information from buildinfos.debian.net (#955434), and Chris Lamb kept isdebianreproducibleyet.com up to date. […]
Software development
diffoscope
Chris Lamb made the following changes to diffoscope, the Reproducible Builds project’s in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including preparing and uploading version 138 to Debian:
- Improvements:
  - Don’t allow errors with “R” script deserialisation to cause the entire operation to fail, for example if an external library cannot be loaded. (#91)
  - Experiment with memoising output from expensive external commands, e.g. readelf. (#93)
  - Use dumppdf from the python3-pdfminer package if we do not see any other differences from pdftext, etc. (#92)
  - Prevent a traceback when comparing two R .rdx files directly, as the get_member method will return a file even if the file is missing. […]
- Reporting:
  - Display the supported file formats in the package long description. (#90)
  - Print a potentially-helpful message if the PyPDF2 module is not installed. […]
  - Remove any duplicate comparator descriptions when formatting in the --help output or in the package long description. […]
  - Weaken “Install the X package to get a better output” message to “… may produce a better output” as the former is not actually guaranteed. […]
- Misc:
  - Ensure we only parse the recommended packages from --list-debian-substvars when we want them for debian/tests/control generation. […]
  - Add upstream metadata file […] and add a Lintian override for upstream-metadata-in-native-source as “we” are upstream. […]
  - Inline the RequiredToolNotFound.get_package method’s functionality as it is only used once. […]
  - Drop the deprecated “py36 = [..]” argument in the pyproject.toml file. […]
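The memoisation experiment mentioned under the improvements can be pictured with a small sketch. This is not diffoscope’s actual implementation; it only shows the general idea of caching the output of an external command so that identical invocations run once. The `run_cached` helper is hypothetical, and the Python interpreter stands in for an expensive tool such as readelf.

```python
import functools
import subprocess
import sys


@functools.lru_cache(maxsize=128)
def run_cached(command: tuple) -> str:
    """Run `command` once and reuse its stdout for identical later calls."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical stand-in for an expensive external tool.
command = (sys.executable, "-c", "print('expensive output')")
first = run_cached(command)   # spawns the process
second = run_cached(command)  # served from the cache, no new process
print(run_cached.cache_info().hits)
```

The cache key here is only the argument tuple, so this sketch would return stale output if an input file changed between calls; a real cache would also need to account for the file’s contents.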
In addition, Vagrant Cascadian updated diffoscope in GNU Guix to version 138 […], as well as updating reprotest (our end-user tool that builds the same source code twice in widely differing environments and then checks the binaries produced by each build for any differences) to version 0.7.14 […].
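The final comparison step of such a “build twice, then compare” workflow can be sketched as follows. This is a simplified illustration, not how reprotest itself is written: it does not vary the build environment and only compares the file hashes of two output directories. The helper names `hash_tree` and `differing_files` are made up for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_tree(root: Path) -> dict:
    """Map each file's relative path to the SHA-256 digest of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def differing_files(first: Path, second: Path) -> list:
    """Return the sorted relative paths that are missing or differ between two trees."""
    a, b = hash_tree(first), hash_tree(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Simulate two build outputs in which only "app.bin" embeds a build time.
with tempfile.TemporaryDirectory() as tmp:
    build_a, build_b = Path(tmp, "a"), Path(tmp, "b")
    for build, stamp in ((build_a, b"12:00"), (build_b, b"12:01")):
        build.mkdir()
        (build / "app.bin").write_bytes(b"code built at " + stamp)
        (build / "README").write_bytes(b"same in both builds")
    result = differing_files(build_a, build_b)

print(result)  # ['app.bin']
```

An empty result would mean the two builds are bit-for-bit identical; any listed file is a candidate for closer inspection with a tool such as diffoscope.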
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month we wrote a large number of such patches, including:
- Bernhard M. Wiedemann (via openSUSE):
  - arj (date variation)
  - gnulib (date variation)
  - gnulib (date variation)
  - lasso (sort filesystem ordering)
  - mono/at-spi-sharp (report nondeterminism from filesystem nondeterminism)
  - python-M2Crypto (report security certs expiring in 2029)
  - python-swifter (report single-CPU build failure)
  - QT uic (report ASLR nondeterminism)
  - tdiff (report single-CPU build failure)
  - tensorflow (report ASLR-induced variation)
  - volk (drop compile-time CPU detection)
- Chris Lamb:
  - #952990 filed against pmemkv (forwarded upstream).
  - #953071 filed against ndisc6.
  - #953117 filed against infernal.
  - #953263 filed against beep.
  - #953646 filed against node-nodedbi.
  - #954409 filed against node-browserify-lite.
  - #955009 filed against font-manager.
  - #955287 filed against pdb2pqr.
  - #955341 filed against gucharmap.
  - #955364 filed against cloudkitty.
  - isbg (report a non-deterministic documentation issue)
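Several of the patches listed above address “date variation”, where the current time leaks into build output. The report does not say how each patch was written, but one common remedy is to honour the `SOURCE_DATE_EPOCH` environment variable defined by the Reproducible Builds project. The following is a minimal sketch under that assumption; the `build_timestamp` helper is hypothetical.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> datetime:
    """Return the UTC timestamp to embed in build output.

    Prefers SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set,
    and only falls back to the current time otherwise.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    seconds = int(epoch) if epoch is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# With the variable pinned, every build embeds the same date.
stamp = build_timestamp({"SOURCE_DATE_EPOCH": "1585699200"})
print(stamp.isoformat())  # 2020-04-01T00:00:00+00:00
```

Formatting the result in UTC matters as well, since a local timezone would reintroduce variation between build machines.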
Project documentation
There was further work performed on our documentation and website this month, including Alex Wilson adding a section regarding using Gradle for reproducible builds in JVM projects […] and Holger Levsen adding the report from our recent summit in Marrakesh […][…].
In addition, Chris Lamb made a number of changes, including correcting the syntax of some CSS class formatting […], improving some “filed against” copy […] and correcting a reference to the calendar.monthrange Python function in a utility function. […]
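For readers unfamiliar with it, `calendar.monthrange` from the Python standard library returns the weekday of the first day of a month (Monday is 0) together with the number of days in that month. How the website’s utility function uses it is not described in this report; the snippet below only shows the call itself.

```python
import calendar

# March 2020 started on a Sunday (weekday 6) and had 31 days.
first_weekday, days_in_month = calendar.monthrange(2020, 3)
print(first_weekday, days_in_month)  # 6 31

# Leap years are handled: February 2020 had 29 days.
february_days = calendar.monthrange(2020, 2)[1]
print(february_days)  # 29
```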
Testing framework
We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org and that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced.
This month, Chris Lamb reworked the web-based package rescheduling tool to:
- Require an HTTP POST method in the web-based scheduler, as not only should HTTP GET requests be idempotent but this will allow many future improvements in the user interface. […][…][…]
- Improve the authentication error message in said rescheduler to suggest that the developer’s SSL certificate may have expired. […]
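The reasoning behind the POST requirement can be shown with a small sketch. It is not the code of the actual rescheduling tool; `handle_reschedule` and the in-memory queue are hypothetical and only illustrate rejecting state-changing requests that arrive via GET.

```python
from http import HTTPStatus


def handle_reschedule(method: str, package: str, queue: list) -> HTTPStatus:
    """Queue `package` for a rebuild, but only in response to a POST request."""
    if method.upper() != "POST":
        # GET should not change server state, so refuse rather than schedule.
        return HTTPStatus.METHOD_NOT_ALLOWED
    queue.append(package)
    return HTTPStatus.ACCEPTED


queue = []
get_status = handle_reschedule("GET", "hello", queue)    # rejected, queue untouched
post_status = handle_reschedule("POST", "hello", queue)  # accepted and queued
print(get_status.value, post_status.value, queue)
```

Refusing GET also means that link prefetchers and crawlers following a URL cannot accidentally trigger a rebuild.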
In addition, Holger Levsen made the following changes:
- Add a new ath97 subtarget for the OpenWrt distribution.
- Revisit ordering of Debian suites; sort the experimental distribution last and reverse the ordering of suites to prioritise the suites in development. […][…][…]
- Schedule Debian buster and bullseye a little less in order to allow unstable to catch up on the i386 architecture. […][…]
- Various cosmetic changes to the web-based scheduler. […][…][…][…]
- Improve wordings in the node health maintenance output. […]
Lastly, Vagrant Cascadian updated a link to the (formerly) weekly news to our reports page […] and kpcyrd fixed the escaping in an Alpine Linux inline patch […]. The usual build nodes maintenance was performed by Holger Levsen […][…], Mattia Rizzolo […] and Vagrant Cascadian […][…].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter: @ReproBuilds
- Reddit: /r/ReproducibleBuilds
- Mailing list: rb-general@lists.reproducible-builds.org
This month’s report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.