Welcome to the January 2020 report from the Reproducible Builds project. In our reports we outline the most important things that we have been up to. In this month’s report, we cover:
- Upstream news & event coverage — Reproducing the Telegram messenger, etc.
- Software development — Updates and improvements to our tooling
- Distribution work — More work in Debian, openSUSE & friends
- Misc news — From our mailing list & how to get in touch etc.
What are reproducible builds?
Whilst anyone can inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised.
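As a minimal sketch of how several parties might reach such a consensus, the snippet below hashes independently built artifacts and checks whether the digests agree. The function names and the in-memory byte strings are hypothetical stand-ins for real build outputs, not part of any Reproducible Builds tool.

```python
import hashlib
from collections import Counter


def digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact's bytes."""
    return hashlib.sha256(artifact).hexdigest()


def is_consensus(artifacts: list[bytes]) -> bool:
    """True if every independent build produced bit-for-bit identical output."""
    return len({digest(a) for a in artifacts}) == 1


def outliers(artifacts: dict[str, bytes]) -> list[str]:
    """Name the builders whose output differs from the most common digest."""
    digests = {name: digest(data) for name, data in artifacts.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority)
```

If one builder out of several reports a different digest, that builder (or its environment) is the first place to look, either for a compromise or for an ordinary source of nondeterminism.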
If you are interested in contributing, please visit the Contribute page on our website.
Upstream news & event coverage
The Telegram messaging application has documented full instructions for verifying that its original source code is exactly the same code that is used to build the versions available on the Apple App Store and Google Play.
Reproducible builds were mentioned in a panel on Software Distribution with Sam Hartman, Richard Fontana, & Eben Moglen at the Software Freedom Law Center’s 15th Anniversary Fall Conference (at ~35m21s).
Vagrant Cascadian will present a talk at SCALE 18x in Pasadena, California on March 8th titled There and Back Again, Reproducibly.
Matt Graeber (@mattifestation) posted on Twitter that:
If you weren’t aware of the reason Portable Executable timestamps in Win 10 binaries were nonsensical, Raymond’s post explains the reason: to support reproducible builds.
… referencing an article by Raymond Chen from January 2018 which, amongst other things, mentions:
One of the changes to the Windows engineering system begun in Windows 10 is the move toward reproducible builds.
Jan Nieuwenhuizen announced the release of GNU Mes 0.22. Vagrant Cascadian subsequently uploaded this version to Debian which produced a bit-for-bit identical mescc-mes-static binary with the mes-rb5 package in GNU Guix.
Software development
diffoscope
diffoscope is our in-depth and content-aware diff-like utility that can locate and diagnose reproducibility issues. It is run countless times a day on our testing infrastructure and is essential for identifying fixes and causes of nondeterministic behaviour.
This month, diffoscope versions 135 and 136 were uploaded to Debian unstable by Chris Lamb. He also made the following changes to diffoscope itself, including:
- New features:
    - Support external difference tools such as Meld, etc. similar to git-difftool(1). (#87)
    - Extract resources.arsc files as well as classes.dex from Android .apk files to ensure that we show the differences there. (#27)
    - Fall back to the regular .zip container format for .apk files if apktool is not available. […][…][…][…]
    - Drop --max-report-size-child and --max-diff-block-lines-parent; scheduled for removal in January 2018. […]
    - Append a comment to a difference if we fall back to a less-informative container format but we are missing a tool. […][…]
- Bug fixes:
    - No longer raise a KeyError exception if we request an invalid member from a directory container. […]
- Documentation/workflow improvements:
- Logging improvements:
    - Log a debug-level message if we cannot open a file as a container due to a missing tool to assist in diagnosing issues. […]
    - Correct a debug message related to compare_meta calls to quote the arguments correctly. […]
    - Add the current PATH environment variable to the “Normalising locale...” debug-level message. […]
    - Print the “Starting diffoscope $VERSION” line as the first line of the log as we are, well, starting diffoscope. […]
    - If we don’t know the HTML output name, don’t emit an enigmatically truncated “HTML output for” debug message. […]
- Tests:
    - Don’t exhaustively output the entire HTML report when testing the regression for #875281; parsing the JSON and pruning the tree should be enough. (#84)
    - Refresh and update the fixtures for the .ico tests to match the latest version of ImageMagick in Debian unstable. […]
- Code improvements:
    - Add a .git-blame-ignore-revs file to improve the output of git-blame(1) by ignoring large changes when introducing the Black source code reformatter and update the CONTRIBUTING.md guide on how to optionally use it locally. […]
    - Add a noqa line to avoid a false-positive Flake8 “unused import” warning. […]
    - Move logo.svg to under the doc/ directory […] and make setup.py executable […].
    - Tidy diffoscope.main’s configure method. […][…][…][…]
    - Drop an assertion that is guaranteed by a parallel if conditional […] and an unused “Difference” import from the APK comparator. […]
    - Turn down the “volume” for a recommendation in a comment. […]
    - Rename the diffoscope.locale module to diffoscope.environ as we are modifying things beyond just the locale (eg. calling tzset, etc.) […]
    - Factor out the generation of “foo not available in path” comment messages into the exception that raises them […] and factor out running all of our many zipinfo calls into a new method […].
- trydiffoscope is the web-based version of diffoscope. This month, Chris Lamb fixed the PyPI.org release by adding the trydiffoscope script itself to the MANIFEST file and performing another release cycle. […]
In addition, Marc Herbert adjusted the cbfstool tests to search for expected keywords in the output, rather than specific output […], fixed a misplaced debugging line […] and added a “Testing” section to the CONTRIBUTING.rst […] file. Vagrant Cascadian updated to diffoscope 135 in GNU Guix.
reprotest
reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, versions 0.7.11 and 0.7.12 were uploaded to Debian unstable by Holger Levsen. In addition, Iñaki Malerba improved the version test to split on the + character […] and Ross Vandegrift updated the code to allow the user to override timeouts from the surrounding environment […].
Holger Levsen also made the following additional changes:
- Drop the short timeout and use the install timeout instead. (#897442)
- Use “real” reStructuredText comments instead of using the raw directive. […]
- Update the PyPI classifier to express we are using Python 3.7 now. […]
Other tools
- disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues. This month, Chris Lamb fixed an issue by ignoring the return values of fsyncdir to ensure (for example) dpkg(1) can “flush” /var/lib/dpkg correctly […] and merged a change from Helmut Grohne to use the build architecture’s version of pkg-config to permit cross-architecture builds […].
- strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, version 1.6.3-2 was uploaded to Debian unstable by Holger Levsen to bump the Standards-Version. […]
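The class of bug that disorderfs is designed to expose can be sketched in a few lines of Python. The example does not use disorderfs itself: it shuffles a directory listing to imitate a filesystem that returns entries in arbitrary order, and all helper names are hypothetical.

```python
import os
import random
import tempfile


def fragile_manifest(directory, listdir=os.listdir):
    """Concatenate file names in whatever order the filesystem returns them."""
    return "\n".join(listdir(directory))


def stable_manifest(directory, listdir=os.listdir):
    """Sort the entries first so the result no longer depends on readdir order."""
    return "\n".join(sorted(listdir(directory)))


def shuffled_listdir(directory):
    """Imitate a filesystem with arbitrary directory order (illustration only)."""
    entries = os.listdir(directory)
    random.shuffle(entries)
    return entries


with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.txt", "a.txt", "c.txt"):
        open(os.path.join(tmp, name), "w").close()

    # Build the manifest repeatedly under a randomised directory order.
    results = {stable_manifest(tmp, shuffled_listdir) for _ in range(20)}
    # The sorted variant yields one and the same manifest on every run.
    print(len(results))
```

Swapping in fragile_manifest will usually produce several distinct manifests across the 20 runs, which is the kind of difference a build under disorderfs would surface.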
Upstream development
The Reproducible Builds project detects, dissects and attempts to fix as many unreproducible packages as possible. Naturally, we endeavour to send all of our patches upstream. This month, we wrote another large number of such patches, including:
- Arnout Engelen (for the NixOS distribution):
    - bash (enable PGRP_PIPE regardless of build-time kernel version)
    - jitterentropy (remove timestamps from Gzip-compressed manpages, already filed upstream)
    - ms-sys (remove timestamps from .gz manpages, already upstream)
- Bernhard M. Wiedemann (for the openSUSE distribution):
    - ImageMagick (toolchain, .png date)
    - brickv (sort a Python glob/readdir(3))
    - cpython (.pyc reproducibility)
    - doxygen (merged a toolchain patch to prevent nondeterminism from ASLR)
    - fastjet-contrib (sort find/readdir)
    - openjfx (Java date)
    - ruby (reopen unsorted Ruby glob issue)
    - rubygem-sassc (sort a Ruby readdir(3))
- Chris Lamb:
    - #948279 filed against python-gmusicapi.
    - #948582 filed against bochs.
    - #948872 filed against pcbasic.
    - #949379 filed against vmatch.
    - #949580 filed against pkg-js-tools.
    - #949684 filed against mcomix.
    - #949817 filed against shotcut (forwarded upstream).
    - #950138 filed against pikepdf (forwarded upstream).
- Jelle van der Waa (Arch Linux):
- Martin Liška:
    - gcc (toolchain, fixing randomness in some .o files, with Alexander Monakov & Richard Biener)
- Vagrant Cascadian submitted a large number of patches via the Debian bug tracking system targeting the Civil Infrastructure Platform packages, as identified by the CIP package set, including:
    - #948757 & #948759 filed against apache2.
    - #948771 filed against guile-2.2.
    - #949114 & #949115 filed against alsa-tools.
    - #949270 & #949271 filed against libtool.
    - #949273 & #949275 filed against geoip.
    - #949324 filed against groff.
    - #949338 filed against gettext.
    - #949341 filed against sqlite3.
    - #949342 & #949343 filed against flex.
    - #949346 & #949348 filed against libnet.
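Several of the patches listed above remove timestamps from Gzip-compressed manpages. The following is a minimal sketch of the underlying idea using Python's gzip module rather than the actual patches: the Gzip header carries a modification time, so pinning it to a fixed value (similar in spirit to gzip -n) makes the compressed bytes independent of when the build ran. The helper name and sample manpage content are hypothetical.

```python
import gzip
import io


def gzip_bytes(data: bytes, mtime=None) -> bytes:
    """Compress data in memory; mtime=None lets gzip embed the current time."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb", filename="", mtime=mtime) as fh:
        fh.write(data)
    return buffer.getvalue()


page = b".TH EXAMPLE 1\n"

# With a pinned mtime, two separate compressions are byte-for-byte identical.
first = gzip_bytes(page, mtime=0)
second = gzip_bytes(page, mtime=0)

# Bytes 4..7 of the Gzip header hold the little-endian MTIME field.
header_mtime = first[4:8]
```

Leaving mtime unset would instead embed the wall-clock time of each run, so two otherwise identical builds made a second apart would differ.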
Distribution work
openSUSE
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update and submitted the following bugs and patches:
- doxygen (toolchain, ASLR; already merged upstream)
- frotz (version update & date)
- gcc9 (report unreproducible .o files, forwarded upstream)
- mingw* (report random filename in .a files)
- perl-TimeDate (fix a “year 2020” bug, forwarded upstream)
- python-sherpa (CPU-detection via --mtune=native)
- qpress (make PGO reproducible)
- rubygem-sassc (CPU & readdir, partially submitted upstream)
- stgit (recreate unreproducible .pyc files with fixed filesystem readdir(3) order)
- xmvn (report nondeterminism from filesystem order and randomness)
Many Python packages were updated to avoid writing .pyc files with an embedded random path, including jupyter-jupyter-wysiwyg, jupyter-jupyterlab-latex, python-PsyLab, python-hupper, python-ipyevents (don’t rewrite .zip file), python-ipyleaflet, python-jupyter-require, python-jupyter_kernel_test, python-nbdime (do not rewrite .zip, avoid time-based .pyc), python-nbinteract, python-plaster, python-pythreejs, python-sidecar & tensorflow (use pip install --no-compile).
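To illustrate why a .pyc file can capture a build path or timestamp at all, here is a hedged sketch using the standard py_compile module. It is not the approach taken in the openSUSE packages above (which, for example, skip compilation via pip install --no-compile); it only shows that the recorded file name and the invalidation mode are both controllable. The helper name, the temporary directory prefix and the logical name pkg/mod.py are hypothetical.

```python
import os
import py_compile
import tempfile


def compile_with_stable_path(source_path: str, logical_name: str) -> bytes:
    """Byte-compile source_path, recording logical_name instead of the real path."""
    cfile = source_path + "c"
    py_compile.compile(
        source_path,
        cfile=cfile,
        dfile=logical_name,  # name stored in the code object and in tracebacks
        doraise=True,
        # Validate against a hash of the source rather than its mtime.
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
    )
    with open(cfile, "rb") as fh:
        return fh.read()


with tempfile.TemporaryDirectory(prefix="random-build-") as tmp:
    src = os.path.join(tmp, "mod.py")
    with open(src, "w") as fh:
        fh.write("VALUE = 42\n")
    pyc = compile_with_stable_path(src, "pkg/mod.py")
    # Check whether the randomly named build directory leaked into the output.
    leaked = tmp.encode() in pyc
```

Without dfile, the code object would record the full path under the temporary directory, which changes on every build.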
Debian
There was yet more progress towards making the Debian Installer images reproducible. Following on from last month’s efforts, Chris Lamb requested a status update on the Debian bug in question.
Daniel Schepler posted to the debian-devel mailing list to ask whether “running dpkg-buildpackage manually from the command line” is supported, particularly with respect to cases where having extra packages installed while the package was built resulted in either a failed build or even broken packages (eg. #948522, #887902, etc.). Our .buildinfo files could be one solution to this as they record the environment at the time of the package build.
Holger disabled scheduling of packages from the “oldstable” stretch release on tests.reproducible-builds.org. This is the first time since stretch’s existence that we are no longer testing this release.
OpenJDK, a free and open-source implementation of the Java Platform, was updated in Debian to incorporate a number of patches from Emmanuel Bourg, including:
- Make the generated character data source files reproducible. (#933339)
- Make the generated module-info.java files reproducible. (#933342)
- Make the generated copyright headers reproducible. (#933349)
- Make the build user reproducible. (#933373)
83 reviews of Debian packages were added, 32 were updated and 96 were removed this month, adding to our knowledge about identified issues. Many issue types were updated by Chris Lamb, including timestamp_in_casacore_tables, random_identifiers_in_epub_files_generated_by_asciidoc, nondeterministic_ordering_in_casacore_tables, captures_build_path_in_golang_compiler, captures_build_path_via_haskell_adddependentfile & png_generated_by_plantuml_captures_kernel_version_and_builddate.
Lastly, Mattia Rizzolo altered the permissions and shared the notes.git repository which underpins the aforementioned package classifications with the entire “Debian” group on Salsa, therefore giving all DDs write access to it. This is an attempt to invite more direct contributions instead of merge requests.
Other distributions
The FreeBSD Project Tweeted that:
Reproducible builds are turned on by default for -RELEASE […]
… which targets the next released version of this distribution (view revision). Daniel Ebdrup followed-up to note that this option:
Used to be turned on in -CURRENT when it was being tested, but it has been turned off now that there’s another branch where it’s used, whereas -CURRENT has more need to have the revision printed in uname (which is one of the things that make a build unreproducible). […]
For Alpine Linux, Holger Levsen disabled the builders run by the Reproducible Builds project as our patch to the abuild utility (see December’s report) doesn’t apply anymore and thus all builds have become unreproducible again. Subsequent to this, a patch was merged upstream. […]
In GNU Guix, on January 14th, Konrad Hinsen posted a blog post entitled Reproducible computations with Guix which, amongst other things, remarks that:
The [guix time-machine command] actually downloads the specified version of Guix and passes it the rest of the command line. You are running the same code again. Even bugs in Guix will be reproduced faithfully!
The Yocto Project reported that they have reproducible cross-built binaries that are independent of both the underlying host distribution the build is run on and independent of the path used for the build. This is now being continually tested on the Yocto Project’s automated infrastructure to ensure this state is maintained in the future.
Project website & documentation
There was more work performed on our website this month, including:
- Chris Lamb:
    - Python SOURCE_DATE_EPOCH documentation, clarifying that the second example generates a Python str-type, not a datetime.datetime […]
    - Correct word omissions in the report template. […]
    - Link to our mailing list overview page (and not the archives). […]
    - Apply the Black source code reformatter to the draft generation script. […]
    - Move continuous tests heading level to <h1> (vs. <h2>) to match the other pages. […]
    - Calculate the report authors dynamically. […]
- Holger Levsen:
    - Add Alpine Linux to our projects and testing pages. […]
    - Add links to our list of projects being tested […] and mark Fedora as being disabled at this time […].
In addition, Arnout Engelen added a Scala programming language example for the SOURCE_DATE_EPOCH environment variable […], David del Amo updated the link to the Software Freedom Conservancy to remove some double parentheses […] and Peter Wu added a Debian example for the -ffile-prefix-map argument to support Clang version 10 […].
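For readers unfamiliar with the variable, the following is a minimal sketch of honouring SOURCE_DATE_EPOCH from Python. It is not copied from the website's documentation; the function names are hypothetical, and the two helpers exist only to make the str versus datetime.datetime distinction mentioned above concrete.

```python
import datetime
import os
import time


def build_datetime(environ=os.environ) -> datetime.datetime:
    """Return the build time as an aware datetime, honouring SOURCE_DATE_EPOCH."""
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.datetime.fromtimestamp(epoch, tz=datetime.timezone.utc)


def build_date_string(environ=os.environ) -> str:
    """Return the same instant formatted as a str (note: not a datetime)."""
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return time.strftime("%Y-%m-%d", time.gmtime(epoch))


# A fixed value, as a build system would export it: 2020-02-01 00:00:00 UTC.
fixed = {"SOURCE_DATE_EPOCH": "1580515200"}
stamp = build_datetime(fixed)
label = build_date_string(fixed)
```

Both helpers fall back to the current time only when the variable is unset, and both work in UTC so that the builder's local timezone does not leak into the output.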
Testing framework
We operate a fully-featured and comprehensive Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, the following changes were made:
- Adrian Bunk:
    - Use the et_EE locale/language instead of fr_CH. In Estonian, the z character is sorted between s and t which is contrary to common incorrect assumptions about the sorting order of ASCII characters. […]
    - Add ffile_prefix_map_passed_to_clang to the list of issues filtered as these build failures should be ignored. […]
    - Remove the ftbfs_build_depends_not_available_on_amd64 from the list of filtered issues as this specific problem no longer exists. […]
- Holger Levsen:
    - Debian:
        - Always configure apt to ignore expired release files on hosts running in the future. […]
        - Create an “oldsuites” page, showing suites we used to test in the past. […][…][…][…][…]
        - Schedule more old packages from the buster distribution. […]
        - Deal with shell escaping and other options. […][…][…]
        - Reverse the suite ordering on the packages page. […][…]
        - Show bullseye statistics on the dashboard page, moving away from buster […] and additionally omit stretch […].
    - F-Droid:
        - Document the increased diskspace requirements; we require over 700 GiB now. […]
    - Misc:
- Jelle van der Waa (Arch Linux):
- Mattia Rizzolo:
- Vagrant Cascadian special-cased u-boot on the armhf architecture: first, do not build the all architecture as the dependencies are not available on this architecture […] and also pass the --binary-arch argument to pbuilder […].
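The effect of the et_EE locale change in the list above can be imitated without installing any locale. The sketch below uses a deliberately simplified, hypothetical collation in which z sits between s and t, as described there; it is not the real et_EE locale definition and only handles lowercase ASCII letters.

```python
# Toy collation: a simplified alphabet in which "z" sits between "s" and "t".
# This is NOT the full et_EE locale definition, only an illustration.
ALPHABET = "abcdefghijklmnopqrsztuvwxy"
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def toy_estonian_key(word: str) -> list[int]:
    """Map each lowercase ASCII letter to its position in the toy alphabet."""
    return [RANK[letter] for letter in word]


words = ["tee", "zoo", "saun"]

# Plain sorted() compares code points, so "z" comes last.
ascii_order = sorted(words)
# Under the toy collation, "zoo" moves ahead of "tee".
toy_order = sorted(words, key=toy_estonian_key)
```

A build that sorts file names or symbols with a locale-aware comparison will therefore emit them in a different order under et_EE than under a locale that follows ASCII order, which is exactly the variation the test infrastructure wants to provoke.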
The usual node maintenance was performed by Mattia Rizzolo […][…], Vagrant Cascadian […][…][…][…] and Holger Levsen.
Misc news
On our mailing list this month:
- Chris Lamb responded in-depth to a thread on Reproducible system images that was started in December by Lars Wirzenius. This then led to a sub-thread regarding reproducible Docker images.
- Holger Levsen posted a brief request for help regarding the bot that lives on our #reproducible-builds IRC channel that interfaces with our Twitter handle.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can also get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter: @ReproBuilds
- Reddit: /r/ReproducibleBuilds
- Mailing list: rb-general@lists.reproducible-builds.org
This month’s report was written by Arnout Engelen, Bernhard M. Wiedemann, Chris Lamb, heinrich5991, Holger Levsen, Jelle van der Waa, Mattia Rizzolo and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.












