Welcome to the October 2021 report from the Reproducible Builds project!
This month, Samanta Navarro posted to the oss-security mailing list about a novel category of exploit in the .tar archive format, where a single .tar file contains different contents depending on the tar utility being used. Naturally, this has consequences for reproducible builds, as Samanta goes on to explain:
Arch Linux uses libarchive (bsdtar) in its build environment. The default tar program installed is GNU tar. It is possible to create a source distribution which leads to different files seen by the build environment than compared to a careful reviewer and other Linux distributions.
Samanta notes that addressing the tar utilities themselves will not be a sufficient fix:
I have submitted bug reports and patches to some projects but eventually I had to conclude that the problem itself cannot be fixed by these implementations alone. The best choice for these tools would be to only allow archives which are fully compatible to standards but this in turn would render a lot of archives broken.
Reproducible builds, with its twin ideas of reaching consensus on the build outputs as well as precisely recording and describing the build environment, would help address this problem at a higher level.
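One way to make this kind of disagreement visible is to record what a given parser actually sees in an archive. The following is a minimal Python sketch and is not taken from Samanta's post: it uses the standard-library tarfile module to hash every regular file in an archive and to flag member names that occur more than once, which is treated here (as an assumption for illustration) as a simple sign that an archive deserves a closer look. The function names `tar_manifest` and `duplicate_names` are hypothetical.

```python
import hashlib
import io
import tarfile
from collections import Counter


def tar_manifest(data: bytes) -> dict[str, str]:
    """Map each regular file name to the SHA-256 of its content, as seen by Python's tarfile."""
    manifest = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as archive:
        for member in archive:
            if member.isfile():
                content = archive.extractfile(member).read()
                # If a name repeats, the later entry replaces the earlier one in this dict.
                manifest[member.name] = hashlib.sha256(content).hexdigest()
    return manifest


def duplicate_names(data: bytes) -> list[str]:
    """Return the member names that appear more than once in the archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as archive:
        counts = Counter(member.name for member in archive)
    return sorted(name for name, count in counts.items() if count > 1)
```

This only captures Python's view of the archive. The actual cross-check would be to produce a comparable manifest with each tar implementation of interest and compare the results.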
Codethink announced that they had achieved ISO-26262 ASIL D Tool Certification, a way of determining specific safety standards for software. Codethink used open source tooling to achieve this, but they also leverage:
Reproducibility, repeatability and traceability of builds, drawing heavily on best-practices championed by the Reproducible Builds project.
Elsewhere on the internet, according to a comment on Hacker News, Microsoft are now comparing NPM JavaScript packages with their original source repositories:
I got a PR in my repository a few days ago leading back to a team trying to make it easier for packages to be reproducible from source.
Lastly, Martin Monperrus started an interesting thread on our mailing list about GitHub, specifically that their “autogenerated release tarballs are not deterministic”. The thread generated a significant number of replies that are worth reading.
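For context, archive output typically varies because of metadata such as timestamps, ownership and entry order, plus the header written by the compressor. The Python sketch below shows one common way to pin these down when creating a .tar.gz. It is an illustration only and does not describe how GitHub generates its tarballs. The function name `deterministic_targz` and the fixed timestamp of 0 are assumptions.

```python
import gzip
import io
import tarfile


def deterministic_targz(files: dict[str, bytes], mtime: int = 0) -> bytes:
    """Build a .tar.gz whose bytes depend only on names and contents
    (for a given Python and zlib version)."""
    tar_buffer = io.BytesIO()
    # Pin the tar format so the result does not depend on the library default.
    with tarfile.open(fileobj=tar_buffer, mode="w", format=tarfile.GNU_FORMAT) as archive:
        for name in sorted(files):  # stable entry order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime  # fixed timestamp
            info.mode = 0o644
            info.uid = info.gid = 0  # no builder-specific ownership
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(files[name]))

    out = io.BytesIO()
    # The gzip header stores a timestamp and optionally a file name; pin both.
    with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=mtime) as gz:
        gz.write(tar_buffer.getvalue())
    return out.getvalue()
```

Calling the function twice with the same mapping, in any insertion order, should return identical bytes, which is the property a release tarball needs if its checksum is to be pinned.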
Events and presentations
- PackagingCon is a conference for developers of package management software, their communities and other stakeholders. This virtual event, whose “mission is to bring different ecosystems together”, will take place on the 9th and 10th November 2021. The schedule for the event is now available to view online.
- The Linux Foundation’s OpenSSF group announced that the next OpenSSF quarterly town hall will take place on 15th November 2021. Registration is now open.
- Last month, Wolfgang Mauerer gave a presentation at the MiniDebConf 2021 Regensburg about the Civil Infrastructure Platform that covered many subjects including Reproducible Builds. PDF slides of the talk are available, as is a video recording.
- In addition, Trevor Rosen from SolarWinds presented at the Linux Foundation’s Supply Chain Security Con last month on incorporating in-toto into their build system. in-toto is a framework to secure the integrity of software supply chains. Trevor also discussed building everything twice to validate the first build à la reproducible builds. (PDF slides)
- Lastly, Mattia Rizzolo posted an update on the next Reproducible Builds in-person event to our mailing list: “currently we are thinking ahead to 2022”.
Community news
On our mailing list this month:
- Jeremiah announced the release of version 1.4 of stage0-posix, part of a broader effort to provide an ultra-minimal “bootstrap seed” to increase trust in our software stack.
- Chris Lamb mentioned that Azure are offering free compute power for open source projects which “might be useful for one of the many rebuilder projects”.
- kpcyrd announced the release of rebuilderd v0.15.0, and also linked to a Twitter thread that contains an introduction to how rebuilderd works and a walk-through on how to write custom integrations.
- Fredrik Strömberg offered an update on the Sigsum project and some specific milestones within transparency logging efforts: “after a year of design iterations we have not only designed a transparency log but also decided to turn it into a project of its own”.
There were quite a few changes to the Reproducible Builds website and documentation this month as well, including Feng Chai updating some links on our ‘publications’ page […] and marco updating our project metadata around the Bitcoin Core building guide […].
Lastly, we ran another productive meeting on IRC during October. A full set of notes from the meeting is available to view.
Distribution work
Qubes was heavily featured in the latest edition of Linux Weekly News, and a significant section was dedicated to discussing reproducibility. For example, it was mentioned that the “Qubes project has been working on incorporating reproducible builds into its continuous integration (CI) infrastructure”. But the LWN article goes on to describe that:
The current goal is to be able to build the Qubes OS Debian templates solely from packages that can be built reproducibly. Templates in Qubes OS are VM images that can be used to start an application qube quickly based on the template. The qube will have read-only access to the root filesystem of the template, so that the same root filesystem can be shared with multiple application qubes. There are official templates for several variants of both Fedora and Debian, as well as community maintained templates for several other distributions.
You can view the whole article on LWN, and Frédéric also published a lengthy summary of their work on reproducible builds in Qubes for those wishing to learn more.
In Debian this month, 133 reviews of Debian packages were added, 81 were updated and 24 were removed, adding to Debian’s ever-growing knowledge about identified issues. A number of issues were categorised and added by Chris Lamb and Vagrant Cascadian too […][…][…]. In addition, Frédéric Pierret and Holger Levsen made progress on an alternative snapshot service this month, including moving from the existing host (snapshot.notset.fr) to snapshot.reproducible-builds.org (more info). Thanks to OSUOSL for the machine and hosting, and to Debian for the disks.
Finally, Bernhard M. Wiedemann posted his monthly reproducible builds status report.
diffoscope
diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes, including preparing and uploading versions 186, 187, 188 and 189 to Debian:
- New features:
- Bug fixes:
  - Fix Python decompilation tests under Python 3.10+ […] and for Python 3.7 […].
  - Don’t raise a traceback if we cannot unmarshal Python bytecode. This is in order to support Python 3.7 failing to load .pyc files generated with newer versions of Python. […]
  - Skip Python bytecode testing where we do not have an expected diff. […]
- Codebase improvements:
In addition, Jelle van der Waa added external tool references for Arch Linux for ocamlobjinfo, openssl and ffmpeg […][…][…] and added Arch Linux as a Continuous Integration (CI) test target […]. Vagrant Cascadian updated the testsuite to skip Python bytecode comparisons when file(1) is older than 5.39 […] and added external tool references for the Guix distribution for dumppdf and ppudump […][…]. Vagrant Cascadian also updated the diffoscope package in GNU Guix […][…].
Lastly, Guangyuan Yang updated the FreeBSD package name on the website […], Mattia Rizzolo made a change to override a new Lintian warning due to the new test files […], Roland Clobus added support to detect and log if the GNU_BUILD_ID field in an ELF binary has been modified […], Sandro Jäckel updated a number of helpful links on the website […] and Sergei Trofimovich made the uImage test output support file(1) version 5.41 […].
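The bytecode bug fix listed above amounts to treating an unreadable .pyc file as a normal condition rather than a fatal error. A minimal Python sketch of that idea follows. It is not diffoscope's code: the helper name `load_code_object` is hypothetical, and the 16-byte header size is an assumption that matches the .pyc layout used since Python 3.7.

```python
import marshal
from types import CodeType
from typing import Optional

# Assumption: 4-byte magic number, 4-byte flags and two further 4-byte fields.
PYC_HEADER_SIZE = 16


def load_code_object(pyc_bytes: bytes) -> Optional[CodeType]:
    """Return the code object stored in a .pyc file, or None if it cannot be read."""
    try:
        obj = marshal.loads(pyc_bytes[PYC_HEADER_SIZE:])
    except (EOFError, ValueError, TypeError):
        # Bytecode from a newer interpreter, or truncated data: report "unknown"
        # to the caller instead of letting the exception propagate.
        return None
    return obj if isinstance(obj, CodeType) else None
```

A caller can then fall back to a plain binary comparison whenever the function returns None.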
reprotest
reprotest is the Reproducible Builds project’s end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by the builds for any differences.
This month, reprotest version 0.7.18 was uploaded to Debian unstable by Holger Levsen. It included a change by Holger to clarify that Python 3.9 is used nowadays […], as well as two changes by Vasyl Gello to implement “realistic” CPU architecture shuffling […] and to log the selected variations when the verbosity is configured at a sufficiently high level […]. Finally, Vagrant Cascadian updated reprotest to version 0.7.18 in GNU Guix.
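The core idea, building twice under different conditions and comparing the results, can be sketched in a few lines of Python. This is an illustration only and not how reprotest is implemented. The functions `build_twice` and `is_reproducible`, the decision to vary only the TZ and LANG environment variables, and the convention that the build command writes its artifact to a path passed as its last argument are all assumptions.

```python
import hashlib
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Assumption: two hand-picked environments; real tools vary far more than this.
VARIATIONS = [
    {"TZ": "UTC", "LANG": "C"},
    {"TZ": "Pacific/Kiritimati", "LANG": "fr_FR.UTF-8"},
]


def build_twice(command: list[str]) -> list[str]:
    """Run `command <output-path>` once per variation and return each output's SHA-256."""
    digests = []
    for variation in VARIATIONS:
        with tempfile.TemporaryDirectory() as workdir:
            output = Path(workdir) / "artifact"
            env = {**os.environ, **variation}
            subprocess.run([*command, str(output)], env=env, check=True)
            digests.append(hashlib.sha256(output.read_bytes()).hexdigest())
    return digests


def is_reproducible(command: list[str]) -> bool:
    """True if every variation produced a bit-for-bit identical artifact."""
    return len(set(build_twice(command))) == 1


if __name__ == "__main__":
    # Hypothetical "build": write a constant string to the requested output path.
    demo = [sys.executable, "-c", "import sys; open(sys.argv[1], 'w').write('hello')"]
    print("reproducible" if is_reproducible(demo) else "unreproducible")
```

A build step that embeds the current timezone or locale in its output would produce two different digests here. Once a mismatch has been found, a tool such as diffoscope can be used to diagnose it.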
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix unreproducible packages. We try to send all of our patches upstream where appropriate. We authored a large number of such patches this month, including:
- Bernhard M. Wiedemann:
- Chris Lamb:
  - #901307 filed against sphinx-gallery (re-opened with extensive updates).
  - #995809 filed against libinput.
  - #995865 filed against python-pipx.
  - #996200 filed against node-inquirer.
  - #996674 filed against libminidns-java.
  - #996834 filed against pytools.
  - #996881 filed against pikepdf.
  - #996948 filed against sphinx (forwarded upstream).
  - #996999 filed against fenics-basix.
  - #997000 filed against snakemake.
  - #997689 filed against smplayer.
  - #997949 filed against python-duniterpy.
  - #998104 filed against afnix.
  - #998059 filed against sphinx (forwarded upstream).
- Vagrant Cascadian:
  - #977684 filed against mahimahi (filed upstream).
  - #985187 filed against ffmpeg (forwarded upstream).
  - #995646 filed against abntex.
  - #995647 filed against cfi.
  - #995648 filed against cffi.
  - #995650 filed against chktex.
  - #995651 filed against fdutils.
  - #995652 filed against gnu-standards.
  - #995654 filed against malaga.
  - #995741 filed against latex-mk.
  - #995745 filed against kannel.
  - #995747 filed against xnee.
  - #995886, #995896, #995953 & #995954 filed against cxref.
  - #995960 filed against xnee.
  - #996184 filed against binutils-or1k-elf.
  - #996194 & #996572 filed against gcc-arm-none-eabi.
  - #996599 filed against xdmf.
  - #996679 filed against flightgear.
  - #997036 & #997037 filed against kvirc.
Testing framework
The Reproducible Builds project runs a testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
- Holger Levsen:
- Mattia Rizzolo:
- Debian-related changes:
  - Handle schroot errors when invoking diffoscope instead of masking them. […][…]
  - Declare and define some variables separately to avoid masking the subshell return code. […]
  - Fix variable name. […]
  - Improve log reporting. […]
  - Execute apt-get update with the -q argument to get more decent logs. […]
  - Set the Debian HTTP mirror and proxy for snapshot.reproducible-builds.org. […]
  - Install the libarchive-tools package (instead of bsdtar) when updating Jenkins nodes. […]
  - Be stricter about errors when starting the node agent […] and don’t overwrite NODE_NAME so that we can expect Jenkins to set it properly for us […].
  - Explicitly warn if the NODE_NAME is not a fully-qualified domain name (FQDN). […]
  - Document whether a node runs in the future. […]
  - Disable postgresql_autodoc as it is not available in bullseye. […]
  - Don’t be so eager when deleting schroot internals; call schroot -e to terminate the schroots instead. […]
  - Only consider schroot underlays for deletion that are over a month old. […][…]
  - Only try to unmount /proc if it’s actually mounted. […]
  - Move the db_backup task to its own Jenkins job. […]
Lastly, Vasyl Gello added usage information to the reproducible_build.sh script […].
Contributing
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter: @ReproBuilds
- Mailing list: rb-general@lists.reproducible-builds.org