Reproducible Builds,
a very brief summary of the last 12 years and a glimpse into the future

Holger Levsen

Reproducible Builds
lack transparency logs,
can you help?!?

Holger Levsen

Who am I

  1. Holger Levsen / holger@debian.org, located in Hamburg, Germany. Born at 329 ppm. He/him. 🏳️‍🌈🏳️‍⚧️🖤😷
  2. Debian user since 1995, contributing since 2001, Debian member since 2007. I ❤️ Debian.
  3. Working on Reproducible Builds since 2014. Aiming to make all ❤️ Free Software reproducible.
  4. Since 2015 I've been convinced that transparency logs are desirable for any distributed software, alas...
  5. I'm here to present the work of many people:

according to https://reproducible-builds.org/who/people/

Akihiro Suda • akira • Alba Herrerias • Alex Feyerke • Alex Wilson • Alexander Bedrossian • Alexander Borkowski • Alexander Couzens (lynxis) • Alexis Bienvenüe • Allan Gunn (gunner) • Aman Sharma • Amit Biswas • Anders Kaseorg • Andrew Ayer • anonmos1 • Anoop Nadig • Arnaud Brousseau • Arnout Engelen • Aron Xu • Asheesh Laroia • Atharva Lele • Ben Hutchings • Benedikt Ritter • Benjamin Hof • Bernhard M. Wiedemann • billchenchina • boyska • Boyuan Yang • Brett Smith • Calum McConnell • Carl Dong • Ceridwen • Chengyu HAN • Chris Hofstaedtler • Chris Lamb • Chris Smith • Chris West • Christoph Berg • Christopher Baines • Cindy Kim • Clemens Lang • Clint Adams • Daan De Meyer • Dafydd Harries • Daniel Edgecumbe • Daniel Kahn Gillmor • Daniel Shahaf • Daniel Stender • David A. Wheeler • David Bremner • David del Amo • David Prévot • David Suarez • Davide Cavalca • Denis ‘GNUtoo’ Carikli • Dhiru Kholia • Dhole • Drakonis • Drew Fisher • Ed Maste • Edward Betts • Eitan Adler • Eli Schwartz • Elio Qoshi • Emanuel Bronshtein • Emmanuel Bourg • Esa Peuha • Evangelos Ribeiro Tzaras • Fabian Grünbichler • Fabian Keil • Fabian Wolff • FC (Fay) Stegerman • Feng Chai • Frédéric Pierret (fepitre) • Georg Faerber • Georg Koppen • Gonzalo Bulnes Guilpain • Graham Christensen • Greg Chabala • Guillem Jover • Hannes Mehnert • Hans-Christoph Steiner • Harlan Lieberman-Berg • heinrich5991 • Helmut Grohne • Hervé Boutemy • Holger Levsen (h01ger) • Hongxu Jia • hulkoba • HW42 • Ian Muchina • intrigeri • IOhannes m zmölnig • jajajasalu2 • Jakub Wilk • James Addison • James Fenn • Jan Nieuwenhuizen • Jan Zerebecki • Jan-Benedict Glaw • Jarl Gullberg • Javier Jardón • Jelle van der Waa • Jelmer Vernooij • Jochen Sprickerhof • Johannes Schauer Marin Rodrigues • John Neffenger • John Scott • Joshua Lock • Joshua Watt • Juan Picca • Julia Krüger • Julien Cristau • Julien Malka • Juri Dispan • Justin Cappos • Jérémy Bobbio (lunar) • kpcyrd • Kushal Das • Levente Polyak • Linus Nordberg • Liyun 
Li • Ludovic Courtès • Lukas Puehringer • Maliat Manzur • marco • Marco Villegas • MarcoFalke • Marcus Hoffmann (bubu) • Marek Marczykowski-Górecki • Maria Glukhova • Mariana Moreira • Mariano Giménez • marinamoore • Martin Suszczynski • Mathieu Bridon • Mathieu Parent • Matthew Suozzo • Mattia Rizzolo • Michael Pöhn • Michael R. Crusoe • Mike Perry • Morten Linderud • Muz • Mykola Nikishov • Nichita Morcotilo • Nick Gregory • Nicolas Boulenguez • Nicolas Vigier • Niels Thykier • Niko Tyni • Ninette Adhikari • Oejet • Omar Navarro Leija • opi • Orhun Parmaksız • Oskar Wirga • Paul Gevers • Paul Spooren • Paul Wise • Peter Conrad • Peter De Wachter • Peter Wu • Philip Rinn • Pol Dellaiera • Profpatsch • Rahul Bajaj • Reiner Herrmann • Richard Purdie • Robbie Harwood • Robin Candau • Roland Clobus • Russ Cox • RyanSquared • Santiago Torres • Santiago Vila • Sascha Steinbiss • Satyam Zode • Scarlett Clark • Seb35 • Sebastian Crane • Sebastian Davids • Sertonix • Seth Schoen • Simon Butler • Simon Josefsson • Simon Schricker • Snahil Singh • Stefano Rivera • Stefano Zacchiroli • Steven Adger • Steven Chamberlain • Stéphane Glondu • Sune Vuorela • Sylvain Beucler • Thomas Vincent • Tianon Gravi • Tim Jones • Tobias Stoeckmann • Tom Fitzhenry • Ulrike Uhlig • Vagrant Cascadian • Valentin Lorentz • Valerie R Young • Vipul • Wookey • Ximin Luo • Zbigniew Jędrzejewski-Szmek

About you

  • Who knows about Reproducible Builds, why and how?
  • Who contribute(s|d) to Reproducible Builds somewhere?
  • Who knows that Reproducible Builds have been known for more than 10 years? >30 years?
  • Who knows about SBOM? (Software Bill of Materials) ~= our .buildinfo files designed in 2014!

Introduction

The problem

  • Source code of free software is available
  • …most people install pre-compiled binaries
  • No one really knows how they really correspond (even those building those binaries).
  • As a result there are various classes of supply chain attacks.

https://reproducible-builds.org/docs/definition/

  • When is a build reproducible?
  • A build is reproducible if given the same source code, build environment and build instructions, any party can recreate bit-by-bit identical copies of all specified artifacts.
  • The relevant attributes of the build environment, the build instructions and the source code as well as the expected reproducible artifacts are defined by the authors or distributors. The artifacts of a build are the parts of the build results that are the desired primary output.
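
As a minimal sketch of what "bit-by-bit identical" means in practice, two independently built artifacts can be compared by their cryptographic hashes. The function names below are illustrative assumptions, not part of any Reproducible Builds tool:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True if two independently built artifacts are bit-for-bit identical."""
    return sha256_of(first) == sha256_of(second)
```

If the digests differ, the build is not reproducible under the definition above, and the next step is to find out why.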

Our mission

  • Enable anyone to independently verify that a given source produces bit by bit identical results.
  • Reproducible Builds are an important building block in making supply chains more secure. Nothing more, nothing less.
  • (Un)secure software built reproducibly still remains (un)secure software. However, with reproducible builds you can be sure that you are running the software you want to be running, built from the sources you want to be using.

Our mission

  • Enable anyone to independently verify that a given source produces bit by bit identical results.
  • Most people will probably say: what does that even mean?



Our new slogan in the making...

  • Enabling supply chain security.

By 2025 Reproducible Builds has been widely understood:


  • https://reproducible-builds.org/resources/ (incl. these slides)
    https://reproducible-builds.org/docs/
    https://reproducible-builds.org/docs/publications/
  • https://www.whitehouse.gov/briefing-room/statements-releases/2021/06/08/...
    • requires "Software Bill of Materials" (SBOMs) for governmental software
    • so far only recommends reproducible builds / verified SBOMs

Common reasons for unreproducibilities:

  • timestamps, timestamps, timestamps
  • timestamps, timestamps, timestamps
  • build paths, build paths
  • all the rest
  • (and somewhere in there there might be backdoors...)

SOURCE_DATE_EPOCH

    • Who knows about SOURCE_DATE_EPOCH?
    • Build timestamps are largely meaningless. SOURCE_DATE_EPOCH describes the time of the last modification of the source (in seconds since the Unix epoch).
    • The specification is from 2015 and was updated in 2017.
    • https://reproducible-builds.org/docs/source-date-epoch/
    • Supported by a lot of software today.
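
A minimal sketch of how a build tool might honour SOURCE_DATE_EPOCH: use the variable when it is set and fall back to the current time otherwise. The function name and the `environ` parameter are assumptions made for illustration; only the variable name and its meaning come from the specification.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> datetime:
    """Return the timestamp a build should embed.

    Uses SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set,
    and falls back to the current time otherwise.
    """
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    # Always format in UTC so the local timezone cannot leak into the output.
    return datetime.fromtimestamp(epoch, tz=timezone.utc)
```

With the variable exported, two builds run at different times embed the same date; without it, the embedded timestamp changes on every build.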

    diffoscope

    • Who knows about, or uses or has used diffoscope?
    • diffoscope tries to get to the bottom of what makes files or directories different. It will recursively unpack archives of many kinds and transform various binary formats into more human-readable form to compare them.
    • Text, HTML and/or JSON output
    • https://try.diffoscope.org
    • https://diffoscope.org

    diffoscope

  • File formats supported include: Android APK files, Android boot images, Android package resource table (ARSC), Apple Xcode mobile provisioning files, ar(1) archives, ASM Function, Berkeley DB database files, bzip2 archives, character/block devices, ColorSync colour profiles (.icc), Coreboot CBFS filesystem images, cpio archives, Dalvik .dex files, Debian .buildinfo files, Debian .changes files, Debian source packages (.dsc), Device Tree Compiler blob files, directories, ELF binaries, ext2/ext3/ext4/btrfs/fat filesystems, Flattened Image Tree blob files, FreeDesktop Fontconfig cache files, FreePascal files (.ppu), Gettext message catalogues, GHC Haskell .hi files, GIF image files, Git repositories, GNU R database files (.rdb), GNU R Rscript files (.rds), Gnumeric spreadsheets, GPG keybox databases, Gzipped files, Hierarchical Data Format database, HTML files (.html), ISO 9660 CD images, Java class files, Java .jmod modules, JavaScript files,
  • JPEG images, JSON files, Linux kernel images, LLVM IR bitcode files, local (UNIX domain) sockets and named pipes (FIFOs), LZ4 compressed files, lzip compressed files, macOS binaries, Microsoft Windows icon files, Microsoft Word .docx files, Mono ‘Portable Executable’ files, Mozilla-optimized .ZIP archives, Multimedia metadata, OCaml interface files, Ogg Vorbis audio files, OpenOffice .odt files, OpenSSH public keys, OpenWRT package archives (.ipk), PDF documents, PE32 files, PGP signatures, PGP signed/encrypted messages, PNG images, PostScript documents, Public Key Cryptography Standards (PKCS) files (version #7), Python pyc files, RPM archives, Rust object files (.deflate), Sphinx inventory files, SQLite databases, SquashFS filesystems, symlinks, tape archives (.tar), tcpdump capture files (.pcap), text files, TrueType font files, U-Boot legacy image files, WebAssembly binary module, XML binary schemas (.xsb), XML files, XMLB files, XZ compressed files, ZIP archives and Zstandard compressed files.
  • Fallback on hexdump comparison, fuzzy-matching to handle renamings, and much more!
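
The idea of recursively unpacking archives to locate a difference can be illustrated with a toy example. This is not how diffoscope is implemented and it handles only tar archives; the function name and the output format are assumptions made for this sketch.

```python
import io
import tarfile


def diff_tar(a: bytes, b: bytes, prefix: str = "") -> list[str]:
    """List human-readable differences between two tar archives.

    Members whose name ends in .tar are unpacked and compared recursively.
    """
    differences = []
    with tarfile.open(fileobj=io.BytesIO(a)) as tar_a, \
            tarfile.open(fileobj=io.BytesIO(b)) as tar_b:
        members_a = {m.name: m for m in tar_a.getmembers()}
        members_b = {m.name: m for m in tar_b.getmembers()}
        for name in sorted(members_a.keys() | members_b.keys()):
            path = prefix + name
            if name not in members_b:
                differences.append(f"{path}: only in first")
                continue
            if name not in members_a:
                differences.append(f"{path}: only in second")
                continue
            m_a, m_b = members_a[name], members_b[name]
            # Timestamps are the classic source of unreproducibility.
            if m_a.mtime != m_b.mtime:
                differences.append(f"{path}: mtime {m_a.mtime} vs {m_b.mtime}")
            if m_a.isfile() and m_b.isfile():
                data_a = tar_a.extractfile(m_a).read()
                data_b = tar_b.extractfile(m_b).read()
                if data_a == data_b:
                    continue
                if name.endswith(".tar"):
                    # Descend into the nested archive instead of stopping here.
                    differences.extend(diff_tar(data_a, data_b, path + "/"))
                else:
                    differences.append(f"{path}: content differs")
    return differences
```

Instead of reporting only that two outer archives differ, the result names the innermost member that changed, which is the property that makes this kind of tool useful for debugging.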

Reproducible Builds Summits

    • 2015 Athens
    • 2016/2017 Berlin
    • 2018 Paris
    • 2019 Marrakech
    • 2022 Venice
    • 2023/2024 Hamburg
    • 2025 Vienna

    Projects at Reproducible Builds Summits

    Alpine Linux, Apache Maven, Apache Security, Arch Linux, baserock, Bazel, bootstrappable.org, Buildroot, CHAINS (KTH Royal Institute of Technology), coreboot, CoyIM, Debian, Eclipse Adoptium, EdgeBSD, ElectroBSD, F-Droid, Fedora, FreeBSD, GitHub, GNU Guix, GNU Mes, Google, Guardian Project, Homebrew, Huawei, Indiana University (IU), in-toto, IPFS, JustBuild, LEAP, LEDE, LibreOffice, Linux, MacPorts, Max Planck Institute for Security and Privacy (MPI-SP), Microsoft, MirageOS, Mobian, NetBSD, New York University (NYU), NixOS, Octez / Tezos, openSUSE, OpenWrt, pantsbuild.org, phosh, pkgsrc, privoxy, Project, Pure OS, Qubes OS, Quinel Ltd, rebuilderd, Red Hat, repeatr.io, riot-os.org, Rust, Software Freedom Conservancy, spytrap-adb, subuser.org, systemd, Tails, Tor Project, Ubuntu, University of Pennsylvania (UPenn) and Warpforge.

    (There were more but we were asked to only mention these.)

    Short summary of Reproducible Debian

    Reproducible Builds for some parts of Debian are a reality today:

    • individual packages, useful for both developers and some users. >95% of 37000 source packages build reproducibly by now
    • mmdebstrap --variant=apt trixie
    • reproducible docker/podman images: docker.debian.net
    • reproducible live images: cdimage.debian.org

    CI results for Debian unstable, 20250712

    4347 reproducibility related bugs fixed (mostly upstreamed), 262 patches pending...

    47362 bugs in 12 years ~= 11 per day

    we rebuild constantly and find lots of FTBFS bugs

    CI builders from 2015 until today and beyond

    CI builders are great, but we also need rebuilders. And we want to have both.

    https://reproduce.debian.net

    • Attempts to bit-for-bit identically rebuild each Debian binary package found in the distribution archive, using the .buildinfo file produced when the buildd originally built the package.
    • For each distributed package, rebuilderd calls debrebuild that calls debootsnap, mmdebstrap and finally sbuild to build that package within a user namespace.
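
The final comparison step of such a rebuild can be sketched as follows. This does not show how rebuilderd or debrebuild work internally; it only assumes that the .buildinfo file carries a Checksums-Sha256 field whose continuation lines have the form "digest size filename". Both function names are hypothetical.

```python
import hashlib


def expected_sha256(buildinfo_text: str) -> dict[str, str]:
    """Map artifact file names to the SHA-256 digests listed in a .buildinfo."""
    digests = {}
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
        elif in_field and line.startswith(" "):
            # Continuation line: "<digest> <size> <filename>"
            digest, _size, filename = line.split()
            digests[filename] = digest
        else:
            in_field = False
    return digests


def verify_rebuild(buildinfo_text: str, rebuilt: dict[str, bytes]) -> dict[str, bool]:
    """Report, per artifact named in the .buildinfo, whether the rebuilt bytes match."""
    return {
        name: name in rebuilt
        and hashlib.sha256(rebuilt[name]).hexdigest() == digest
        for name, digest in expected_sha256(buildinfo_text).items()
    }
```

A package counts as reproduced only if every artifact listed in the .buildinfo maps to True.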

    https://reproduce.debian.net

    • a rebuilderd instance, running since Q3 2024
    • rebuilding and comparing against what Debian distributes on ftp.debian.org.
    • rebuilderd is older but snapshot.debian.org was broken from 2019 until 2024...

    reproduce.debian.net vs tests.r-b.o/debian

    • The goal of reproduce.debian.net is to replicate the same build process that is used by Debian during package publication -- not to seek out additional sources of variance.
    • Variance testing, used to find factors that can prevent packages from rebuilding reproducibly, will continue at https://tests.reproducible-builds.org/debian/reproducible.html.

    about rebuilderd

    • support for rebuilding Arch, Fedora, Debian and Tails
    • rebuilderd, rebuilderd-worker, rebuilderctl
    • written in Rust by kpcyrd, development started in 2019 during Marrakech summit
    • available at https://github.com/kpcyrd/rebuilderd - installation with apt, pacman -S, apk add, sudo make install, soon with dnf too
    • several instances exist for Arch (about 5), one for Fedora and, AFAIK, now also five for Debian

    https://reproduce.debian.net

    • trixie, forky, unstable, experimental
    • trixie-security, trixie-proposed-updates, trixie-updates, trixie-backports
    • arch:all, amd64, arm64, armhf, i386, ppc64el, riscv64
    • coming soon: s390x
    • This will be used for testing migration soon: eventually unreproducible packages will not be allowed to enter testing anymore.

    Arch Linux 2015-2025

    • 2015 - pacman records BUILDINFO
    • 2017 - pacman S_D_E support & archlinux-repro
    • 2019 - started archiving packages required for rebuilds
    • 2020 - rebuilderd instance, [core] 86%
    • 2024 - reproducible minimal container userland
    • 2025 - 12% left to make reproducible (4 for minimal bootable install)

    https://gitlab.archlinux.org/archlinux/rebuilderd-website

    https://dashboards.archlinux.org/d/PKkRg-FGz/rebuilderd

    NixOS

    • https://luj.fr/blog/is-nixos-truly-reproducible.html - blog post by Julien Malka, summarizing his research article https://hal.science/hal-04913007.
    • The article explores the proportion of bitwise reproducible packages in the Nix package repository and its evolution between 2017 and 2023.
    • "Our most important finding is that the reproducibility rate in nixpkgs has increased steadily from 69% in 2017 to about 91% in April 2023."

    NixOS

    • Talk yesterday in the Nix and NixOS track:
      https://fosdem.org/2025/schedule/event/fosdem-2025-4430-how-reproducible-is-nixos-/

    FreeBSD and NetBSD

    • Talk at FOSDEM 2016 by Baptiste Daroussin: Reproducible builds in FreeBSD packages
    • FreeBSD base system continuously tested on tests.reproducible-builds.org since 2015. Just as NetBSD is :)
    • In 2016 there was work in progress on reproducing FreeBSD ports, which reached 80%. Then this effort stalled...
    • until now: https://freebsdfoundation.org/blog/zero-trust-builds-for-freebsd/
    • NetBSD: for most archs the base system can be rebuilt bit-for-bit identically on NetBSD and Linux...!

    FreeBSD

    • The Zero-Trust Build project is scheduled from Jan-Aug 2025 and centers on the FreeBSD build process, and in particular, release building. The primary goal of this work is to enable the entire release process to run without requiring root access, and that build artifacts build reproducibly – that is, that a third party can build bit-for-bit identical artifacts.
    • [This] is one of five initiatives that together are aimed at advancing zero trust builds, Software Bill of Materials (SBOM), CI/CD automation, security controls in ports and packages, and technical debt reduction.

    How to reach 100% in practice

    • 100% reproducible is a political decision and nothing technical.
    • We need to change debian-policy!
    • We can work around 'must-have-offenders' using allowlists in the beginning.
    • The goal is still 100%, allowlists are just a way to achieve that goal eventually.
    • Penalizing testing migration is a means to enforce debian-policy though it can be done before it's policy.

    Debian policy

    • 2017: packages should build reproducibly.
    • 2025? reproducible packages must not regress.
    • 2025? NEW packages must build reproducibly.
    • 2027? packages must build reproducibly.
    • In practice the release team will probably enforce this before it becomes policy. ☺️
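
The combination of allowlists, "must not regress" and "NEW packages must build reproducibly" can be sketched as a single migration check. This is a hypothetical illustration of the proposed rules, not the release team's actual tooling; the function and its parameters are invented for this example.

```python
def may_migrate(package: str, reproducible: bool, was_reproducible: bool,
                allowlist: frozenset[str]) -> bool:
    """Decide whether a package may enter testing under the sketched policy."""
    if reproducible:
        return True
    if was_reproducible:
        # A regression: previously reproducible, now not. Always blocked.
        return False
    # Never-reproducible and NEW packages pass only while allowlisted.
    return package in allowlist
```

Shrinking the allowlist over time moves the archive towards 100% without blocking every known offender on day one.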

    The path to 100% (using old CI numbers...)

    suite      reproducible     unreproducible
    stretch    23040 (93.2%)    1514
    buster     26653 (93.9%)    1405
    bullseye   29698 (96.2%)    761
    bookworm   33240 (96.9%)    670
    trixie     35000            256
    forky      40000            128 (but no regressions or new pkgs)
    forky+1    45000            42 policy violations left
    forky+2    50000            0 (?!?!!! that's probably 2031)

    and there is more...

    but time is short. Still I want to mention:

    • Tor Browser, which started it all - together with the Bitcoin client, to be fair
    • Tails
    • F-Droid
    • Maven Central
    • ...
    • https://bootstrappable.org and https://whatsrc.org

    Reproducible Builds and transparency logs

    • Transparency logs have been mostly out of scope for me personally, though I don't know of any other public efforts for distros.
    • Neither binary transparency nor signature transparency!
    • I'm here to collaborate and learn. Thanks for having me here. I'm also more than happy to help where I can.
    • Let's discuss in a breakout session?!
    • Be the change you want to see in the (FLOSS) world!

    Thank you
    … and all contributors out there!

    Any questions? 🤷

    #debian-reproducible on irc.oftc.net
    #reproducible-builds on irc.oftc.net
    rb-general@lists.reproducible-builds.org
    Holger Levsen / h01ger / holger@reproducible-builds.org / holger@debian.org