Reproducible builds status update


Chris Lamb (lamby)
Holger Levsen (h01ger)

The problem

  • Can inspect the source code of free software for flaws
  • But distributions provide binary/compiled packages

Can we trust this process?

  • To get users, go after the developers
  • Financial incentives to crack developer machines / build infrastructure
  • CVE-2002-0083: Remote root exploit in OpenSSH (single bit difference in binary)
  • Kernel module modifying source code when "viewed" by GCC only (see media.ccc.de)
  • Compromised Apple iOS SDK, Xcodeghost, etc.

The motivation behind "reproducible" builds is to allow verification that no flaws have been introduced during the compilation process.

The solution

  1. Ensure compilation always produces identical results
  2. Multiple parties compare compilation results
  3. Attacker needs to infect everybody simultaneously (or they are detected)
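
As a minimal sketch of step 2, independent builders could publish a checksum of their result and anyone could compare them. The builder names, the artifact bytes and the helper `find_dissenters` are hypothetical and only illustrate the idea. Majority voting is an assumption made for this example; in practice any disagreement at all deserves investigation.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def sha256_hex(artifact: bytes) -> str:
    """Checksum that each party would publish for its build result."""
    return hashlib.sha256(artifact).hexdigest()


def find_dissenters(results: Dict[str, bytes]) -> List[str]:
    """Return the builders whose artifact differs from the most common one."""
    digests = {name: sha256_hex(data) for name, data in results.items()}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority_digest)


# Hypothetical results from three independent parties.
results = {
    "builder-a": b"\x7fELF...identical bytes",
    "builder-b": b"\x7fELF...identical bytes",
    "builder-c": b"\x7fELF...tampered bytes",
}
dissenters = find_dissenters(results)
print(dissenters)
```

An attacker who compromises only one builder shows up as a dissenter, which is why they would need to infect everybody simultaneously.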

Challenges

    • Timestamps
    • Timezones & locales
    • Non-deterministic file ordering
    • Dictionary/hash key ordering
    • Users, groups, umask, environment variables
    • Build paths
    • Specifying the environment
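
Several of these challenges can be illustrated with a small archive example. The sketch below assumes the build output is a plain tar archive and uses a hypothetical helper, `deterministic_tar`. It fixes timestamps from SOURCE_DATE_EPOCH, sorts the file order, and normalises owner and mode. It is an illustration of the approach, not the method used by any particular tool.

```python
import io
import os
import tarfile
from typing import Dict


def deterministic_tar(files: Dict[str, bytes], mtime: int) -> bytes:
    """Pack files into a tar archive whose bytes do not depend on the builder."""
    buf = io.BytesIO()
    # Uncompressed tar on purpose: a gzip wrapper would add its own timestamp.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):           # fixed file ordering
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime               # fixed timestamp
            info.uid = info.gid = 0          # no builder uid/gid
            info.uname = info.gname = "root"
            info.mode = 0o644                # independent of the umask
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Fall back to 0 when SOURCE_DATE_EPOCH is not set (an assumption for this demo).
epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
first = deterministic_tar({"b.txt": b"two", "a.txt": b"one"}, epoch)
second = deterministic_tar({"a.txt": b"one", "b.txt": b"two"}, epoch)
print(first == second)
```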

Technical advantages

    • Faster to build; saves time, money & the environment
    • Easier to test changes/revisions
    • Exposes unsafe build behaviour (eg. internet access)
    • Exposes unreliable / non-deterministic behaviours (eg. timing)
    • Finds bugs in uncommon timezones or locales
    • Detect corrupted build environments
    • Find future build failures (eg. expired certificates)

Reproducible builds in Debian

Continuously build every package twice, varying:

    • Time & date
    • Hostname & domain name
    • Filesystem (disorderfs)
    • Timezone & locale
    • uid & gid
    • GECOS information, the shell & a bunch of environment variables
    • Kernel & CPU type
    • and more…
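
The "build twice, vary the environment, compare" idea can be sketched with toy build functions. Everything here is hypothetical (the two environments, `leaky_build`, `clean_build`, `is_reproducible`). A real setup runs the actual package build under each environment instead of calling a Python function.

```python
import hashlib
from typing import Callable, Dict

# Two deliberately different environments, loosely following the list above.
ENV_A = {"HOSTNAME": "host1", "TZ": "UTC", "LANG": "C", "NOW": "1000000000"}
ENV_B = {"HOSTNAME": "host2", "TZ": "Asia/Tokyo", "LANG": "fr_CH.UTF-8",
         "NOW": "1100000000"}

Build = Callable[[Dict[str, str]], bytes]


def leaky_build(env: Dict[str, str]) -> bytes:
    """Toy build that embeds the build time and hostname in its output."""
    banner = "built at {NOW} on {HOSTNAME}\n".format(**env)
    return banner.encode() + b"payload"


def clean_build(env: Dict[str, str]) -> bytes:
    """Toy build whose output ignores the environment."""
    return b"payload"


def is_reproducible(build: Build) -> bool:
    """Build once per environment and compare checksums of the results."""
    digest_a = hashlib.sha256(build(ENV_A)).hexdigest()
    digest_b = hashlib.sha256(build(ENV_B)).hexdigest()
    return digest_a == digest_b


print(is_reproducible(leaky_build), is_reproducible(clean_build))
```

The more dimensions the two environments differ in, the more sources of unreproducibility a single comparison can surface.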
 
 
 

Toolchain

  • Previously needed to use custom packages
  • Not needed since dpkg 1.18.11

Build paths

  • Many builds embed the path they are built from
  • Our test setup previously used the same build path for both builds
  • In unstable, we now vary the build path → lots of unreproducibility
  • Work on general solutions ongoing (eg. SOURCE_PREFIX_MAP for GCC)
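
The general idea behind a prefix map can be sketched as follows: before a path is embedded in the output, the build-specific prefix is rewritten to a fixed value. The helper `map_build_path` and the dictionary format are hypothetical. This does not reproduce the syntax or semantics of the SOURCE_PREFIX_MAP proposal.

```python
from typing import Dict


def map_build_path(path: str, prefix_map: Dict[str, str]) -> str:
    """Rewrite a build-specific path prefix to a fixed replacement."""
    # Try the longest prefixes first so the most specific mapping wins.
    for old in sorted(prefix_map, key=len, reverse=True):
        base = old.rstrip("/")
        if path == base or path.startswith(base + "/"):
            return prefix_map[old] + path[len(base):]
    return path  # no mapping applies; the path is embedded unchanged


# Two builds of the same source in different directories.
first = map_build_path("/build/1st/hello-1.0/src/main.c",
                       {"/build/1st/hello-1.0": "."})
second = map_build_path("/build/2nd/hello-1.0/src/main.c",
                        {"/build/2nd/hello-1.0": "."})
print(first, second)
```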

.buildinfo files (1/2)

  • Definition of the environment used during a build
  • Specifies:
    • Inputs: .dsc, Build-Depends, build path, etc.
    • Outputs: .deb checksums, etc.
  • dak needs to accept and forward them
  • Will be signed by autobuilders, third-parties, etc.
  • Experimental server at buildinfo.debian.net
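
To make the inputs/outputs split concrete, here is a simplified sketch of how such a record could be generated. The helper `make_buildinfo`, the package names and the field names are illustrative only, loosely modelled on Debian control-file conventions. They should not be read as the actual .buildinfo specification.

```python
import hashlib
from typing import Dict


def make_buildinfo(source: str, version: str, build_path: str,
                   build_depends: Dict[str, str],
                   artifacts: Dict[str, bytes]) -> str:
    """Render a buildinfo-like record: inputs first, then output checksums."""
    lines = [
        "Source: " + source,
        "Version: " + version,
        "Build-Path: " + build_path,
        "Installed-Build-Depends:",
    ]
    # Sort everything so the record itself is reproducible.
    for name in sorted(build_depends):
        lines.append(" {} (= {})".format(name, build_depends[name]))
    lines.append("Checksums-Sha256:")
    for filename in sorted(artifacts):
        data = artifacts[filename]
        digest = hashlib.sha256(data).hexdigest()
        lines.append(" {} {} {}".format(digest, len(data), filename))
    return "\n".join(lines) + "\n"


# Hypothetical package and dependency versions.
record = make_buildinfo(
    "hello", "1.0-1", "/build/hello-1.0",
    {"gcc-6": "6.2.0-1", "dpkg-dev": "1.18.11"},
    {"hello_1.0-1_amd64.deb": b"fake .deb contents"},
)
print(record)
```

Anyone who recreates the listed inputs and obtains the same checksums has independently confirmed the build, and a signature over the record states who made that claim.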

.buildinfo files (2/2)

  • Outstanding questions:
    • Filename scheme
    • Time
    • Environment variables
  • How to host 100,000s of files on ftp-master.d.o

2016 summit meeting

  • Three-day workshop in Berlin, Germany
  • Follow-up to Athens 2015 event


reproducible-builds.org/events/berlin2016/

Beyond Debian…

  • coreboot, Fedora, LEDE, OpenWrt, NetBSD, FreeBSD, Arch, Qubes, F-Droid, NixOS, Guix, etc.
  • Other projects now using "our" testing framework, SOURCE_DATE_EPOCH, .buildinfo file concept
  • Reproducible Builds summits (Athens, Berlin)
  • Some challenges moving from debian- prefixes, mailing lists, etc.
  • Generic tools
 

Future work

  • dak (.buildinfo file support)
  • How to make it meaningful for end-users
  • Source code still vulnerable

Getting involved




Questions?



lamby@debian.org C2FE 4BD2 71C1 39B8 6C53 3E46 1E95 3E27 D431 1E58
holger@debian.org B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C

reproducible-builds.org
#reproducible-builds on irc.oftc.net