How bootstrapping relates to reproducible builds and how to improve it
- https://bootstrappable.org
- bootstrappable builds
open questions:
- What is the relationship to reproducibility?
- Why should we care?
- We need to trust binaries used during build. If build binaries are not trustworthy, this makes build results less trustworthy.
- If we have an exe at the top and a lib it depends on, do we call the exe reproducible even if the lib is not reproducible?
- is trust binary / black&white? no…
overlap:
- build as much from source to need less trust in binaries, and to be able to read/review the source
- increase trust in the software we build
- trust is added with every way you can build a piece of software from source
- diff: reproducible builds: done when 100% of packages are building reproducibly
- bootstrappable: (C-part) done when one can take any C-compiler and compile the production C-compiler and with that get bit-identical binaries of anything
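The "bit-identical" criterion above can be checked mechanically. The following is a minimal sketch, assuming the same production compiler has already been built several times, each time starting from a different C compiler, and that the resulting binaries are available as files. The function names and the paths in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def bit_identical(paths):
    """Return True if all files in `paths` have the same SHA-256 digest.

    An empty list yields False, since there is nothing to compare.
    """
    digests = {sha256_of(p) for p in paths}
    return len(digests) == 1
```

A call such as `bit_identical(["out-from-compiler-a/cc", "out-from-compiler-b/cc"])` (hypothetical paths) would then report whether two bootstrap paths converged on the same binary.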
Requirement for doing bootstrappable builds:
- A “seed set” of bootstrap binaries has to be declared by a project (e.g. see f-droid below or a binary with a specific checksum), must not be implicit (e.g. previous version of myself)
- For build use a limited build environment containing only those binaries and nothing more.
- large complex source can still be hard to read, understand, verify => reduce the amount of trusted software
- backdoors in built source code are out of scope of bootstrappable builds
- Is it a software freedom problem? maybe, maybe not
- long chains of A->B->C may bitrot and break over time, give less trust than A->C ; but we have archives, and containers
- obsolete hardware needs to be emulated and the emulator becomes part of the binaries we need to trust.
- Have build scripts that fully specify the versions of compilers, etc. to use in a bootstrap chain.
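The first two requirements (an explicitly declared seed set, and a build environment containing nothing else) could be enforced with a check like the one below. This is only a sketch under assumptions: the seed set is declared as a mapping from relative file name to SHA-256 checksum, and the build environment is a plain directory. `check_seed_set` and its parameters are hypothetical names, not part of any existing tool.

```python
import hashlib
from pathlib import Path


def check_seed_set(env_dir, declared):
    """Compare a build environment directory against a declared seed set.

    `declared` maps relative file names to expected SHA-256 hex digests.
    Returns a list of problems; an empty list means the environment
    contains exactly the declared binaries and nothing more.
    """
    env = Path(env_dir)
    present = {
        p.relative_to(env).as_posix() for p in env.rglob("*") if p.is_file()
    }
    problems = []
    # Anything in the environment that was not declared is implicit trust.
    for name in sorted(present - set(declared)):
        problems.append(f"undeclared file: {name}")
    for name, expected in sorted(declared.items()):
        if name not in present:
            problems.append(f"missing seed: {name}")
        elif hashlib.sha256((env / name).read_bytes()).hexdigest() != expected:
            problems.append(f"checksum mismatch: {name}")
    return problems
```

Failing the build whenever the returned list is non-empty keeps the seed set explicit rather than implicit.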
Note: Trust is not transitive (unlike a=b=c implying a=c), so “the sister of a friend knows someone who verified this” carries less trust than “I verified this”. One possible reason: trusting someone very much translates to a factor of about 0.9, so every level of indirection loses some trust.
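The multiplicative reading of this note can be made concrete. This is a minimal sketch, assuming every level of indirection multiplies trust by the same factor. The 0.9 is the illustrative figure from the note, not a measured value, and `remaining_trust` is a hypothetical name.

```python
def remaining_trust(levels, factor=0.9):
    """Trust left after `levels` levels of indirection.

    Assumes each level multiplies trust by `factor` (0.9 in the note).
    Zero levels means "I verified this myself" and yields 1.0.
    """
    return factor ** levels
```

Under this assumption, three levels of indirection leave roughly 0.73 of the original trust (0.9 × 0.9 × 0.9 = 0.729).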
f-droid: using debian binaries as much as possible because they are built from source and thus more trustworthy.
guix: build archive with checksums of everything with 218MB bootstrap binaries
openSUSE: uses Ring-0
Goal: come up with very small set of auditable binaries+sources
- https://gitlab.com/janneke/mes is close
- https://savannah.nongnu.org/projects/stage0
Goal: need zero trust in the seed set of binaries - cannot be fully reached, but we can get to very small (maybe infinitesimal) values of trust needed.
How to distinguish trusted bootstrap binaries from other binaries?
tools/compilers that depend on themselves:
- gradle
- ghc(Haskell)
- rust
- maven
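Self-dependent tools like these could be found automatically from build-dependency data. The sketch below assumes the build dependencies are available as a mapping from tool name to the tools needed to build it. The function name and the toy graph in the usage note are hypothetical and do not describe the real dependencies of any tool listed above.

```python
def self_dependent(build_deps):
    """Return the sorted names of tools that need themselves to build.

    `build_deps` maps each tool to the tools required to build it.
    A tool counts as self-dependent if it appears among its own
    build dependencies, directly or transitively.
    """
    result = []
    for tool in build_deps:
        seen = set()
        stack = list(build_deps.get(tool, ()))
        # Depth-first walk over everything needed to build `tool`.
        while stack:
            dep = stack.pop()
            if dep == tool:
                result.append(tool)
                break
            if dep not in seen:
                seen.add(dep)
                stack.extend(build_deps.get(dep, ()))
    return sorted(result)
```

For a toy graph `{"a": ["a"], "b": ["c"], "c": ["b"], "d": ["e"], "e": []}`, the tools `a`, `b` and `c` are reported: `a` needs itself directly, while `b` and `c` need each other.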
identify important next steps:
- collect list of bootstrappable and non-bootstrappable tools
- convert notes to proper doc
- identify high-importance targets to achieve bootstrappability
- more to be discussed