User policies

NOTES #1

User Policy

The aim of the discussion was to work out the user policy we could put in front of an average user, though we did have diversions into what a power user would want.

We realise that a power user may well want more complex tools - but if users need to make complex decisions about suitable values for multiple factors, they are likely to decide it is too difficult and not bother.

It is assumed that the user has made the decision that they only want reproducible builds. We assumed the machinery needed would be in its own package which the user could install. We refer to this as the reproducible-only package - though the name is not determined. Perhaps in some future version this would be installed by default.

We assumed that for each distribution there would be a number of separate, independent re-builders. A re-builder would attempt to build new releases and determine whether they were truly reproducible. A re-builder would need to have a daemon that noticed new releases, built them and published signed buildinfos.

We assumed that by default each distribution would publish a list of known re-builders. Builders will be identified by their public key. This list should be in the reproducible-only package.

The average user can simply accept this package. The power user can check identities against other published sources and may add a list of other known re-builders. The power user may choose to be their own re-builder.

Of these N known re-builders we then set a threshold K (K is used in preference to M, as M and N can be hard to distinguish). K should not be set equal to N, as then the loss of a single re-builder leads to a denial of service. The default option is probably N/2.

A package with a given hash is deemed to be reproducible if at least K of the N re-builders build it with the same version and inputs and generate that hash. If K is greater than N/2 then it is impossible for there to be two different installable binaries for a single set of inputs: provided each re-builder counts towards at most one hash, two binaries would need 2K re-builders between them, which is more than N.
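The majority argument can be checked with a little arithmetic. The sketch below assumes each re-builder counts towards at most one hash for a given package, version and inputs; the function name is ours, for illustration only.

```python
def two_binaries_can_both_pass(n: int, k: int) -> bool:
    """Return True if two different hashes could each collect k of n re-builders.

    Assumes every re-builder counts towards at most one hash, so the two
    supporting groups are disjoint and need 2 * k re-builders in total.
    """
    if not 1 <= k <= n:
        raise ValueError("threshold k must satisfy 1 <= k <= n")
    return 2 * k <= n


# With 6 re-builders, K = 3 (exactly N/2) still allows a 3-versus-3 split,
# while K = 4 (more than N/2) does not.
print(two_binaries_can_both_pass(6, 3))  # True
print(two_binaries_can_both_pass(6, 4))  # False
```

Note that a threshold of exactly N/2 with an even N sits on the boundary: it only rules out two installable binaries if the check requires strictly more than N/2 matching re-builders.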

When a re-builder notices a new version of the source, it will attempt to rebuild it with the current version of the tool chain. If the hash it generates matches the hash generated by the maintainer - all is good. If the hash differs, the re-builder checks whether it is building with the same environment. The re-builder is assumed to be building with the latest tools - so if the distribution was built with more recent tools something has gone wrong. If the original build used older tools, the re-builder will try to build it with that tool chain.

If the hashes now match - we have found a dependency on the tool chain. This could be due to a vulnerability in the old version, or just to an improvement introduced in the new one. The standard tools to cause an update should be triggered (build-mnu on Debian).
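The re-builder's decision steps can be sketched roughly as follows. This is only an illustration: the tool chain is modelled as a comparable version tuple, build is a hypothetical callable that returns the hash of a rebuild, and the verdict strings are ours. The case where the environment is identical but the hash still differs is not spelled out in the notes; treating it as "not reproduced" is our assumption.

```python
from typing import Callable, Tuple

Toolchain = Tuple[int, ...]  # hypothetical: a comparable version, e.g. (12, 2)


def classify_rebuild(
    maintainer_hash: str,
    maintainer_toolchain: Toolchain,
    current_toolchain: Toolchain,
    build: Callable[[Toolchain], str],
) -> str:
    """Follow the re-builder's decision steps and return a short verdict."""
    if build(current_toolchain) == maintainer_hash:
        return "reproduced"
    if maintainer_toolchain > current_toolchain:
        # The re-builder is assumed to have the latest tools.
        return "error: maintainer used newer tools"
    if maintainer_toolchain == current_toolchain:
        # Same environment, different hash (this outcome is our assumption).
        return "not reproduced"
    # Retry with the older tool chain the maintainer used.
    if build(maintainer_toolchain) == maintainer_hash:
        return "tool chain dependency"
    return "not reproduced"


def fake_build(toolchain: Toolchain) -> str:
    # Stand-in for a real build: the hash changes with the tool chain.
    return "hash-old" if toolchain < (13, 0) else "hash-new"


print(classify_rebuild("hash-old", (12, 2), (13, 1), fake_build))
# tool chain dependency
```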

The re-builder is permitted and encouraged to publish both signed buildinfos.

If the re-builder builds the same package twice with the same tool chain and the hashes do not match, it should publish both signed buildinfos. This indicates that the build is not reproducible. The second version acts as the revocation of the first. The re-builder should inform the maintainer that there is a failure.

When the user wants to install a binary package:
For the purpose of this discussion a package is the smallest installable unit.
A single collection of source may produce different binary packages.
The user wants to know that the one they are installing is reproducible.
They do not need to worry about the reproducibility of other outputs.
For example it may be that the standard version is reproducible but the debug version isn’t.
Unless the user needs debug - this should not be an issue for them.
The installer downloads the binary package and checks the hash.
It then looks for all published signed buildinfos.
Find those for this package and version.
Set the count to zero.
For each builder:
  If the builder is not in the known builder list
  - Ignore them
  - This means that an attacker cannot influence the process by publishing their own signed buildinfos, as they will be ignored
  If there is a single signed buildinfo and it matches this hash
  - Increment the count
  If there are multiple signed buildinfos
  - Ignore buildinfos that do not match the one specified for the package to be installed
    - If there is only one left, increment the count
    - If there are still multiple buildinfos, ignore the builder
If the count reaches the threshold (at least K known builders match), the build is deemed reproducible and the package should be installed.

It is possible that the user will install a version that builds to a different version with a newer tool chain. This should cause the package to be updated, and the new version will be installed once it is released. This is equivalent to installing a package with a known CVE. The installer may display a warning.

If the count does not reach the threshold - warn the user that the build is not reproducible and do not install it.
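The counting procedure can be sketched as below. This is a minimal illustration under several assumptions of ours: the BuildInfo fields and function names are hypothetical, signatures are taken to be verified already, "the one specified for the package to be installed" is read as matching the build inputs, a single remaining buildinfo must also match the hash to count, and the threshold test follows the "at least K" wording used for the definition of a reproducible package.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildInfo:
    builder_key: str  # public key identifying the re-builder
    package: str
    version: str
    inputs: str       # hypothetical identifier for the build inputs
    output_hash: str


def count_matching_builders(buildinfos, known_builders, package, version,
                            inputs, binary_hash):
    """Count known builders whose buildinfo vouches for binary_hash."""
    per_builder = defaultdict(set)  # a set collapses exact duplicates
    for info in buildinfos:
        if (info.package, info.version) != (package, version):
            continue
        if info.builder_key not in known_builders:
            continue  # unknown builders cannot influence the count
        per_builder[info.builder_key].add(info)

    count = 0
    for infos in per_builder.values():
        candidates = list(infos)
        if len(candidates) > 1:
            # Keep only buildinfos for the inputs of the package on offer.
            candidates = [i for i in candidates if i.inputs == inputs]
        # Still several left: the builder contradicts itself and is ignored.
        if len(candidates) == 1 and candidates[0].output_hash == binary_hash:
            count += 1
    return count


def may_install(count, k):
    return count >= k


known = {"A", "B", "C", "D"}
published = [
    BuildInfo("A", "hello", "1.0", "tc-2", "h1"),
    BuildInfo("B", "hello", "1.0", "tc-2", "h1"),
    BuildInfo("C", "hello", "1.0", "tc-1", "h0"),  # older tool chain
    BuildInfo("C", "hello", "1.0", "tc-2", "h1"),
    BuildInfo("D", "hello", "1.0", "tc-2", "h1"),
    BuildInfo("D", "hello", "1.0", "tc-2", "h2"),  # D contradicts itself
    BuildInfo("X", "hello", "1.0", "tc-2", "h1"),  # not a known builder
]
count = count_matching_builders(published, known, "hello", "1.0", "tc-2", "h1")
print(count, may_install(count, 3))  # 3 True
```

Here A, B and C count, D is ignored because its two results for the same inputs disagree, and X is ignored because it is not in the known builder list.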

Note that for most users this process should be transparent. Although there is an extra check, it should be straightforward. It does not need to generate any extra log messages.

We expect it to only generate log messages if there is a failure - which should be rare.

Value of K

In the discussion we assumed the default would be N/2. However, a lax implementation may decide that a low number, e.g. 2 or 3, is sufficient to demonstrate that the build is reproducible. The value depends on how paranoid you are that the builders are not independent.

Rebuilders will take time to issue their signed buildinfos. The higher the number, the longer the wait before sufficient signatures have been issued.
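The trade-off between K and waiting time can be illustrated with a small sketch. It assumes all re-builders eventually publish a matching result and that each one's delay is known; the delay figures are made up for the example.

```python
def wait_for_threshold(publish_delays_hours, k):
    """Return the time until k re-builders have published their buildinfos."""
    if not 1 <= k <= len(publish_delays_hours):
        raise ValueError("k must be between 1 and the number of re-builders")
    # The k-th fastest re-builder determines when the threshold is met.
    return sorted(publish_delays_hours)[k - 1]


delays = [2, 3, 5, 12, 48]  # hours after release, hypothetical values
for k in (2, 3, 5):
    print(k, wait_for_threshold(delays, k))
# 2 3
# 3 5
# 5 48
```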

NOTES #2

Simplest possible thing: the distribution only publishes packages that meet its own internal definition of reproducibility (e.g., it builds the same way on three different build servers).

Baseline more nuanced policy: the user designates builders they’re willing to trust (initially seeded from the distro), numbered N, and requires that K of those N builders achieve the same result before they’re willing to install the package.

Key inputs considered from the builder’s buildinfo: the pair of binary package+version, and the inputs used from it

Complications:

What happens if a builder can’t reproduce their own result? Should probably be equivalent to revocation.
Do builders have to build with exactly the library versions used in the published package? For current Debian infrastructure this might require downgrading security updates.

Questions:

Is there a privileged buildinfo, i.e., one that other builders are expected to match? Is this the distro’s own buildinfo?
If builders see a situation where most users would run the package in a configuration that’s different than what the archive built (e.g., because of a library security upgrade), should that be sent as a signal back upstream to trigger a rebuild of the package?
Can power users build the package themselves and use that to help satisfy K, or override completely?
How should we communicate that there are dissenting results when K agree?
What happens if a package stops meeting policy after it has been installed previously?
How many builders can we provide per architecture?

User policies Post-It notes