Collaborative Working Sessions - Images, filesystems and containers
https://reproducible-builds.org/docs/system-images/
Filesystems
- ext4 reproducibility: mkfs.ext4 is not reproducible (because the allocation of inodes is undefined); make_ext4fs works, but is unmaintained
- ext4 creation time ends up in headers
- UUIDs need to be seeded
- there are patches on the rb ML; together with setting up the env, they allow making ext4 reproducible (with mkfs.ext4?)
- read-only filesystems (squashfs, erofs)
- btrfs?
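The "UUIDs need to be seeded" and "creation time ends up in headers" points can be sketched as follows. This is only an illustration under assumptions: the namespace URL, the seed and the helper names are hypothetical, and whether mke2fs honours SOURCE_DATE_EPOCH depends on the e2fsprogs version. It does not address the inode allocation problem. The code only builds the command line and environment; it does not run mke2fs.

```python
import uuid

# Hypothetical fixed namespace for this sketch; any constant UUID works
# as long as it never changes between builds.
IMAGE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/image-build")


def seeded_uuid(seed: str, purpose: str) -> str:
    """Derive a stable UUID from a build seed and a purpose label."""
    return str(uuid.uuid5(IMAGE_NAMESPACE, f"{seed}/{purpose}"))


def mke2fs_invocation(image: str, rootdir: str, size: str, seed: str, epoch: int):
    """Return (argv, extra_env) for a mke2fs run with seeded UUIDs.

    -U sets the filesystem UUID, -E hash_seed= sets the directory hash seed,
    -d populates the filesystem from a directory.
    """
    argv = [
        "mke2fs", "-q", "-t", "ext4",
        "-U", seeded_uuid(seed, "fs-uuid"),
        "-E", "hash_seed=" + seeded_uuid(seed, "hash-seed"),
        "-d", rootdir,
        image, size,
    ]
    # Assumption: the installed e2fsprogs reads SOURCE_DATE_EPOCH.
    extra_env = {"SOURCE_DATE_EPOCH": str(epoch)}
    return argv, extra_env
```

The returned environment would be merged into the caller's environment before running the command, e.g. with `subprocess.run(argv, env={**os.environ, **extra_env})`.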
How to reproduce a full image
- need a snapshot service (containing package versions)
- need to record sufficient information: every single package (in the correct version), config, versions of the tools used; generate a manifest or read it from the original images
- order of packages in the dpkg database: apparently there is a flag to tell apt to (re)order
- same kernel
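A manifest as mentioned above could look roughly like this. It is a minimal sketch, not an agreed format: the field names, the snapshot identifier and both function names are assumptions made for illustration. The point is only that the serialisation is independent of the order in which packages were recorded.

```python
import hashlib
import json


def build_manifest(packages: dict, tools: dict, snapshot: str) -> str:
    """Serialise package and tool versions in a stable order.

    packages, tools: mapping of name -> exact version string
    snapshot: identifier of the archive snapshot used (hypothetical field)
    """
    manifest = {
        "snapshot": snapshot,
        "packages": [{"name": n, "version": packages[n]} for n in sorted(packages)],
        "tools": [{"name": n, "version": tools[n]} for n in sorted(tools)],
    }
    # sort_keys and a fixed indent keep the text byte-identical across runs.
    return json.dumps(manifest, sort_keys=True, indent=2) + "\n"


def manifest_digest(text: str) -> str:
    """SHA-256 of the manifest text, usable as a short identifier for the input set."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Two builds that record the same packages and tools in a different order then produce the same manifest text and the same digest.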
Random problems/ideas
- Upgrading a single package on a given image (using a ro FS) can scramble the image quite a bit (probably timestamp issues?)
- initrd (timestamps or ordering issues); dracut: more likely to work with SDE (SOURCE_DATE_EPOCH); mkinitcpio/mkinitramfs: ?
- website: mention “magic” variables
- package installation needs to be reproducible
- exim4 postinst puts hostnames into some config
- Packages.xz get cached (and rebuilt on Debian)
- /etc/apt/sources.list would be different when using a snapshot service
- /etc/passwd /etc/shadow order
- dependency on the host kernel through /proc, /dev, FS code, (FS-related) kernel config options; may need to build images in a VM with a fixed kernel?!
- mkfs.* can introduce dependency on the host system
- pycache differences (*.pyc files): Debian does not ship bytecode; other distros do, and stripping them out would slow things down
- Priority: important/optional?! This actually comes from the source package (so no idea how/why this could change)
- diffoscope can be told to exclude timestamps
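For the /etc/passwd ordering issue above, one possible normalisation is to sort entries after package installation. This is a sketch of one idea, not something the session settled on: the function name is hypothetical, and it assumes every non-empty line is a well-formed passwd entry with a numeric UID in the third field. /etc/shadow has no UID field and would have to follow the resulting name order.

```python
def normalize_passwd(text: str) -> str:
    """Return passwd content sorted by (UID, login name)."""
    entries = [line for line in text.splitlines() if line.strip()]

    def sort_key(line: str):
        fields = line.split(":")
        # fields[0] is the login name, fields[2] the numeric UID
        return (int(fields[2]), fields[0])

    entries.sort(key=sort_key)
    return "\n".join(entries) + "\n"
```

Applied to two images whose accounts were created in a different order, the function yields the same file content as long as the accounts themselves (names, UIDs, shells and so on) are identical.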
Container
container images are just tarballs (something something OCI image) (note: we didn’t talk about container images too much)
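Since the notes treat container images as tarballs, a deterministic tarball is a natural building block. The following is a minimal sketch using Python's standard tarfile module; the function name and the choice to zero owner and mtime are assumptions, not a description of any particular image builder. File modes are taken from the filesystem as-is, so the umask still has to be controlled separately, and the result is left uncompressed because compressors can add their own timestamps.

```python
import io
import os
import tarfile


def deterministic_tar(root: str, mtime: int = 0) -> bytes:
    """Pack the tree under root with sorted members and fixed metadata."""

    def clean(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Drop everything that depends on the build host or build time.
        info.mtime = mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    # Collect every directory and file below root.
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))

    buf = io.BytesIO()
    # USTAR avoids PAX extended headers; it limits name lengths, fine for a sketch.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for path in sorted(paths):
            # recursive=False: each entry is added exactly once, in sorted order.
            tar.add(path, arcname=os.path.relpath(path, root),
                    recursive=False, filter=clean)
    return buf.getvalue()
```

Two directory trees with the same content but different creation order and file timestamps should then produce byte-identical archives, which can be checked by comparing the returned bytes or their hashes.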