Collaborative Working Sessions - Towards a snapshot service

binary archives:

  • Debian snapshot.debian.org - slow and unstable
  • Arch (daily snapshot)
  • Not Alpine
  • openSUSE (daily snapshot)

source archives:

OpenWrt needs source tarballs with a specific hash; others are mostly interested in the latest sources + older binaries.

use-cases:

  • verify latest binaries
  • track down supply-chain dependency problems

Arch: sends a month's worth to the Internet Archive, keeps an index

openSUSE: keeps archive of published x86_64 binaries (some unpublished build deps missing) in IPFS on two machines on a 16TB HDD

Software Heritage keeps sources - only git?

pristine-tar could help to track tarballs in git

Debian: Vagrant did some more work on capturing current deps

Need index by SHA-sum. snapshot.debian.org is fast in delivering by SHA-sum.

Packages list includes SHA-sum for all packages. buildinfo lists build dependencies only by name+version, not by SHA-sum, because dpkg-build does not have hashes.

Frederic had a copy of snapshot.debian.org, but ran into operational problems

metasnap FIXME

build-time from buildinfo file can tell what snapshot to use.
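As a rough illustration of this point, the sketch below reads the Build-Date field of a buildinfo file and turns it into a snapshot timestamp. The function names are hypothetical, and the URL layout (/archive/<name>/<timestamp>/) is an assumption based on how snapshot.debian.org is commonly addressed. Per the caveat about partially outdated build environments, the resulting snapshot is only a starting point.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def build_date_to_stamp(buildinfo_text):
    """Return the Build-Date field as a UTC stamp such as 20230601T103456Z."""
    for line in buildinfo_text.splitlines():
        if line.startswith("Build-Date:"):
            when = parsedate_to_datetime(line.split(":", 1)[1].strip())
            if when.tzinfo is None:
                # A "-0000" offset parses as naive; treat it as UTC.
                when = when.replace(tzinfo=timezone.utc)
            return when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    raise ValueError("no Build-Date field found")


def snapshot_url(stamp, archive="debian"):
    # Assumed URL layout; adjust if the service uses a different scheme.
    return f"https://snapshot.debian.org/archive/{archive}/{stamp}/"
```

A caller would pass the full text of a .buildinfo file to build_date_to_stamp and feed the result to snapshot_url.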

Need DB of name+version => SHA-sum
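A minimal sketch of such a mapping, assuming it is filled from Packages files (which carry a SHA256 field per package). The function name and the in-memory dict are illustrative only; a real service would need persistent storage and would ingest one Packages file per snapshot.

```python
def index_packages(packages_text):
    """Map (name, version, architecture) -> SHA256 from a Packages file."""
    index = {}
    # Stanzas in a Packages file are separated by blank lines.
    for stanza in packages_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines (leading whitespace) and stray lines.
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, value = line.split(":", 1)
            fields[key] = value.strip()
        if {"Package", "Version", "SHA256"} <= fields.keys():
            key = (fields["Package"], fields["Version"],
                   fields.get("Architecture", ""))
            index[key] = fields["SHA256"]
    return index
```

Architecture is part of the key here because the same name+version normally has a different binary, and so a different SHA-sum, per architecture.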

Debian build-env may be partially outdated at time of build. Makes it harder to find the right versions.

Is it possible to make snapshot.debian.org faster? It uses a FUSE filesystem and SHA1 internally, while Debian uses MD5+SHA256, so mapping needs effort. 100 TB archive; 80 GB per snapshot; 1M files. Need only a small subset that is used for builds. Also needed for reproducing images.
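One possible way to bridge the SHA1 versus MD5+SHA256 gap is to compute all three digests in a single pass over each file and store them side by side. This is a sketch only: the function name is hypothetical, and it says nothing about how snapshot.debian.org stores its hashes internally.

```python
import hashlib


def digests(path, chunk_size=1 << 20):
    """Return md5, sha1 and sha256 hex digests of a file, reading it once."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as fh:
        # Read in chunks so large packages do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Reading each file once matters at this scale, since the I/O cost over a 100 TB archive is likely to dominate the hashing itself.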

Need more, newer, faster servers? Possibly with distributed, indexed servers.

Need URL that gives a specific repo state at a time.

Fedora does not do snapshots, but has the koji API to fetch past name+version; not sure how long they are kept.

Qubes has few Debian packages; one repo with latest versions; another repo with all old versions; scales OK there.