JVM
The JVM ecosystem provides many languages (Java, Scala, Groovy, Kotlin, …) and build tools (Maven, Gradle, sbt, …).
The javac compiler generates reproducible bytecode (.class) output, as do most language-specific compilers. JVM packaging (in .jar files), however, is not reproducible-friendly, mainly because of the timestamps of the files stored in the archive. Each build tool therefore requires some work, mostly at the packaging step, to provide Reproducible Builds.
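To illustrate why timestamps and entry order matter at the packaging step, here is a minimal sketch that writes a ZIP archive (the container format behind .jar files) with a fixed timestamp and sorted entries. It is not taken from any build tool: the class name ReproducibleZipSketch, the fixed date and the sample file names are assumptions made for this example, and ZipEntry.setTimeLocal requires Java 9 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class ReproducibleZipSketch {
    // Arbitrary fixed timestamp, an assumption for this sketch; real build tools
    // usually derive it from configuration or from the last commit date.
    static final LocalDateTime FIXED_TIME = LocalDateTime.of(2020, 1, 1, 0, 0, 0);

    /** Writes the given files into an in-memory ZIP with sorted entries and one timestamp. */
    static byte[] write(Map<String, byte[]> files, LocalDateTime time) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // TreeMap iterates in sorted key order, whatever order the caller used.
            for (Map.Entry<String, byte[]> file : new TreeMap<>(files).entrySet()) {
                ZipEntry entry = new ZipEntry(file.getKey());
                // setTimeLocal stores the date as given, independent of the default time zone.
                entry.setTimeLocal(time);
                zip.putNextEntry(entry);
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new HashMap<>();
        files.put("META-INF/MANIFEST.MF",
                "Manifest-Version: 1.0\r\n".getBytes(StandardCharsets.UTF_8));
        files.put("example/Hello.class", new byte[] {1, 2, 3}); // placeholder content
        byte[] first = write(files, FIXED_TIME);
        byte[] second = write(files, FIXED_TIME);
        System.out.println("identical: " + Arrays.equals(first, second));
    }
}
```

The sketch only covers the two sources of variation named above. A real JAR has further constraints (for example, tools expect the manifest to come first), which is why each build tool ships its own packaging support.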
Reproducible Central
Whatever the build tool is, binary JVM artifacts are generally published in artifact repositories that use the Maven2 repository format (using groupId/artifactId/version coordinates) like Maven Central or Google’s Android Repository.
Reproducible Central is an effort to rebuild public releases published to Maven Central and check that a Reproducible Build can be achieved.
Contributions of additional .buildspec files are welcome: each one is used to rebuild a release and compare the result against the binaries available in Maven Central.
Configuring Build Tools for Reproducible Builds
Maven
Getting reproducible builds with Maven requires some plugin configuration: see Maven - Guide to Configuring for Reproducible Builds for more details.
Gradle
Gradle supports reproducible archives as of v3.4.
Tasks that generate archives, such as ZIPs or JARs, can be configured to discard file timestamps and to use a reproducible file order (the preserveFileTimestamps and reproducibleFileOrder archive task properties), which removes two of the main sources of non-determinism in JVM artifacts.
sbt
When using sbt, a build tool popular with Scala projects, you can use the sbt-reproducible-builds plugin to strip your artifacts and share buildinfo information.
.buildinfo file
The JVM .buildinfo file format was drafted in 2018, during the initial work on Reproducible Builds for the JVM. It tries to record full build information, from the source and environment used for the rebuild to the output result. Its intent was to make it easy to compare two builds run by independent people.
After three years of work on Reproducible Builds, it has proved more useful as an internal file format: Reproducible Central and its .buildspec format are closer to what is needed to check that reproducible results have been achieved. A .buildinfo file just records a build, be it reproducible or not.
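The original comparison use case can be sketched as follows. This is only an illustration, assuming the outputs.N.filename, outputs.N.length and outputs.N.checksums.sha512 keys of the 1.0-SNAPSHOT format kept in this section. The class name BuildinfoCompareSketch, its methods and the sample values are hypothetical and not part of any existing tool.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class BuildinfoCompareSketch {
    /** Parses buildinfo content, which uses the Java properties format. */
    static Properties parse(String text) {
        Properties info = new Properties();
        try {
            info.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return info;
    }

    /** Maps each recorded output file name to "length:sha512". */
    static Map<String, String> outputs(Properties info) {
        Map<String, String> result = new TreeMap<>();
        for (int i = 0; ; i++) {
            String prefix = "outputs." + i + ".";
            String name = info.getProperty(prefix + "filename");
            if (name == null) {
                break; // no more outputs.N entries
            }
            result.put(name, info.getProperty(prefix + "length") + ":"
                    + info.getProperty(prefix + "checksums.sha512"));
        }
        return result;
    }

    /** Returns the file names that are missing from one build or differ between the two. */
    static List<String> mismatches(Properties reference, Properties rebuild) {
        Map<String, String> expected = outputs(reference);
        Map<String, String> actual = outputs(rebuild);
        Set<String> names = new TreeSet<>(expected.keySet());
        names.addAll(actual.keySet());
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!Objects.equals(expected.get(name), actual.get(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened, made-up checksums for illustration only.
        Properties a = parse("outputs.0.filename=demo-1.0.jar\n"
                + "outputs.0.length=10\noutputs.0.checksums.sha512=aaaa\n");
        Properties b = parse("outputs.0.filename=demo-1.0.jar\n"
                + "outputs.0.length=10\noutputs.0.checksums.sha512=bbbb\n");
        System.out.println("differing outputs: " + mismatches(a, b));
    }
}
```

Matching outputs by file name rather than by index is a design choice of this sketch, so that two builds listing the same files in a different order still compare equal.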
Buildinfo file format version 1.0-SNAPSHOT is kept here for reference on past work. It uses the Java properties format:
#### Work In Progress ####
buildinfo.version=1.0-SNAPSHOT
name=<name of the artifact>
group-id=<groupId coordinates in repository>
artifact-id=<artifactId coordinates in repository>
version=<version coordinates in repository>
# source information for rebuilders, as source tarball artifact in repository and/or url and/or scm coordinates
source.artifact=<groupId>:<artifactId>:<version>:<classifier>:<extension>
source.url=<url where to download official source tarball>
source.scm.uri=<source control uri, typically corresponding to the project.scm.developerConnection or project.scm.connection in the pom.xml>
source.scm.tag=<source control tag as in pom.xml>
# build instructions
build-tool=<mvn|sbt|...>
build.setup=<optional url of documentation explaining specific additional setup when necessary: will be enhanced in a future buildinfo format version>
# effective recorded build environment information
java.version=<Java version taken from "java.version" system property>
java.vendor=<Java vendor taken from "java.vendor" system property>
os.name=<Operating System name taken from "os.name" system property>
source.used=<artifact|url|scm, depending on which has been used for the build>
# Each build tool or plugin is free to add additional entries to the buildinfo,
# both for build instructions and effective recorded build environment.
# For example, the sbt plugin may add the following for Scala:
sbt.version=1.2.3
scala.version=2.12.6
# and Maven could add data on rebuild instructions and effective recorded environment:
mvn.rebuild-args=-Dmaven.test.skip package
mvn.build-root=<groupId>:<artifactId>:<version>
mvn.version=Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
mvn.minimum.version=<minimum Maven version to rebuild if known>
# A buildinfo file can contain checksums for multiple output files, for
# example for the main jar and the accompanying pom.xml (when generated):
outputs.0.filename=<file name in the repository, ${artifactId}-${version}[-${classifier}].${extension}>
outputs.0.length=<file size>
outputs.0.checksums.sha512=<sha512 lowercase>
outputs.1.filename=<file name in the repository>
outputs.1.length=<file size>
outputs.1.checksums.sha512=<sha512 lowercase>
...
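As an illustration of how a build tool or plugin might fill in the recorded environment and output entries listed above, here is a minimal sketch. The class name BuildinfoRecordSketch, its methods and the sample file name are assumptions made for this example. Only the property keys come from the format itself.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

final class BuildinfoRecordSketch {
    /** Returns the SHA-512 digest of the content as lowercase hexadecimal. */
    static String sha512Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-512").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available on this JVM", e);
        }
    }

    /** Records the environment entries that are taken from system properties. */
    static Map<String, String> environment() {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String key : new String[] {"java.version", "java.vendor", "os.name"}) {
            entries.put(key, System.getProperty(key));
        }
        return entries;
    }

    /** Builds the outputs.N.* entries for one output file. */
    static Map<String, String> output(int index, String filename, byte[] content) {
        String prefix = "outputs." + index + ".";
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put(prefix + "filename", filename);
        entries.put(prefix + "length", String.valueOf(content.length));
        entries.put(prefix + "checksums.sha512", sha512Hex(content));
        return entries;
    }

    public static void main(String[] args) {
        Map<String, String> entries = new LinkedHashMap<>(environment());
        // Placeholder bytes stand in for a real artifact read from disk.
        entries.putAll(output(0, "demo-1.0.jar", new byte[] {1, 2, 3}));
        // Naive key=value printing; values are not escaped as java.util.Properties would do.
        entries.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```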
Note that the ${artifactId}-${version}-sources.jar files published in Maven repositories are not buildable sources, but sources for IDEs.
Source tarballs, intended for building, are only sometimes published in repositories, with two classical naming conventions:
- ${artifactId}-${version}-source-release.zip (see artifacts in Central providing such source tarballs)
- ${artifactId}-${version}-src.zip (see artifacts in Central providing such source tarballs)