The locale of the build system might affect the build products. While it is important that developers have access to error messages in the language of their choice, tools whose output is influenced by the current locale can make the locale a source of reproducibility issues.
Locales have many aspects (see the GNU libc locale(1) manpage). The ones that follow are the most important to consider in the context of reproducible builds.
Several common time formatting functions produce output that depends on the current locale. On a POSIX system the formatting depends on the LC_TIME environment variable, which can be overridden by LC_ALL.
For build systems, it is thus best to set LC_ALL directly.
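As a rough illustration of pinning the locale for a single build step, the following Python sketch runs a command with LC_ALL forced to C. The helper names `pinned_env` and `run_pinned` are hypothetical and not part of any build tool.

```python
import os
import subprocess
import sys


def pinned_env(base=None):
    """Return a copy of the environment with the locale pinned to C."""
    env = dict(os.environ if base is None else base)
    # LC_ALL takes precedence over LANG and over every LC_* category,
    # so setting it is enough to neutralise the caller's locale.
    env["LC_ALL"] = "C"
    return env


def run_pinned(cmd):
    """Run cmd with the pinned environment and return its standard output."""
    result = subprocess.run(
        cmd, env=pinned_env(), check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # The child process reports the value of LC_ALL that it actually sees.
    child = [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
    print(run_pinned(child), end="")
```

A call such as `run_pinned(["date", "-u"])` would then use the C locale's day and month names regardless of the developer's own settings.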
The system timezone and the TZ environment variable will also affect the output of time formatting functions.
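To make the effect of TZ concrete, here is a small Python sketch that formats the same instant under two different TZ values. It assumes a Unix system (where `time.tzset` exists); the function name `format_local` is hypothetical. The POSIX-style values `UTC0` and `EST5` are used so that no timezone database is needed.

```python
import os
import time


def format_local(epoch_seconds, tz):
    """Format a timestamp as local time under the given TZ value (Unix only)."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()  # re-read TZ; not available on Windows
    try:
        return time.strftime("%Y-%m-%d %H:%M", time.localtime(epoch_seconds))
    finally:
        # Restore the previous timezone setting.
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


MIDNIGHT_UTC = 1445385600  # 2015-10-21 00:00:00 UTC

print(format_local(MIDNIGHT_UTC, "UTC0"))  # 2015-10-21 00:00
print(format_local(MIDNIGHT_UTC, "EST5"))  # 2015-10-20 19:00
```

The same instant lands on two different calendar days, which is why a build that embeds local time needs a fixed timezone as well as a fixed locale.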
Common sorting functions are affected by the LC_COLLATE environment variable, which can be overridden by LC_ALL. Some locales can be quite surprising.
This typically shows when sorting lines of text. The fr_FR locale will sort independently of the character case, whereas the C locale will sort according to the byte values and is always available.
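The difference between byte-value ordering and locale-aware ordering can be sketched in Python as follows. The function names are hypothetical. Only the C locale is exercised here because it is the only one guaranteed to be installed; passing a name such as `fr_FR.UTF-8` is expected to give a case-insensitive order on systems that provide that locale, but this is not verified below.

```python
import locale


def sort_bytewise(items):
    """Sort strings by their UTF-8 byte values, as the C locale does."""
    return sorted(items, key=lambda s: s.encode("utf-8"))


def sort_with_locale(items, name):
    """Sort strings using the collation rules of the named locale."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        # Raises locale.Error if the locale is not installed on this system.
        locale.setlocale(locale.LC_COLLATE, name)
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


words = ["b", "A", "a", "B"]
print(sort_bytewise(words))          # ['A', 'B', 'a', 'b']
print(sort_with_locale(words, "C"))  # ['A', 'B', 'a', 'b']
```

Sorting on an explicit byte key keeps file lists and indexes in the same order on every build machine, whatever LC_COLLATE happens to be.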
Default character encoding
The default system character encoding will affect both the input and output of many tools. It is defined using the LC_CTYPE environment variable, and can also be overridden using LC_ALL.
One example is using lynx to convert HTML documentation into text.
On systems that provide it, the C.UTF-8 pseudo-locale can be used to get the default strings with UTF-8 output.
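When the build step is itself a script, another option is to name the encoding explicitly instead of relying on the environment. The Python sketch below shows how the same text yields different bytes under two encodings, and writes a file with a fixed encoding. The function name `write_text` and the file name `doc.txt` are hypothetical.

```python
import os
import tempfile


def write_text(path, text):
    """Write text with a fixed encoding and fixed line endings."""
    # Without encoding=..., open() may fall back to an encoding
    # derived from the current locale.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)


word = "caf\u00e9"
print(word.encode("utf-8"))       # b'caf\xc3\xa9'
print(word.encode("iso-8859-1"))  # b'caf\xe9'

# Write the word to a scratch file and read back the raw bytes.
path = os.path.join(tempfile.mkdtemp(), "doc.txt")
write_text(path, word + "\n")
with open(path, "rb") as handle:
    print(handle.read())          # b'caf\xc3\xa9\n'
```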