Collaborative Working Sessions - Filtering diffoscope output
Goal: add patterns that filter out parts of the output, or filters that show only selected parts of the output
Requirements:
print a notice that parts of the output are being ignored
indicate in return code that files are not identical
A number of options exist:
--exclude
--exclude-command=REGEXP: skips any command matching REGEXP
(--exclude-command '^readelf.*gdb_index')
but then diffoscope tries the next command, possibly falling back to hexdump comparison
Output formats: --json, --html, --html-dir.
Multiple output formats can be used together.
--load-existing-diff FILE.
Diffoscope can regenerate any of its output formats from a previously saved JSON diff.
This can be combined with ‘jq’ filtering or some other way to filter.
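A minimal sketch of such post-filtering in Python, as an alternative to jq. It assumes the JSON diff is a tree of nodes carrying "source1", "source2", an optional "unified_diff" and an optional "details" list of child nodes; the sample data and the prune() helper are made up for illustration and are not part of diffoscope.

```python
import re


def prune(node, pattern):
    """Return (copy of node without matching children, number of nodes removed).

    A child is dropped when its "source1" or "source2" matches `pattern`.
    The key names are an assumption about the JSON layout.
    """
    kept = []
    removed = 0
    for child in node.get("details", []):
        names = (child.get("source1", ""), child.get("source2", ""))
        if any(pattern.search(name) for name in names):
            removed += 1
            continue
        pruned_child, count = prune(child, pattern)
        kept.append(pruned_child)
        removed += count
    result = {key: value for key, value in node.items() if key != "details"}
    if kept:
        result["details"] = kept
    return result, removed


# Hypothetical, hand-written diff tree; a real one would come from json.load().
sample = {
    "source1": "a.deb",
    "source2": "b.deb",
    "details": [
        {
            "source1": "usr/bin/tool",
            "source2": "usr/bin/tool",
            "details": [
                {
                    "source1": "readelf --wide --debug-dump=gdb_index {}",
                    "source2": "readelf --wide --debug-dump=gdb_index {}",
                    "unified_diff": "@@ -1 +1 @@\n-old\n+new\n",
                },
                {
                    "source1": "strings --all {}",
                    "source2": "strings --all {}",
                    "unified_diff": "@@ -1 +1 @@\n-foo\n+bar\n",
                },
            ],
        }
    ],
}

filtered, removed = prune(sample, re.compile(r"^readelf.*gdb_index"))
print(f"{removed} node(s) removed")
```

The filtered tree could then be written back out with json.dump() and passed to --load-existing-diff to render it in another format.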
Internally, state is a series of deeply nested dictionaries.
The comparator is called with a path of keys.
Issues about --exclude* already exist:
https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/130
https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/53
https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/52
Filtering by “output level” is not enough.
For example, in an RPM header, some specific fields should be ignored, but only those.
Idea: provide a command to filter the output using a jq-like path.
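A rough Python sketch of how this idea could meet the two requirements above (a notice for ignored parts, and a return code that still signals a difference). Everything here is hypothetical: the "/"-separated path syntax, the drop_path() and filter_diff() helpers, and the per-field layout of the RPM header node are assumptions made for illustration, not existing diffoscope behaviour.

```python
import sys


def drop_path(node, path):
    """Delete the node reached by following `path` (a list of "source1"
    names) below `node`. Returns True if something was deleted."""
    if not path:
        return False
    head, rest = path[0], path[1:]
    children = node.get("details", [])
    for index, child in enumerate(children):
        if child.get("source1") != head:
            continue
        if not rest:
            del children[index]
            return True
        if drop_path(child, rest):
            return True
    return False


def filter_diff(diff, paths):
    """Remove each path from `diff` in place; return (exit_code, ignored)."""
    ignored = [p for p in paths if drop_path(diff, p.split("/"))]
    for p in ignored:
        # Requirement: tell the user that parts of the output are ignored.
        print(f"note: ignoring differences under {p}", file=sys.stderr)
    # Requirement: ignored differences still mean "not identical".
    differs = bool(ignored or diff.get("unified_diff") or diff.get("details"))
    return (1 if differs else 0), ignored


# Hypothetical tree in which each RPM header field is its own node.
diff = {
    "source1": "a.rpm",
    "source2": "b.rpm",
    "details": [
        {
            "source1": "header",
            "source2": "header",
            "details": [
                {"source1": "BUILDTIME", "source2": "BUILDTIME",
                 "unified_diff": "@@ -1 +1 @@\n-1\n+2\n"},
                {"source1": "SIZE", "source2": "SIZE",
                 "unified_diff": "@@ -1 +1 @@\n-10\n+11\n"},
            ],
        }
    ],
}

exit_code, ignored = filter_diff(diff, ["header/BUILDTIME"])
```

Splitting on "/" is a simplification, since real source names can contain slashes; an actual implementation would need an escaping rule or a list-based path syntax.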