WHIZARD: Build and link time comparison
BFD, Gold or LLD
A few days ago, I visited my favorite (hacker) news site and, by chance, stumbled upon a linker project that looked interesting at first glance: https://github.com/rui314/mold. It made me curious about the other linkers used in production:
- ld.bfd and ld.gold are part of the GNU binutils,
- ld.lld is currently the fastest linker, provided by LLVM.
ld.bfd is the default linker of the GNU toolchain; ld.gold was supposed to be its successor.[1]
However, it seems that GNU gold has lost its favored position to LLVM's ld.lld.
The details of how the linkers relate to each other and their history are interesting, but not relevant here.
I wanted to assess how the build time of WHIZARD differs between three linkers: the default (bfd), gold, and lld.
I would not expect a large deviation between the build times, though, as WHIZARD is mostly written in modern Fortran.
I use hyperfine as the benchmarking tool, as it provides a neat set of features, e.g. statistical analysis and pre- and post-invocation commands.
To gauge possible interference between the benchmarks and my own activity (I run the benchmarks during my productive time), I scan over the number of physical cores of my laptop.
In principle, if the linkers differ in the time they need for the WHIZARD build, that difference should be (mostly) independent of the number of cores used for the build.
We tell GCC to use a different linker with -fuse-ld={bfd|gold|lld}, see https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/Optimize-Options.html#Optimize-Options.
Thus, we need to pass the flag through WHIZARD's build system, Autotools.
And that is a problem.
[Longish paragraph why I failed to invoke GCC with above flag.]
The upshot of the above (abbreviated) paragraph is: Autotools and libtool perform quite a lot of magic to determine the right compile and link flags, i.e. library and linker flags.
In particular, both need to understand how to interpret -fuse-ld and forward it to the link invocation of the compiler.
I did a quick search of the Autotools documentation, also peeked at WHIZARD's m4/libtool.m4, and found that configure tries to guess whether the system provides the GNU BFD linker (or gold) and sets the variable LD accordingly.
I tried different combinations of GCC flags, the LD environment variable, and configure flags.
A short list of my trials:
- Adding -fuse-ld=lld to CFLAGS
- Adding -fuse-ld=lld to CFLAGS and --with-gnu-ld
- Adding -fuse-ld=lld and LD=ld.lld
- Adding -fuse-ld=lld, --with-gnu-ld and LD=ld.lld
- Adding -fuse-ld=lld to CFLAGS and FCFLAGS
- …
In the end, libtool did not recognize the -fuse-ld=lld option in any of my trials (it works with bfd and gold, obviously?).
I am not sure, though, whether libtool itself is at fault or libtool.m4, which Autotools uses to generate the correct invocation of libtool.
In any case, the -fuse-ld=lld flag is correctly inserted (via $FCFLAGS) but, again, libtool does not forward it to the link invocation of gfortran:

    /bin/sh ../../libtool --tag=FC --mode=link gfortran -O0 -g -fuse-ld=lld -o libcirce1.la -rpath /home/sbrass/whizard/master-debug/install/lib circe1.lo -ldl
    libtool: link: gfortran -shared -fPIC .libs/circe1.o -ldl -O0 -g -Wl,-soname -Wl,libcirce1.so.0 -o .libs/libcirce1.so.0.0.0

The LLD developers are geniuses: they actually have the same problem[2] and suggest either using the -fuse-ld flag (as I did) or symlinking ld to ld.lld.
And - this is not a joke - it works. It ain't stupid, if it works.
We can verify that everything works as expected with readelf --string-dump .comment <file>.[3]
With this newfound knowledge, I glued together a short bash script that symlinks whichever ld I want to use into a temporary directory and prepends that directory to PATH.

    help() {
        ## Benchmark with hyperfine a complete WHIZARD make build excluding configure on a clean directory structure.
        sed -n 's/^\s*##//p' "${BASH_SOURCE[0]}" >&2
        exit 1
    }

    USE_LD="bfd"
    ## Arguments:
    # https://wiki.bash-hackers.org/howto/getopts_tutorial
    while getopts ":hl:" opt; do
        case $opt in
            ## -h :: print help message and exit
            h) help
               ;;
            ## -l :: choice of linker [default: bfd] (see below)
            l) USE_LD="${OPTARG}"
               ;;
            \?) echo "Invalid option: -$OPTARG." >&2
                exit 1
                ;;
            :) echo "Option -$OPTARG requires an argument." >&2
               exit 1
               ;;
        esac
    done

    ## Valid options for -l:
    case ${USE_LD} in
        ## - bfd :: default GNU linker
        "bfd") LD=ld
               ;;
        ## - gold :: alternative GNU gold linker
        "gold") LD=ld.gold
                ;;
        ## - lld :: LLVM's linker
        "lld") LD=ld.lld
               ;;
        *) if test -z "${USE_LD}"; then
               echo "No linker option chosen." >&2
           else
               echo "Invalid option ${USE_LD}." >&2
           fi
           exit 1
           ;;
    esac

    # Symlink the chosen linker as `ld` into a temporary directory
    # and put that directory first in PATH, so the build picks it up.
    TEMP_PATH="$(mktemp -d)"
    LD_PATH="$(which "${LD}")"
    (set -x
     ln -s "${LD_PATH}" "${TEMP_PATH}/ld"
    )
    PATH="${TEMP_PATH}:${PATH}"

    hyperfine \
        --export-markdown "../bench-${USE_LD}.md" \
        --prepare '../development-master/configure FCFLAGS="-O0 -g"' \
        --cleanup 'make distclean' \
        --runs 3 \
        --parameter-scan threads 1 4 'make -j {threads}'

    rm -rf "${TEMP_PATH}"

Within the errors reported by hyperfine, neither of the other linkers offers an advantage in build time over GNU's default linker.
| Command | Mean / s | Min / s | Max / s | Relative |
|---|---|---|---|---|
| make -j 1 | 298.543 ± 4.714 | 293.333 | 302.514 | 1.08 ± 0.30 |
| make -j 2 | 292.847 ± 89.613 | 240.676 | 396.322 | 1.06 ± 0.44 |
| make -j 3 | 276.186 ± 77.157 | 228.825 | 365.218 | 1.00 |
| make -j 4 | 278.080 ± 73.908 | 234.997 | 363.420 | 1.01 ± 0.39 |

| Command | Mean / s | Min / s | Max / s | Relative |
|---|---|---|---|---|
| make -j 1 | 541.570 ± 122.198 | 464.746 | 682.480 | 1.93 ± 0.63 |
| make -j 2 | 294.014 ± 83.005 | 242.146 | 389.748 | 1.05 ± 0.38 |
| make -j 3 | 286.866 ± 82.224 | 239.011 | 381.810 | 1.02 ± 0.38 |
| make -j 4 | 280.934 ± 65.941 | 233.332 | 356.200 | 1.00 |

| Command | Mean / s | Min / s | Max / s | Relative |
|---|---|---|---|---|
| make -j 1 | 498.509 ± 142.891 | 401.433 | 662.589 | 1.82 ± 0.70 |
| make -j 2 | 412.006 ± 124.456 | 332.525 | 555.435 | 1.50 ± 0.60 |
| make -j 3 | 278.392 ± 80.875 | 230.958 | 371.774 | 1.02 ± 0.39 |
| make -j 4 | 273.788 ± 70.320 | 232.909 | 354.986 | 1.00 |
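
To make "within the errors" a bit more concrete, here is a minimal sketch that compares the make -j 4 rows of the three tables pairwise. It treats hyperfine's ± value as the uncertainty of the mean, which is a simplification with only three runs, and it deliberately labels the rows by table rather than by linker. The helper name significance is made up for this illustration.

```python
import math

# (mean, stddev) in seconds for `make -j 4`, copied from the three tables
# in the order they appear; no assumption is made about which linker is which.
runs = {
    "table 1": (278.080, 73.908),
    "table 2": (280.934, 65.941),
    "table 3": (273.788, 70.320),
}


def significance(a, b):
    """Difference of two means in units of their combined standard deviation."""
    (mean_a, std_a), (mean_b, std_b) = a, b
    # Assumption: both uncertainties are independent and add in quadrature.
    return abs(mean_a - mean_b) / math.hypot(std_a, std_b)


names = list(runs)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        z = significance(runs[first], runs[second])
        print(f"{first} vs {second}: {z:.2f} sigma")
```

All pairwise differences come out far below one combined standard deviation, which is what "no advantage within the errors" means here.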
Bear with me: the tables above lack a meaningful rounding of the errors, since hyperfine does not support this directly.
But, hyperfine allows us to export the results as a JSON-file.
Therefore, we could parse the JSON-file with Python and evaluate the numbers with the uncertainties module to automate the rounding.
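
A minimal sketch of that idea follows. To stay self-contained it does the rounding by hand with the standard library instead of the uncertainties module, so it is a stand-in for that approach and not the same thing. The JSON layout (a top-level "results" list whose entries carry "command", "mean" and "stddev") is my assumption about what hyperfine's --export-json writes; the sample values are copied from the first table, and round_with_error and summarize are hypothetical helper names.

```python
import json
import math

# Assumed shape of a hyperfine JSON export, reduced to the fields used here.
SAMPLE = """
{"results": [
  {"command": "make -j 1", "mean": 298.543, "stddev": 4.714},
  {"command": "make -j 2", "mean": 292.847, "stddev": 89.613}
]}
"""


def round_with_error(value, error, digits=2):
    """Round error to `digits` significant digits and value to the same decimal place."""
    if error is None or error <= 0 or not math.isfinite(error):
        return f"{value}"
    # Decimal place of the last significant digit of the error,
    # e.g. error=4.714 with digits=2 gives decimals=1.
    decimals = digits - 1 - math.floor(math.log10(error))
    shown = max(decimals, 0)  # never print a negative number of decimals
    return f"{round(value, decimals):.{shown}f} ± {round(error, decimals):.{shown}f}"


def summarize(report):
    """Return (command, rounded 'mean ± stddev') pairs for a parsed report."""
    return [
        (entry["command"], round_with_error(entry["mean"], entry["stddev"]))
        for entry in report["results"]
    ]


for command, text in summarize(json.loads(SAMPLE)):
    print(f"{command}: {text} s")
```

For a real run, one would add --export-json to the hyperfine call in the script and pass the parsed file to summarize instead of the embedded sample.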
However, these numbers are not really meaningful: I did not use a clean benchmark system, so both the numbers and my conclusion should be taken with a huge grain of salt.
Footnotes:
[1] Gold (linker), https://en.wikipedia.org/w/index.php?title=Gold_(linker)&oldid=1005327625 (last visited Feb. 24, 2021).