glibc 2.33, recently made available in Tumbleweed, includes this change:
The dynamic linker loads optimized implementations of shared objects
from subdirectories under the glibc-hwcaps directory on the library
search path if the system's capabilities meet the requirements for
that subdirectory. Initially supported subdirectories include
"power9" and "power10" for the powerpc64le-linux-gnu architecture,
"z13", "z14", "z15" for s390x-linux-gnu, and "x86-64-v2", "x86-64-v3",
"x86-64-v4" for x86_64-linux-gnu. In the x86_64-linux-gnu case, the
subdirectory names correspond to the vendor-independent x86-64
microarchitecture levels defined in the x86-64 psABI supplement.
(release note)
This means we can now build differently optimized libraries for each of those microarchitecture levels and have the dynamic linker use them if the host CPU supports the optimizations.
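To make the lookup concrete, here is a minimal sketch of the order in which candidate files would be tried for one library directory. It is not glibc's code: the function name `candidate_paths` is hypothetical, and it assumes that the most specific supported level wins and that the plain library directory is the fallback.

```python
from pathlib import PurePosixPath

# Subdirectory names from the glibc 2.33 release note, lowest to highest.
X86_64_LEVELS = ["x86-64-v2", "x86-64-v3", "x86-64-v4"]


def candidate_paths(libdir, soname, cpu_level):
    """Return the paths to try for `soname`, most specific first.

    `cpu_level` is the highest level the CPU supports, or None when the
    CPU only supports the baseline. Assumption: higher levels are
    preferred over lower ones, and the unoptimized library is tried last.
    """
    base = PurePosixPath(libdir)
    usable = []
    if cpu_level is not None:
        # Every level up to and including the CPU's level is usable.
        usable = X86_64_LEVELS[: X86_64_LEVELS.index(cpu_level) + 1]
    paths = [base / "glibc-hwcaps" / level / soname for level in reversed(usable)]
    paths.append(base / soname)  # baseline fallback
    return [str(p) for p in paths]


if __name__ == "__main__":
    for path in candidate_paths("/usr/lib64", "libfoo.so.1", "x86-64-v3"):
        print(path)
```

On a machine that supports x86-64-v3, the x86-64-v4 directory is never considered, so installing a v4 build there is harmless.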
There are currently four levels of optimization, each adding CPU features on top of the previous one:
- baseline (the current default)
  - CMOV
  - CX8
  - FPU
  - FXSR
  - MMX
  - OSFXSR
  - SCE
  - SSE
  - SSE2
- x86-64-v2
  - CMPXCHG16B
  - LAHF-SAHF
  - POPCNT
  - SSE3
  - SSE4-1
  - SSE4-2
  - SSSE3
- x86-64-v3
  - AVX
  - AVX2
  - BMI1
  - BMI2
  - F16C
  - FMA
  - LZCNT
  - MOVBE
  - OSXSAVE
- x86-64-v4
  - AVX512F
  - AVX512BW
  - AVX512CD
  - AVX512DQ
  - AVX512VL
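As a rough illustration of how the levels stack, the sketch below works out the highest level a CPU supports from the flags line of /proc/cpuinfo. The mapping from the feature names above to Linux flag names (for example SSE3 as `pni`, LZCNT as `abm`, OSXSAVE as `xsave`) is an assumption on my side, OSFXSR is skipped because it is an OS-enabled bit that does not show up as a flag, and the function names are made up for this example.

```python
# Assumed /proc/cpuinfo flag names for the features of each level.
BASELINE_FLAGS = {"cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"}
LEVEL_FLAGS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def highest_level(flags):
    """Return the highest level whose features (and all lower ones) are present.

    Returns None if even the baseline features are missing.
    """
    flags = set(flags)
    if not BASELINE_FLAGS <= flags:
        return None
    level = "baseline"
    for name, required in LEVEL_FLAGS:
        if not required <= flags:
            break  # levels are cumulative, so stop at the first gap
        level = name
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU listed in /proc/cpuinfo (Linux only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


if __name__ == "__main__":
    print(highest_level(read_cpu_flags()))
```

Because the loop stops at the first incomplete level, a CPU that advertises AVX-512 flags but lacks one x86-64-v3 feature is still reported as x86-64-v2.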
The idea for hackweek would be to first build a couple of optimized libraries manually, put them in those directories and test that it actually works.
Then work on providing rpm macros and some documentation to make it easy to build different flavors of libraries with -march=x86-64-v2/v3/v4, install them in the right locations and get subpackages generated.
For example, the libfoo1 package would have libfoo1-x86-64-v2, libfoo1-x86-64-v3 and libfoo1-x86-64-v4 subpackages, each with only the respective optimized libraries in its file list. Each subpackage would carry a Supplements for its own level, for example Supplements: packageand(libfoo1:x86-64-v3), so a user could install an x86-64-v3 package (name TBD) and automatically get the optimized flavor for that microarchitecture for every installed library.
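To picture what such subpackages could look like in a spec file, here is a small sketch that prints one stanza per level. It is only an illustration and not the rpm macros this project produced: the function name `hwcaps_subpackages`, the summary text and the file path layout are assumptions, and the x86-64-v2/v3/v4 trigger package names are still TBD as noted above.

```python
def hwcaps_subpackages(libpkg, libdir, soname,
                       levels=("x86-64-v2", "x86-64-v3", "x86-64-v4")):
    """Return hypothetical spec-file stanzas, one subpackage per level."""
    stanzas = []
    for level in levels:
        sub = f"{libpkg}-{level}"
        stanzas.append("\n".join([
            f"%package -n {sub}",
            f"Summary:        {libpkg} optimized for {level}",
            # Pulled in automatically when both libpkg and the
            # (name TBD) level package are installed.
            f"Supplements:    packageand({libpkg}:{level})",
            "",
            f"%description -n {sub}",
            f"{libpkg} built with -march={level}.",
            "",
            f"%files -n {sub}",
            f"{libdir}/glibc-hwcaps/{level}/{soname}",
        ]))
    return "\n\n".join(stanzas)


if __name__ == "__main__":
    print(hwcaps_subpackages("libfoo1", "%{_libdir}", "libfoo.so.1"))
```

Each generated file list contains only the library under its own glibc-hwcaps subdirectory, so the subpackages never conflict with the baseline libfoo1 package.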
This would hopefully get a performance benefit in openSUSE Tumbleweed (and in SLE/Leap once they include the new glibc version).
After a quick talk with Florian Weimer (from glibc/Red Hat) who proposed a better and less intrusive approach, the plan (after the manual test mentioned above) is to:
- Hack gcc to add an option to keep the GIMPLE bytecode when linking (-ffat-lto-objects might work for this but I'd need to test that).
- Hack gcc to add an option to "relink" (or "reoptimize") an existing library/executable using its embedded GIMPLE bytecode and generate a new library/executable optimized for a given microarchitecture.
- Check how to use objcopy to strip the embedded GIMPLE bytecode from the original library/executable after everything is finished.
- Provide rpm macros to generate the mentioned subpackages and scripts in /usr/lib/rpm that would be run after the package is built in order to relink libraries for all microarchitectures with the newly added options without having to rebuild the whole package several times.
This will be more difficult than expected since I don't have much experience with gcc's internals, but I guess that's what hackweeks are for :)
Results
There's a report with the results of this hackweek project here:
https://antlarr.io/2021/03/hackweek-20-glibc-hwcaps-in-opensuse/
and an explanation of the new rpm macros that were created for this, here:
This project is part of:
Hack Week 20
Comments
- almost 4 years ago by dancermak
This sounds very intriguing! I have a few notes about this:
- you might be interested in this (sadly stalled) upstream PR: https://github.com/rpm-software-management/rpm/pull/1035 which adds better detection of the currently running microarchitecture
- once rpm gains the ability to automatically generate subpackages (https://github.com/rpm-software-management/rpm/pull/1485), this could be completely automated
- I would suggest using actual boolean dependencies instead of packageand:
  Supplements: (libfoo1 and x86-64-v3)
And please, please make some noise about this and coordinate it with the other rpm based distros, so that we don't end up with yet another SUSE-ism but instead lead the innovation.
- almost 4 years ago by dfaggioli
Wow... This looks very interesting! I'm not really well versed in any of the technologies involved but, as soon as you have a library or two ready, I'd be happy to run benchmarks (e.g., with MMTests) to try to assess the differences (== improvements, hopefully).