Does specifying hostPlatform.gcc.arch when importing Nixpkgs yield binaries that are measurably faster or slower?
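For reference, a minimal sketch of such an import. It assumes an x86_64 Linux machine and uses `skylake` purely as an example value. It also uses the `localSystem` argument of the Nixpkgs import; on NixOS the same attribute would go under `nixpkgs.hostPlatform`.

```nix
# Sketch only: the system string and the arch value are assumptions,
# pick the ones matching the machine under test.
import <nixpkgs> {
  localSystem = {
    system = "x86_64-linux";
    # Ends up as -march=skylake in the compiler wrapper.
    gcc.arch = "skylake";
  };
}
```

Two practical caveats, as far as I know: the public binary cache will not have these builds, so nearly everything is compiled locally, and the builder has to advertise the matching `gccarch-skylake` system feature.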
Some packages (OpenSSL with its accelerated hash functions, for example) already compile code paths for CPU features beyond the baseline and select among them at runtime through detection/dispatch, but not all do. (Aside: do we enable GCC’s function multiversioning?)
Investigate the closures of some of the following to test for performance differences:
- nix
- OpenCV (are image registration routines compiled with support for AVX2/AVX-512, for example?)
- PyTorch (would non-accelerator inference be faster?)
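One way to set up the comparison is to expose the same candidates from a baseline and a tuned package set. This is a sketch under stated assumptions: the file name `compare.nix`, the helper names `pkgsFor` and `candidates`, the `skylake` arch, and the `x86_64-linux` system are all illustrative and not taken from the original note.

```nix
# compare.nix (hypothetical file name)
let
  # Hypothetical helper: import Nixpkgs, optionally targeting a specific arch.
  # Passing null gives the default, generic package set.
  pkgsFor = arch:
    import <nixpkgs> (
      if arch == null then { }
      else {
        localSystem = {
          system = "x86_64-linux"; # assumption: x86_64 Linux test machine
          gcc.arch = arch;
        };
      }
    );

  # The packages listed above, taken from a given package set.
  candidates = pkgs: {
    inherit (pkgs) nix opencv;
    torch = pkgs.python3Packages.torch;
  };
in
{
  baseline = candidates (pkgsFor null);
  tuned = candidates (pkgsFor "skylake"); # example arch, adjust as needed
}
```

Building `nix-build compare.nix -A baseline.opencv` and `nix-build compare.nix -A tuned.opencv` would then give two store paths whose closures can be benchmarked against each other on the same hardware.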