Support for SME1 based strmm_direct kernel for cblas_strmm level 3 API #5450
base: develop
fe3eb6c to c29f60c
@martin-frbg
c29f60c to 5d80985
It looks like the placement of your changes in setparam-ref.c is at odds with the location of the new symbols in the DYNAMIC_ARCH part of common_param.h.
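To illustrate why placement can matter, here is a minimal sketch that assumes the dispatch table is filled with a positional initializer, so each entry must appear in the same order as the struct fields. The struct, field, and function names below are hypothetical stand-ins, not the actual OpenBLAS definitions.

```c
/* Hypothetical two-entry dispatch table; not the real OpenBLAS struct. */
typedef int (*kernel_fn)(void);

int sgemm_direct_stub(void) { return 1; } /* stand-in kernel */
int strmm_direct_stub(void) { return 2; } /* stand-in kernel */

struct dispatch_table {
    kernel_fn sgemm_direct; /* declared first */
    kernel_fn strmm_direct; /* declared second */
};

/* Positional initializer with the entries in the wrong order: this
 * compiles without complaint, but each field now points at the other
 * kernel. */
const struct dispatch_table misplaced = {
    strmm_direct_stub,
    sgemm_direct_stub,
};

/* Entries in declaration order (designated initializers shown here only
 * to make the intended mapping explicit). */
const struct dispatch_table matched = {
    .sgemm_direct = sgemm_direct_stub,
    .strmm_direct = strmm_direct_stub,
};
```

In this sketch, `misplaced.strmm_direct()` ends up calling the sgemm stand-in, which is the kind of silent mismatch that keeping both files in the same order avoids.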
5d80985 to 8173910
Hi Martin, thanks for your quick review; it's very helpful. However, three checks are still failing.
The OSX_OpenMP_Clang_cmake and Windows_mingw_gmake jobs timing out (causing the overall AzureCI job to fail) is certainly unrelated to your change. The Windows on Arm (LLVM) linker appears to be seeing duplicate symbols between the LNUN and LNLN builds of your code - suggesting there is something wrong with code selection in the CMake build as
To resolve the duplicate symbol issue with _arm_tpidr2_restore and _arm_tpidr2_save when building for non-SME targets (e.g., ARMv8), I propose including sme_abi.h only when SME is supported:
#if defined(__ARM_FEATURE_SME) && defined(__clang__) && (__clang_major__ >= 16)
This ensures that the SME-specific ABI routines are included only when both the target architecture and the compiler support them, which avoids symbol conflicts in non-SME builds. Let me know if this approach looks good or if further adjustments are needed.
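A minimal sketch of how such a guard could be laid out, assuming the header is only needed on the SME path. `HAVE_SME_ABI` and `strmm_direct_available()` are hypothetical names introduced for illustration; only the preprocessor condition comes from the proposal above, and the include is left as a comment so the sketch builds on any target.

```c
/* Sketch only: HAVE_SME_ABI and strmm_direct_available() are hypothetical. */
#if defined(__ARM_FEATURE_SME) && defined(__clang__) && (__clang_major__ >= 16)
/* SME target built with clang 16 or newer: the SME ABI header would be
 * included here, e.g. #include "sme_abi.h" */
#define HAVE_SME_ABI 1
#else
/* Any other target or compiler: the header is skipped entirely, so its
 * _arm_tpidr2_save / _arm_tpidr2_restore definitions are never emitted. */
#define HAVE_SME_ABI 0
#endif

/* Reports at run time which branch of the guard was compiled in. */
int strmm_direct_available(void) { return HAVE_SME_ABI; }
```

On a non-SME build the function returns 0 and no SME ABI symbols are defined in the translation unit, which is the effect the proposal is after.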
8173910 to ea2890d
@martin-frbg
The failing ones are all unrelated to your changes (as I mentioned earlier, these are x86 jobs that sometimes end up running on slow hardware).
Add implementation of strmm kernel based on the SME1 architecture.