Open
Labels
enhancement: New feature or request
Description
Checklist
- 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise, it will be closed.
- 2. To help the community, I will use Chinese/English or attach a Chinese/English translation if using another language. Non-English/Chinese content without translation may be closed.
Motivation
Accelerate prefill on AMD CPUs by using AMD's BLIS fork (https://github.com/amd/blis) with LPGEMM (low-precision GEMM) support. For background on LPGEMM, see these slides: https://www.cs.utexas.edu/~flame/BLISRetreat2024/slides/Bhaskar_BLIS_Retreat_2024_AMD_LPGEMM_0.pdf
I used this API guide as the reference for the code: https://docs.amd.com/r/en-US/57404-AOCL-user-guide/AOCL-BLAS?section=lpgemm-in-aocl-blas
For how to use LPGEMM, see the AOCL 4.1 user guide:
https://www.amd.com/content/dam/amd/en/documents/developer/version-4-1-documents/aocl/aocl-4-1-user-guide.pdf

LPGEMM is provided by the aocl_gemm add-on, so you only need to enable that add-on when building BLIS. The CMake build documentation has examples and installation steps: https://github.com/amd/blis/blob/master/docs/CMakeBuildSystem.md
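To make the request concrete, here is a minimal reference sketch of what a low-precision GEMM computes: unsigned 8-bit activations times signed 8-bit weights, accumulated in 32-bit integers. As far as I understand the AOCL guide, this corresponds to its u8 x s8 -> s32 LPGEMM variant, but that mapping is my assumption. The function name `lpgemm_u8s8s32_reference` is hypothetical and is not part of BLIS, AOCL, or ktransformers. It is plain Python meant only for checking results from an optimized kernel, not for speed.

```python
def lpgemm_u8s8s32_reference(a, b):
    """Reference C = A @ B with A in u8, B in s8, and C accumulated in s32.

    `a` is an m x k list of rows, `b` is a k x n list of rows.
    Hypothetical helper for illustration only; not an AOCL/BLIS API.
    """
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions of A and B do not match")
    n = len(b[0])

    # Validate the low-precision input ranges up front.
    for row in a:
        if len(row) != k or any(not 0 <= v <= 255 for v in row):
            raise ValueError("A must be a rectangular matrix of u8 values")
    for row in b:
        if len(row) != n or any(not -128 <= v <= 127 for v in row):
            raise ValueError("B must be a rectangular matrix of s8 values")

    c = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # Emulate 32-bit signed wrap-around of the accumulator.
            c[i][j] = (acc + 2**31) % 2**32 - 2**31
    return c


a = [[1, 2, 3], [255, 0, 4]]          # u8 activations, 2 x 3
b = [[1, -1], [2, 0], [-128, 127]]    # s8 weights, 3 x 2
print(lpgemm_u8s8s32_reference(a, b))  # [[-379, 380], [-257, 253]]
```

A check like this could be used to compare the output of an LPGEMM-backed prefill path against a known-good result on small matrices before benchmarking.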
Related resources
PR: #1600