HIP: Cleanup hipification header #15285
Looks like this doesn't compile against older ROCm, don't commit.
@JohannesGaessler I'm not a huge fan of the changes in 97392a5, but I see no better option. The other options we have are:
|
Switch over to hip_bf16 from legacy hip_bfloat16
Simplify RDNA3 define
Reduce swap over of new hipblas api to rocm 6.5 as this version is used for rocm 7.0 previews
I agree that having some extra function for type conversions is very annoying. Unfortunately with CUDA this is already necessary on master due to FP16 <-> BF16 conversions being ambiguous. I pushed a version that consolidates the other code that needs to handle these cases so that, going forward, we need to maintain only a single version. I also simplified the code a bit: I changed the name to One more question: is there a reason why you declared the function as |
I had the template parameter order that way around because this is what all other functions in ggml do, but this way is fine with me. The rest of the changes are fine with me. While nothing currently uses this on the host side, I see no reason to restrict it to device code. Indeed if its just the inline is obviously just there because it's header-implemented. I have no objections to making it forceinline, but if the compiler is not inlining these functions automatically I don't know what to tell you.
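For readers following along, here is a rough, host-only sketch of what a single consolidated conversion helper can look like. It is not the code from this PR: `bf16_t`, `bf16_to_float`, `float_to_bf16` and `convert_value` are hypothetical names, the destination-first template parameter order is an assumption, and a real GPU build would use the vendor's 16-bit types and add host/device qualifiers. The idea it illustrates is routing every ambiguous 16-bit conversion through `float` in one place.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a bfloat16 type (NOT the HIP or CUDA type).
struct bf16_t { uint16_t bits; };

// bfloat16 is the upper 16 bits of an IEEE-754 binary32 value.
static inline float bf16_to_float(bf16_t v) {
    uint32_t u = static_cast<uint32_t>(v.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Round-to-nearest-even truncation; NaN handling is omitted in this sketch.
static inline bf16_t float_to_bf16(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    u += 0x7FFFu + ((u >> 16) & 1u);
    return bf16_t{static_cast<uint16_t>(u >> 16)};
}

// Single entry point for conversions (hypothetical name and parameter order).
// Anything involving bf16_t goes through float, so call sites never have to
// spell out an ambiguous direct conversion themselves.
template <typename dst_t, typename src_t>
static inline dst_t convert_value(src_t x) {
    if constexpr (std::is_same_v<dst_t, src_t>) {
        return x;                                      // no-op
    } else if constexpr (std::is_same_v<src_t, bf16_t>) {
        return static_cast<dst_t>(bf16_to_float(x));   // bf16 -> float -> dst
    } else if constexpr (std::is_same_v<dst_t, bf16_t>) {
        return float_to_bf16(static_cast<float>(x));   // src -> float -> bf16
    } else {
        return static_cast<dst_t>(x);                  // plain arithmetic cast
    }
}
```

A call site then reads `convert_value<float>(v)` regardless of the source type, which is what lets the rest of the code keep only one version of the conversion logic.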
I think |
|