[RISC-V] Extend support for RVV floating-point kernels #1
Conversation
Would you still have the numbers lying around for the using

Yeah, we switched from a higher LMUL in favor of a lower one to better cater to the cache-cold case. We'll share the numbers for the kernels (with the LMUL and unrolling permutations) for both cache hot and cold.

As discussed on the call, the way we should choose is:
We should also pick what is currently the better number on the BananaPi, and not optimize for ideal hardware (out-of-order, more vector ports, etc.). For better or for worse, the BananaPi is what is currently commercially available, so it is the fairest target for RISE (no preferential treatment of some microarchitectures). Once better hardware is broadly commercially available, we will want to do another pass of optimizations.

Sure, makes sense. We'll make the changes.
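For reference, a minimal sketch of what an f16 dot product at the lower LMUL looks like with the RVV intrinsics. This only illustrates the LMUL trade-off being discussed, not the exact kernel in this PR; it assumes the zvfh extension, compiler `_Float16` support, and the function name is made up:

```c
// Illustrative only: an f16 dot product kept at LMUL=2 (needs zvfh).
// The real kernels also unroll and use GGML's own types; this sketch
// just shows the full-vector main loop and the final reduction.
#include <riscv_vector.h>
#include <stddef.h>

static float dot_f16_lmul2(const _Float16 *x, const _Float16 *y, size_t n) {
    size_t vl = __riscv_vsetvlmax_e16m2();                      // lanes per iteration at LMUL=2
    vfloat16m2_t acc = __riscv_vfmv_v_f_f16m2((_Float16)0.0f, vl);

    size_t i = 0;
    for (; i + vl <= n; i += vl) {                              // full vectors only
        vfloat16m2_t vx = __riscv_vle16_v_f16m2(x + i, vl);
        vfloat16m2_t vy = __riscv_vle16_v_f16m2(y + i, vl);
        acc = __riscv_vfmacc_vv_f16m2(acc, vx, vy, vl);         // acc += vx * vy
    }

    // Sum the LMUL=2 accumulator down to a scalar.
    vfloat16m1_t zero = __riscv_vfmv_v_f_f16m1((_Float16)0.0f, __riscv_vsetvlmax_e16m1());
    vfloat16m1_t red  = __riscv_vfredusum_vs_f16m2_f16m1(acc, zero, vl);
    float sum = (float)__riscv_vfmv_f_s_f16m1_f16(red);

    for (; i < n; i++) {                                        // scalar tail
        sum += (float)x[i] * (float)y[i];
    }
    return sum;
}
```

Switching the m2 types and intrinsic suffixes to m4/m8 is all it takes to try the higher-LMUL variants mentioned above.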
I see these are still doing accumulations to fp32 using __riscv_vfwmaccbf16_vv_f32m4, but that is what is called for in the function prototype
ggml_vec_dot_bf16(int n, float * GGML_RESTRICT s, …
so no complaints here.
Actually no complaints with any part of the pull request. Full speed ahead!
Thanks!!
Dave
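For reference, the accumulation pattern referred to above, sketched with the zvfbfwma widening intrinsic. This is an illustration, not the PR's ggml_vec_dot_bf16 (which works on GGML's own bf16 type and layout); it assumes compiler `__bf16` and bf16 vector-type support, and the function name is made up:

```c
// Illustrative only: bf16 inputs accumulated into f32 via the widening FMA
// (needs the zvfbfwma extension and __bf16 / bf16 vector type support).
#include <riscv_vector.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static float bf16_to_f32(__bf16 h) {            // scalar widen: bf16 bits are the top half of an f32
    uint16_t bits; memcpy(&bits, &h, sizeof bits);
    uint32_t w = (uint32_t)bits << 16;
    float f;       memcpy(&f, &w, sizeof f);
    return f;
}

static float dot_bf16_f32acc(const __bf16 *x, const __bf16 *y, size_t n) {
    size_t vl = __riscv_vsetvlmax_e16m2();       // same lane count as e32m4
    vfloat32m4_t acc = __riscv_vfmv_v_f_f32m4(0.0f, vl);

    size_t i = 0;
    for (; i + vl <= n; i += vl) {
        vbfloat16m2_t vx = __riscv_vle16_v_bf16m2(x + i, vl);
        vbfloat16m2_t vy = __riscv_vle16_v_bf16m2(y + i, vl);
        // bf16 * bf16, widened and accumulated into f32 lanes.
        acc = __riscv_vfwmaccbf16_vv_f32m4(acc, vx, vy, vl);
    }

    vfloat32m1_t zero = __riscv_vfmv_v_f_f32m1(0.0f, __riscv_vsetvlmax_e32m1());
    vfloat32m1_t red  = __riscv_vfredusum_vs_f32m4_f32m1(acc, zero, vl);
    float sum = __riscv_vfmv_f_s_f32m1_f32(red);

    for (; i < n; i++) {                         // scalar tail
        sum += bf16_to_f32(x[i]) * bf16_to_f32(y[i]);
    }
    return sum;
}
```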
Force-pushed from e646298 to ffbab18
We've made the changes as discussed yesterday. Good to go from our end.
@taimur-10x can you please rebase before merging, or should it be squashed? Thank you!
@taimur-10x I asked for a rebase, not a merge of master. A
Force-pushed from e3be2ff to da60598
Co-authored-by: Rehan Qasim <rehan.qasim@10xengineers.ai>
Force-pushed from be4fa97 to acf3e4f
@ludovic, rebased and squashed where required. Should I merge this in
@taimur-10x I've created a
Opened here: ggml-org#17318
This PR extends the existing RISC-V Vector (RVV) floating-point support introduced in PR #15075, adding new kernels.
Summary
- RVV BF16 flag added to ggml-cpu/CMakeLists.txt to enable the zvfbfwma extension

Newly Added Kernels
Testing
Kernels were functionally tested on QEMU at VLENs of 128, 256, 512, and 1024 bits, across a range of input sizes.
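As an illustration of that kind of check, a rough harness comparing a vector kernel against a scalar reference across sizes that exercise the tail paths. This is not the test-float-fns file mentioned below; it reuses the hypothetical dot_f16_lmul2 sketch from earlier in this thread, and the sizes and tolerance are made up:

```c
// Illustrative harness only: compare a vector kernel against a scalar
// reference over several sizes (odd sizes exercise the tail handling).
// dot_f16_lmul2 is the earlier sketch, not a kernel added by this PR.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

static float dot_f16_ref(const _Float16 *x, const _Float16 *y, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++) s += (float)x[i] * (float)y[i];
    return s;
}

int main(void) {
    const size_t sizes[] = { 1, 7, 32, 255, 1024, 4097 };
    for (size_t t = 0; t < sizeof sizes / sizeof sizes[0]; t++) {
        size_t n = sizes[t];
        _Float16 *x = malloc(n * sizeof *x);
        _Float16 *y = malloc(n * sizeof *y);
        if (!x || !y) return 1;
        for (size_t i = 0; i < n; i++) {
            x[i] = (_Float16)((float)rand() / RAND_MAX - 0.5f);
            y[i] = (_Float16)((float)rand() / RAND_MAX - 0.5f);
        }
        float got = dot_f16_lmul2(x, y, n);
        float ref = dot_f16_ref(x, y, n);
        // f16 accumulation drifts from the f32 reference, so the bound is loose.
        int ok = fabsf(got - ref) <= 1e-2f * (1.0f + fabsf(ref));
        printf("n=%zu %s (got=%f ref=%f)\n", n, ok ? "OK" : "FAIL", got, ref);
        free(x);
        free(y);
    }
    return 0;
}
```

Under QEMU user mode the same binary can then be re-run at different vector lengths, e.g. something like qemu-riscv64 -cpu rv64,v=true,vlen=256 (and 128/512/1024), which is presumably how the VLEN sweep above was done.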
For RISE
Additional Notes
The testing and benchmarking files will be shared in a subsequent PR:
- test-float-fns: Functional Testing of Floating-Point Kernels
- test-float-perf: Performance Benchmarking of Floating-Point Kernels

Benchmarks
Benchmark results on BananaPI-BPI F3 (VLEN=256)
Cache Hot
- ggml_vec_mad_f16
- ggml_vec_scale_f16
- ggml_vec_dot_f16_unroll
- ggml_vec_silu_f32
- ggml_cpu_fp16_to_fp32

We do not have the hardware to benchmark the bf16 kernels.

Kernel Benchmarking

| Kernel | LMUL | Unroll | | Cache Cold | Cache Hot |
| --- | --- | --- | --- | --- | --- |
| vec_dot_f16 | 2 | 2 | Yes | 1917 | 291 |
| vec_mad_f16 | 4 | 2 | Yes | 2250 | 375 |
| vec_scale_f16 | 4 | 2 | Yes | 1500 | 291 |
| cpu_f16_to_f32 | 2 | 2 | No | 2125 | 416 |
| vec_dot_f16_unroll | 2 | 2 | No | 3875 | 458 |
| vec_silu_f32 | 2 | - | - | 7041 | 6083 |