Conversation

@ggerganov (Member) commented Sep 28, 2025

target #16148

Gauging what it would take to remove the KQ mask padding along the batch dimension (ne31). Removing this padding would simplify the graph-building logic and reduce the amount of memory that we allocate and transfer for KQ masks.
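
For illustration, here is a minimal sketch (assumed helper name, padding granularity and mask type; not the actual llama.cpp code) of where the two paddings of the KQ mask apply: ne[0] is the context (KV) dimension, ne[1] is the batch dimension, and only the latter padding is being removed here.

```c
// Illustrative only: the helper name, padding granularity and mask type are assumptions.
#include "ggml.h"

static struct ggml_tensor * build_kq_mask(struct ggml_context * ctx,
                                          int64_t n_kv, int64_t n_tokens) {
    // ne[0] (context dimension): stays padded so the mask shape does not change
    // while the KV cache fills up. The granularity 256 is an assumption.
    const int64_t n_kv_pad = GGML_PAD(n_kv, 256);

    // ne[1] (batch dimension): previously padded, e.g. GGML_PAD(n_tokens, GGML_KQ_MASK_PAD);
    // with this change it is simply n_tokens.
    return ggml_new_tensor_2d(ctx, GGML_TYPE_F16, n_kv_pad, n_tokens);
}
```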

@github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Sep 28, 2025
@jeffbolznv (Collaborator)

This will require some more changes to the Vulkan backend.

@jeffbolznv (Collaborator)

#16316 makes Vulkan handle this.

@slaren (Member) commented Sep 28, 2025

Wouldn't this cause the tensor shape to change in every evaluation, and break graph reuse and CUDA graphs?

@ggerganov (Member, Author)
> Wouldn't this cause the tensor shape to change in every evaluation, and break graph reuse and CUDA graphs?

It shouldn't - this is the padding along the batch dimension (src[3]->ne[1]). The padding along the context dimension (src[3]->ne[0]) is what keeps the graph shapes constant, and it will remain.
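
To make the dimension indices concrete, a hedged sketch (assumed variable names, not the actual graph-build code): in ggml_flash_attn_ext the mask ends up as src[3], so src[3]->ne[0] is the padded context dimension and src[3]->ne[1] is the batch dimension that loses its padding.

```c
// Sketch with assumed names; shows which mask dimension carries which padding.
struct ggml_tensor * kq_mask = ggml_new_tensor_2d(ctx, GGML_TYPE_F16,
        n_kv_pad,    // ne[0]: context dimension, padding kept
        n_tokens);   // ne[1]: batch dimension, padding removed by this PR

struct ggml_tensor * cur = ggml_flash_attn_ext(ctx, q, k, v, kq_mask,
        kq_scale, /*max_bias=*/0.0f, /*logit_softcap=*/0.0f);

// cur->src[3] == kq_mask, hence the src[3]->ne[0] / src[3]->ne[1] references above.
```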
