Problem
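On every dispatch, dispatch_with_spec() allocates or reuses a descriptor set, writes its descriptors with vkUpdateDescriptorSets, binds it, records the dispatch, and then defers reclamation of the set by epoch. Repeating this driver work per dispatch adds overhead that hurts small kernels, fused graphs, and attention-heavy workloads the most, since they issue many dispatches with little GPU work in each.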
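For discussion, the acquire / retire / reclaim-by-epoch cycle described above can be modelled without any Vulkan calls. This is a minimal sketch, not the actual implementation: `EpochSetRecycler`, `SetHandle`, and all method names are hypothetical, and the epoch is assumed to be a monotonically increasing submission counter whose completion is signalled by a fence or timeline semaphore.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for a VkDescriptorSet handle; no Vulkan calls are made.
using SetHandle = std::uint32_t;

// Models epoch-deferred descriptor-set reuse (all names are illustrative).
class EpochSetRecycler {
public:
    // Returns a set for the next dispatch. Reuses a free set when one exists,
    // otherwise models a fresh allocation and counts it.
    SetHandle acquire() {
        if (!free_.empty()) {
            SetHandle h = free_.back();
            free_.pop_back();
            return h;
        }
        ++allocations_;
        return next_handle_++;
    }

    // Called once the dispatch is recorded. The set must not be reused until
    // the GPU has finished the epoch in which it was submitted.
    void retire(SetHandle h, std::uint64_t epoch) {
        // Assumption: epochs are retired in non-decreasing order.
        assert(pending_.empty() || pending_.back().epoch <= epoch);
        pending_.push_back({h, epoch});
    }

    // Called when the GPU is known to have completed every epoch up to and
    // including completed_epoch. Moves the matching sets to the free list.
    void reclaim(std::uint64_t completed_epoch) {
        while (!pending_.empty() && pending_.front().epoch <= completed_epoch) {
            free_.push_back(pending_.front().handle);
            pending_.pop_front();
        }
    }

    std::size_t allocations() const { return allocations_; }
    std::size_t pending() const { return pending_.size(); }

private:
    struct Retired {
        SetHandle handle;
        std::uint64_t epoch;
    };

    std::vector<SetHandle> free_;   // sets that are safe to rewrite
    std::deque<Retired> pending_;   // sets possibly still in use by the GPU
    SetHandle next_handle_ = 0;
    std::size_t allocations_ = 0;
};
```

In this model, a set retired in epoch 1 becomes reusable only after `reclaim(1)`. Any dispatch recorded before that point falls back to a new allocation, which is the cost that grows with dispatch frequency.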
Reference
ggml-vulkan handles descriptor sets in bulk and reuses them on the queue side far more aggressively than this path does, which makes it a useful point of comparison.
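One possible reading of "bulk handling" is to allocate descriptor sets in chunks ahead of recording, then hand them out by index with no driver call on the hot path. The sketch below illustrates that idea only. It is not taken from ggml-vulkan's source, and `BulkSetArena`, `SetHandle`, and the chunk size are hypothetical. It relies on the fact that vkAllocateDescriptorSets can return several sets in one call via `descriptorSetCount`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a VkDescriptorSet handle; no Vulkan calls are made.
using SetHandle = std::uint32_t;

// Models chunked preallocation with index-based hand-out (illustrative names).
class BulkSetArena {
public:
    // chunk is the number of sets requested per modelled allocation call;
    // a value of 0 is treated as 1 to keep reserve() well defined.
    explicit BulkSetArena(std::size_t chunk) : chunk_(chunk == 0 ? 1 : chunk) {}

    // Ensures at least `needed` sets exist, growing in whole chunks. Each
    // chunk models one vkAllocateDescriptorSets call for chunk_ sets.
    void reserve(std::size_t needed) {
        while (sets_.size() < needed) {
            for (std::size_t i = 0; i < chunk_; ++i) {
                sets_.push_back(static_cast<SetHandle>(sets_.size()));
            }
            ++alloc_calls_;
        }
    }

    // Hands out the next preallocated set. No allocation call is modelled
    // here unless the arena was under-reserved.
    SetHandle next() {
        reserve(cursor_ + 1);
        assert(cursor_ < sets_.size());
        return sets_[cursor_++];
    }

    // Rewinds the cursor so every set can be rewritten. Assumption: the
    // caller invokes this only after the previous submission has completed.
    void reset() { cursor_ = 0; }

    std::size_t alloc_calls() const { return alloc_calls_; }
    std::size_t capacity() const { return sets_.size(); }

private:
    std::size_t chunk_;
    std::vector<SetHandle> sets_;
    std::size_t cursor_ = 0;
    std::size_t alloc_calls_ = 0;
};
```

With a chunk size of 4, reserving 6 sets costs two modelled allocation calls instead of six, and `reset()` lets the next graph reuse the same sets without any further allocation.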
Tasks
Related
See Tier 1, item 4 in the performance analysis report.