Problem
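MLX caches pipelines in-process in KernelManager, but it has no VkPipelineCache object, no on-disk persistence, and no warmup path, so a pipeline is compiled the first time a kernel variant is hit and nothing carries over between runs. ggml, by contrast, prebuilds many more variants up front, which reduces these first-hit compile stalls.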
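As a rough illustration of the persistence half of this gap, the sketch below loads a previously saved cache blob from disk and checks its header against the current device before it would be passed as pInitialData to vkCreatePipelineCache. It uses only the standard library so it can be read without a Vulkan SDK. DeviceIdentity, read_blob, write_blob, and blob_matches_device are hypothetical names, not existing MLX code, and how this would hook into KernelManager is left open. The header layout follows VkPipelineCacheHeaderVersionOne from the Vulkan specification. Drivers are expected to ignore incompatible initial data, so the check is a defensive extra rather than a requirement.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Version-one pipeline cache header (VkPipelineCacheHeaderVersionOne):
// four uint32 fields followed by a VK_UUID_SIZE-byte pipeline cache UUID.
constexpr std::size_t kUuidSize = 16;             // VK_UUID_SIZE
constexpr std::size_t kHeaderSize = 16 + kUuidSize;

// Hypothetical holder for the fields that would be copied out of
// VkPhysicalDeviceProperties (vendorID, deviceID, pipelineCacheUUID).
struct DeviceIdentity {
    std::uint32_t vendor_id;
    std::uint32_t device_id;
    std::uint8_t cache_uuid[kUuidSize];
};

// Read the whole cache file; an empty vector means "no usable cache".
std::vector<std::uint8_t> read_blob(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Write the bytes that vkGetPipelineCacheData would return at shutdown.
bool write_blob(const std::string& path, const std::vector<std::uint8_t>& blob) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(blob.data()),
              static_cast<std::streamsize>(blob.size()));
    return static_cast<bool>(out);
}

// Assumption: little-endian host, matching the byte order of the header.
static std::uint32_t read_u32(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// True only if the blob was produced by the same device and driver build.
bool blob_matches_device(const std::vector<std::uint8_t>& blob,
                         const DeviceIdentity& dev) {
    if (blob.size() < kHeaderSize) return false;
    const std::uint8_t* p = blob.data();
    return read_u32(p) == kHeaderSize          // headerSize
        && read_u32(p + 4) == 1u               // VK_PIPELINE_CACHE_HEADER_VERSION_ONE
        && read_u32(p + 8) == dev.vendor_id    // vendorID
        && read_u32(p + 12) == dev.device_id   // deviceID
        && std::memcmp(p + 16, dev.cache_uuid, kUuidSize) == 0;
}
```

At startup, a blob that passes blob_matches_device would be handed to vkCreatePipelineCache through VkPipelineCacheCreateInfo::pInitialData and initialDataSize; otherwise the cache would be created empty. At shutdown, the bytes returned by vkGetPipelineCacheData would go through write_blob. A warmup path could then create the most common pipelines against that cache before the first real dispatch.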
Tasks
Related
See the Tier 2 items in the performance analysis report.