Problem
MLX uses a single default MLX_VULKAN_MATMUL_SPEC tuple for every device unless it is manually overridden through the environment variable. ggml, by contrast, has a much richer architecture/tuning matrix: vendor detection, subgroup characteristics, cooperative matrix probing, and many precompiled matmul/FA variants.
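A minimal sketch of what per-device selection could look like, written in Python for brevity rather than as actual MLX or ggml code. Everything here is an assumption made for illustration: the `DeviceInfo` fields, the `select_matmul_spec` helper, the `(tile_m, tile_n, tile_k)` layout of a spec, and all tile values, which are placeholders and not tuned numbers. The real format of MLX_VULKAN_MATMUL_SPEC is not described in this issue, so the override is passed in already parsed. Only the PCI vendor IDs are real constants.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# PCI vendor IDs, as reported in VkPhysicalDeviceProperties.vendorID.
VENDOR_AMD = 0x1002
VENDOR_NVIDIA = 0x10DE
VENDOR_INTEL = 0x8086

# Hypothetical spec layout: (tile_m, tile_n, tile_k).
Spec = Tuple[int, int, int]

# Placeholder standing in for the single default used today.
DEFAULT_SPEC: Spec = (64, 64, 16)


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical summary of the properties probed at device init."""
    vendor_id: int
    subgroup_size: int
    has_coop_matrix: bool


def select_matmul_spec(dev: DeviceInfo, override: Optional[Spec] = None) -> Spec:
    """Pick a matmul spec for a device; all tile values are illustrative."""
    # A manual override always wins, matching the current env-var behaviour.
    if override is not None:
        return override
    # Cooperative matrix support gets its own variant regardless of vendor.
    if dev.has_coop_matrix:
        return (128, 128, 16)
    # Vendor and subgroup-size specific fallbacks.
    if dev.vendor_id == VENDOR_AMD and dev.subgroup_size == 64:
        return (128, 64, 16)
    if dev.vendor_id == VENDOR_INTEL:
        return (32, 32, 16)
    # Unknown or unhandled devices keep the existing default.
    return DEFAULT_SPEC
```

The ordering is a design choice in this sketch: the override is checked first so existing manual tuning keeps working, capability checks come before vendor checks, and anything unrecognized falls back to the current default so behaviour never regresses on unlisted hardware.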
Tasks
Related
See Tier 2, item 1 in the performance analysis report.